Julia is unusual in that it is a dynamically typed language (meaning you don’t have to declare variable types “statically”, in program text), while at the same time supporting a rich mechanism for communicating type information to the JIT compiler and runtime. Effectively, Julia has both dynamic and static typing. When provided, static type information can have non-trivial impact on Julia code behavior, both in terms of functionality and performance.
This distinctive typing seems to be one of the defining features of Julia. To keep tutorials to a reasonable length, I cover basics here and delay further case studies (parameterized types, overloading, multiple dispatch) until the next one.
Preview
- A quick tour of Julia type taxonomy.
- Drill into type annotations and understand their impact on: code documentation, correctness, performance.
- Julia functions are always “virtual”. Julia supports functional OOP as a paradigm.
Types and inheritance
Everything in Julia that is a value also has a type, and types themselves are first-class objects (think T.class in Java or type(x) in Python). Subtyping relationships are set up and queried with the <: subtype operator:
julia> typeof(1)
Int64
julia> supertype(typeof(1))
Signed
julia> typeof(1) <: supertype(typeof(2))
true
julia> Int64 <: Signed
true
julia> Int64 >: Signed # there is also a 'supertype operator', used trivially here but more useful for parameterized types
false
Types known to a given runtime session form a tree (a directed graph, actually) rooted at Any.
Speaking of [type] trees
I found it handy to explore subtrees within this graph of types using the following helper function. It builds on subtypes() from InteractiveUtils.jl, which is already imported and available if you're in the REPL but may need an import in a script:
import InteractiveUtils
function subtypetree(T, depth = 0) # you might also want to add 'max_depth'...
    println('\t' ^ depth, T)
    for t in InteractiveUtils.subtypes(T)
        subtypetree(t, depth + 1)
    end
end
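For example, here is the subtree under AbstractFloat from one of my sessions (the exact set of leaves can vary with the Julia version and the packages you have loaded):

```julia
julia> subtypetree(AbstractFloat)
AbstractFloat
    BigFloat
    Float16
    Float32
    Float64
```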
Arithmetic types
There is a collection of (ahem) typical types for arithmetic: Int8, …, Int64, Int128, plus unsigned variants, Float16, Float32, Float64, BigInt and BigFloat, organized in this hierarchy:
julia> subtypetree(Number)
Number
    Complex
    Real
        AbstractFloat
            BigFloat
            Float16
            Float32
            Float64
        AbstractIrrational
            Irrational
        Integer
            Bool
            Signed
                BigInt
                Int128
                Int16
                Int32
                Int64
                Int8
            Unsigned
                UInt128
                UInt16
                UInt32
                UInt64
                UInt8
        Rational
The first thing that struck me here is how this keeps with Julia's emphasis on performance: Julia integers are not "big" by default (contrast with Python 3). And the rich spectrum of arithmetic bit widths is meaningful: Float16 and Float32 could be useful for GPU computing, while wide Int-types could work with SSE/AVX/etc. instructions and/or support efficient interfacing with native code. (I say "could" because I have no idea yet if that's really the case.)
Primitive types
Turns out all of these arithmetic types are not what you’d call “built into the compiler” but are rather defined in the language itself, as standard primitive types1:
primitive type Bool <: Integer 8 end
primitive type Int64 <: Signed 64 end
The syntax is
primitive type «name» <: «supertype» «bits» end
where the supertype is optional (and defaults to Any). I can apparently define my own primitive type, a 24-bit integer2:
julia> primitive type Int24 <: Signed 24 end
julia> subtypetree(Number)
Number
    Complex
    Real
        ...
        Integer
            Bool
            Signed
                ...
                Int24
                ...
(This is great… but how do I construct an Int24 or define arithmetic? Julia docs are not clear on that – I think I know how to proceed but that is outside of today's scope.)
Concrete vs abstract types
I’ve already used a struct
in the knapsack benchmark to represent knapsack items as instances of this type:
struct Item
    value ::Int64
    weight ::Int64
end
These are straightforward: Items contain fields (the type is composite) and can be instantiated (the type is concrete). They can also be examined at runtime using Julia's reflection facilities:
julia> sizeof(Item)
16
julia> fieldcount(Item)
2
julia> fieldnames(Item)
(:value, :weight)
julia> fieldtypes(Item)
(Int64, Int64)
julia> function showfields(T)
           for i in 1 : fieldcount(T)
               println(fieldoffset(T, i), '\t', fieldname(T, i), "\t::", fieldtype(T, i))
           end
       end
showfields (generic function with 1 method)
julia> showfields(Item)
0 value ::Int64
8 weight ::Int64
but a few things here are less obvious:
- structs default to being immutable unless explicitly marked as mutable.
- they are always final, i.e. cannot be further inherited from3. (This is also true of primitive and, in fact, any non-abstract types.)
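Immutability is enforced at runtime: attempting to assign to a field of an Item throws an error (a session of mine; the exact error text varies a bit across Julia versions):

```julia
julia> i = Item(1, 2)
Item(1, 2)

julia> i.value = 3
ERROR: setfield!: immutable struct of type Item cannot be changed
```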
Defaulting to immutability is not an arbitrary choice. The compiler can exploit it to optimize performance and memory usage, e.g. by "interning" equal values, which are indistinguishable when immutable. Compare
julia> i1 = Item(1, 2)
Item(1, 2)
julia> i2 = Item(1, 2)
Item(1, 2)
julia> i1 == i2 # equal
true
julia> i1 === i2 # actually, the same "interned" object
true
julia> i1 == Item(1, 3)
false
with
julia> mutable struct MutableItem # mutable version of 'Item'
           value ::Int64
           weight ::Int64
       end
julia> i1 = MutableItem(1, 2)
MutableItem(1, 2)
julia> i2 = MutableItem(1, 2)
MutableItem(1, 2)
julia> i1 == i2 # not equal!
false # <- surprised? looks like '==' needs to be defined for custom mutable types...
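If field-by-field equality is what you want for MutableItem, you can define == yourself. A minimal sketch (this particular definition is my choice, not anything Julia mandates):

```julia
# make '==' compare MutableItems field by field,
# mirroring the default behavior of immutable structs
Base.:(==)(a ::MutableItem, b ::MutableItem) =
    a.value == b.value && a.weight == b.weight
```

With this method defined, i1 == i2 from the session above becomes true, while i1 === i2 stays false.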
Julia also has abstract types which can't be instantiated, but are instead used to organize the type graph via shared parent nodes. You could also say they act as "marker" or "trait" base classes, like Number or Signed above. Since all non-abstract Julia types are final, any supertype is necessarily an abstract type (Any if not specified explicitly).
julia> abstract type MyInt <: Int32 end
ERROR: invalid subtyping in definition of MyInt
Stacktrace:
[1] ...
julia> abstract type MyInt <: supertype(Int32) end
julia> subtypetree(Number)
Number
    Complex
    Real
        ...
        Integer
            Bool
            Signed
                ...
                Int24
                ...
                MyInt
            ...
Type parameters
To various degrees, all three major categories of Julia types (primitive, composite, abstract) are available in other dynamic languages, either natively or via some libraries. Julia also offers something that takes it, if not into the realm of uber-powerful (and uber-complicated) C++ metaprogramming, then definitely into the realm of Java generics: all three type categories can be further parameterized with other types and values. I'll explore this in a future tutorial.
Type annotations
And now to the meat of this tutorial: type annotations. A type annotation in Julia looks like <thing>::<type>4 – it is an in-place modifier to a <thing> introduced by the :: operator.
Typeasserts vs variable declarations
The way I read the documentation, Julia type annotations can be applied to two kinds of <thing>s:

- [typeassert] expressions computing a value (and recall that everything in Julia is an expression):

x = y ::Float64 # promises to the runtime that at this point in the execution 'y' will be a Float64

- [variable type declaration] left-hand sides of assignments or declarations that introduce (local) variables:

x ::Float64 = y # declares a new local 'x', marks it as always containing Float64 values, and initializes with 'y' converted to Float64

This second case also covers typed fields of structs and named tuples:

struct Point
    x ::Float64 # this field will always contain only Float64 values
    y           # this field can contain any Julia value
end

The first case is a typeassert. The second kind of annotation marks the name/field to its left as constrained to values compatible with the given type, and also ensures that throughout the variable's scope all subsequent initializations of and assignments to it are filtered through an implicit conversion.
As a consequence, there are differences in runtime behavior and the information communicated to the system:
- With a typeassert, the compiler will create code that at runtime will check the annotated value for type compatibility and throw a TypeError if the check fails – but it will not attempt to coerce the computed value to the annotation type in any way. Type-asserted syntax <exp>::T is precisely equivalent to a call, possibly inlined, to typeassert(<exp>, T) followed by making use of the <exp> value.
- With a variable type declaration, any assignment <lhs>::T = <rhs> will effectively translate into <lhs> = convert(T, <rhs>), i.e. contain a call, possibly inlined, to convert(T, <rhs>). At runtime, every such assignment will attempt to coerce its right-hand side value to T, possibly resulting in a value that's only an approximation. Should this conversion fail, an exception will be thrown:
  - if no such conversion exists at all, a MethodError is thrown;
  - if T is an Integer (sub)type and cannot represent the expression value, an InexactError is thrown.
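Both failure modes are easy to trigger (f1 and f2 are throwaway names of mine):

```julia
julia> function f1()
           x ::Int64 = 1.5 # convert(Int64, 1.5): 1.5 is not representable as Int64
       end;

julia> f1()
ERROR: InexactError: Int64(1.5)

julia> function f2()
           x ::Int64 = "one" # no conversion from String to Int64 exists
       end;

julia> f2()
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Int64
```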
To appreciate the difference, compare
julia> function foo()
           x ::Float64 = 1 # implies 'convert(Float64, 1)'
           x, typeof(x)
       end
foo (generic function with 1 method)
julia> foo()
(1.0, Float64)
with
julia> function foo()
           x = 1 ::Float64 # implies 'typeassert(1, Float64)'
           x, typeof(x)
       end
foo (generic function with 1 method)
julia> foo()
ERROR: TypeError: in typeassert, expected Float64, got Int64
Literal 1 is of a (machine-dependent) Int type:
julia> typeof(1)
Int64
and even though it can be converted to a Float64 without loss, such a conversion is not even attempted in the second version of foo().
What about function signatures?
Unsurprisingly, it is also possible to type-annotate function arguments and return types:
function bar(x ::Float64) ::Float32
    sin(2π * x)
end
Function arguments
If you're coming from languages like C++ or Java where type conversions can happen as part of argument passing, you might think that x ::Float64 in bar() is like a local (typed) variable declaration, similar to the second case above, perhaps implying a call to something like convert(Float64, x) everywhere before bar() is invoked. That is not the case in Julia: no conversions ever take place as part of Julia function argument passing. In fact, Julia argument type annotations act more like those typeasserts: bar(x) will expect x to be a Float64 already.
There is a subtle difference from an in-place typeassert, however: with the above definition of bar() there will be no need to generate an implicit call to typeassert() at all because I will only be allowed to call it with Float64s. If I need sin(2π * x) for a Float64 input x, no problem. For any other type5, say, Int64, I will get a flat rejection not because a method call was tried and failed during Int64-to-Float64 input type conversion (TypeError) but because the requisite method (named "bar" and taking a single argument of type Int64) did not exist (MethodError). And since Float64 is a concrete type and, again, all concrete types are final in Julia, the universe of possible outcomes here shrinks dramatically:
julia> bar(0.75)
-1.0f0
julia> bar(1.)
-2.4492937f-16
julia> bar(1)
ERROR: MethodError: no method matching bar(::Int64)
Closest candidates are:
bar(::Float64) at ...
This may seem a little draconian, but it is connected to how Julia’s multiple dispatch works and is further ameliorated by Julia’s system of promoting function arguments to a common type.
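For instance, if I did want bar() to accept other numeric types, a common pattern (my own sketch, not part of the original definition) is a catch-all method that converts and re-dispatches:

```julia
# hypothetical convenience overload: accept any Real by
# converting it to Float64 and calling the specific method
bar(x ::Real) = bar(Float64(x))
```

With this in place, bar(1) dispatches to the Real method (more specific than Any, less specific than Float64), converts its argument, and then calls bar(::Float64).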
Function return types
Specifying the bar() return type to be Float32 is a way to ensure that the value being returned is passed through convert(Float32, …). Whether this is desired depends on software design. I can imagine situations where it could be used as a way to safely return "special" values:
function sqrt_or_nothing(x ::Float64) ::Union{Float64, Nothing}
    x < 0.0 ? nothing : √x
end
julia> @show sqrt_or_nothing(2.0)
sqrt_or_nothing(2.0) = 1.4142135623730951
1.4142135623730951
julia> @show sqrt_or_nothing(-2.0)
sqrt_or_nothing(-2.0) = nothing
Alternatively, it might be easier to reason about your code's behavior if most functions are strict about their return value types. Otherwise, it seems like it could be easy in Julia to accidentally return different types along different value return paths, which could cause inefficiencies or maybe even errors downstream:
function bar_clipped(x ::Float64)
    x < 0.0 ? 0 : sin(2π * x)
end
julia> typeof(bar_clipped(0.75))
Float64
julia> typeof(bar_clipped(-0.75))
Int64 # oops, use 0.0 literal instead of 0 above
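A declared return type papers over this particular mismatch (bar_clipped2 is a hypothetical name of mine): the Int64 literal 0 on the first branch now gets passed through convert(Float64, 0) on the way out, so both paths return Float64:

```julia
function bar_clipped2(x ::Float64) ::Float64
    x < 0.0 ? 0 : sin(2π * x) # the '0' branch is converted to 0.0 on return
end
```

Now typeof(bar_clipped2(-0.75)) yields Float64 just like the positive branch does.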
When are implicit conversions done?
Most of the cases of implicit calls to convert(…) have already been mentioned. Julia documentation offers this complete list:
- Assigning to an array converts to the array’s element type.
- Assigning to a field of an object converts to the declared type of the field.
- Constructing an object with new converts to the object’s declared field types.
- Assigning to a variable with a declared type (e.g. local x::T) converts to that type.
- A function with a declared return type converts its return value to that type.
- Passing a value to ccall converts it to the corresponding argument type.
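The first rule in action, from a throwaway session of mine (the container display format varies slightly between Julia versions):

```julia
julia> a = zeros(2) # element type is Float64
2-element Vector{Float64}:
 0.0
 0.0

julia> a[1] = 1; a # the Int64 value 1 was converted to 1.0 on assignment
2-element Vector{Float64}:
 1.0
 0.0
```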
Case study: functional OOP
So far, Julia type annotations have appeared to be a potentially beneficial (for code maintenance, performance), yet somehow optional, feature of the language. Let me now show a situation where annotations are truly necessary.
We saw how every function argument in Julia is always associated with a type. Is it possible for there to be multiple functions that all have the same name but different parameter types?
Not only is the answer "yes", it is actually more like "yes, it is meant to happen a lot": Julia thrives on maintaining multiple versions of the "same" function (called "methods") and figuring out which version to invoke for a given set of inputs. These versions are distinguished by annotating parameters with different types. Enter "multiple dispatch", a core paradigm of Julia programming6. The intuition is that Julia functions are essentially "always virtual": unlike other languages where a class method needs to be marked in a special way to support "late binding" (method dispatch based on the runtime, not compile-time, type of an object), the Julia runtime always dispatches all functions on the concrete runtime types of all their arguments. Because no argument position is "special" and a method does not "belong" to any particular parameter type, this form of polymorphism encourages a language design built around standalone functions, that is, functions that do not live inside any "classes". Some people call such designs "functional OOP". It makes a lot of sense for math-style coding due to symmetries in function parameters.
Now, one way to ease into Julia multiple dispatch is to consider its simplest edge case: single dispatch.
Serializing Julia objects to JSON
I am going to implement a simple serializer of Julia objects to JSON. This will be a toy example, useless in any kind of production setting. My starting point is the "interface" method to_JSON() calling visit(), which in turn starts as a single fallback that always fails:
function to_JSON(io ::IO, obj)
    visit(obj, io)
end

function visit(obj, io ::IO)
    error("default visit() called for obj type: ", typeof(obj))
end
I am going to anchor my design in the "virtual nature" of visit(obj, io), with execution routed based on the runtime type of obj. Looking at my Number type trees above, I can see that this overload can cover JSON numbers and booleans:
function visit(obj ::Real, io ::IO)
    print(io, obj)
end
This overload will kick in for any obj that belongs to a type derived from Real – because that is more specific than Any and because Julia's dispatch algorithm will always choose the most specific method signature to call in every situation.
Strings are equally easy, but the method body needs to quote them7, so I need a new method overload:
function visit(obj ::AbstractString, io ::IO)
    print(io, '"')
    print(io, obj)
    print(io, '"')
end
So far, I have told Julia to dispatch execution to either Real (and its subtypes, which include ints, floats, and booleans) or AbstractString (and its subtypes, including String and everything that is string-like).
By the way, the actual dispatch decision is based on both arguments, (obj, io), but io has the same type across all visit() methods, so the dispatch is effectively on the single argument obj.
Why do I suggest thinking of visit() as “virtual”? Think of visit(obj,…) as equivalent to obj.visit(…) in a language like Python, Java, C++, where the version of visit() to use depends on the runtime type of obj.
Also note how abstract parent types come in handy: I don’t need to code explicit visit(Float32,…), visit(Float64,…), visit(BigFloat,…) for all possible (and future!) leaves of the type tree because I can handle things at the level of abstract parent nodes.
By this point, my implementation roadmap should be apparent: I am going to keep adding more visit() overloads with obj parameter types chosen so as to partition the type universe into subtrees that correctly "carve out" each supported type of anything I expect to find inside my input. Taking the next step: for objects that can contain other objects, the virtual nature of visit() becomes critical:
function visit(obj ::AbstractArray, io ::IO)
    print(io, '[')
    for i in 1 : length(obj)
        i > 1 && print(io, ", ")
        visit(obj[i], io)
    end
    print(io, ']')
end
function visit(obj ::AbstractDict, io ::IO)
    print(io, '{')
    first = true
    for (k, v) in obj
        first ? first = false : print(io, ", ")
        visit(k ::AbstractString, io) # assert that key is a string
        print(io, " : ")
        visit(v, io)
    end
    print(io, '}')
end
The nested visit()s are already virtual, nothing else needs to be done to pick up a particular overload. There are also no “if-obj-type-is-…” condition checks – everything is as clean as in “pure” textbook OOP.
Just a handful of lines of code so far, and yet they already work on a variety of inputs:
julia> to_JSON(stdout, [1, false, zeros(2), [1.2345, 12, Dict("a" => true, "b" => 2.3, "c" => [1, 2, 3.4])]])
[1, false, [0.0, 0.0], [1.2345, 12, {"c" : [1.0, 2.0, 3.4], "b" : 2.3, "a" : true}]]
Not bad, looks like valid JSON to me.
Now, suppose that I would like to extend the set of supported “JSON-compatible” Julia types to also include tuples. I would like to output them as JSON arrays. All I need to do is add another overload:
function visit(obj ::Tuple, io ::IO)
    print(io, '[')
    for i in 1 : length(obj)
        i > 1 && print(io, ", ")
        visit(obj[i], io)
    end
    print(io, ']')
end
julia> to_JSON(stdout, [("tuple", (1, "nested", "tuple")), 1, false, zeros(2), [1.2345, 12, Dict("a" => true, "b" => 2.3, "c" => [1, 2, 3.4])]])
[["tuple", [1, "nested", "tuple"]], 1, false, [0.0, 0.0], [1.2345, 12, {"c" : [1.0, 2.0, 3.4], "b" : 2.3, "a" : true}]]
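Another natural extension (not part of the original code): JSON has null, and Julia has nothing, whose type is Nothing – one more overload covers it:

```julia
function visit(obj ::Nothing, io ::IO)
    print(io, "null")
end
```

After this, to_JSON(stdout, [1, nothing]) prints [1, null].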
If you’re interested in playing with various design alternatives yourself, you can find this entire example here.
Summary
In summary, I would like to call out some things about Julia typing that make Julia feel different from many languages:
- Julia supports user-definable primitive types which do not need to be “boxed”.
- All superclasses are abstract and all concrete classes are final.
- Mutability is part of type definition, not of argument/variable/field.
- Multiple dispatch is a primary paradigm (“all methods are virtual”). This design is consistent with lack of classic “objects” that forcefully bundle state with behavior.
- Duck-typing also works in Julia, perhaps even without any performance loss. But certain core Julia features necessitate some static typing.
1. This is what the public documentation says. Examining boot.jl of my Julia install shows that many of these types are actually implemented in C. ^
2. As I write this, bit widths must be multiples of 8. ^
3. Just like classes marked with the final keyword in C++ or Java. ^
4. I use an extra blank before :: but you don't have to. ^
5. To simplify the narrative, I am glossing over possibilities for promotion of x to another type. ^
6. From what I can tell so far, that is. ^
7. Note that this toy serializer doesn't bother with backslash escapes, Unicode, etc. ^