Julia type annotations: what Python 'type hints' wish they were

Julia is unusual in that it is a dynamically typed language (meaning you don’t have to declare variable types “statically”, in program text), while at the same time supporting a rich mechanism for communicating type information to the JIT compiler and runtime. Effectively, Julia has both dynamic and static typing. When provided, static type information can have a non-trivial impact on the behavior of Julia code, in terms of both functionality and performance.

This distinctive typing seems to be one of the defining features of Julia. To keep tutorials to a reasonable length, I cover basics here and delay further case studies (parameterized types, overloading, multiple dispatch) until the next one.

Preview

  • A quick tour of Julia type taxonomy.
  • Drill into type annotations and understand their impact on: code documentation, correctness, performance.
  • Julia functions are always “virtual”. Julia supports functional OOP as a paradigm.

Types and inheritance

Everything in Julia that is a value also has a type, and types are themselves first-class objects (think T.class in Java or type(x) in Python). Subtyping relationships are set up and queried with the <: subtype operator:

julia> typeof(1)
Int64

julia> supertype(typeof(1))
Signed

julia> typeof(1) <: supertype(typeof(2))
true

julia> Int64 <: Signed
true

julia> Int64 >: Signed # there is also a 'supertype operator', used trivially here but more useful for parameterized types
false

Types known to a given runtime session form a tree (a directed graph, actually) rooted at Any.
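One way to see the path from any type up to the root is to follow supertype() repeatedly. The helper below is a minimal sketch; the name supertype_chain is mine and not part of Julia:

```julia
# Collect the chain of supertypes from T up to the root type Any.
function supertype_chain(T::Type)
    chain = Type[T]
    while T !== Any
        T = supertype(T)
        push!(chain, T)
    end
    return chain
end

println(join(supertype_chain(Int64), " <: "))
# Int64 <: Signed <: Integer <: Real <: Number <: Any
```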

Speaking of [type] trees

I found it handy to explore subtrees within this graph of types using the following helper function. It builds on subtypes() from InteractiveUtils.jl, which is already imported and available if you’re in the REPL but may need an import in a script:

import InteractiveUtils

function subtypetree(T, depth = 0) # you might also want to add 'max_depth'...
    println('\t' ^ depth, T)
    for t in InteractiveUtils.subtypes(T)
        subtypetree(t, depth + 1)
    end
end
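The 'max_depth' idea hinted at in the comment could look something like the sketch below. The name subtypetree_capped, the keyword arguments, and the explicit io parameter are my own choices for illustration:

```julia
import InteractiveUtils

# Variant of the helper above with a depth cap and an explicit output stream.
function subtypetree_capped(io::IO, T; depth = 0, max_depth = typemax(Int))
    println(io, '\t' ^ depth, T)
    depth < max_depth || return
    for t in InteractiveUtils.subtypes(T)
        subtypetree_capped(io, t; depth = depth + 1, max_depth = max_depth)
    end
end

# Print Integer and its direct subtypes only.
subtypetree_capped(stdout, Integer; max_depth = 1)
```

Writing to an io argument rather than straight to stdout makes it easy to capture the tree in an IOBuffer.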

Arithmetic types

There is a collection of (ahem) typical types for arithmetic:

  • Int8, …, Int64, Int128, plus unsigned variants,
  • Float16, Float32, Float64,
  • BigInt and BigFloat,

organized in this hierarchy:

julia> subtypetree(Number)
Number
    Complex
    Real
        AbstractFloat
            BigFloat
            Float16
            Float32
            Float64
        AbstractIrrational
            Irrational
        Integer
            Bool
            Signed
                BigInt
                Int128
                Int16
                Int32
                Int64
                Int8
            Unsigned
                UInt128
                UInt16
                UInt32
                UInt64
                UInt8
        Rational

The first thing that struck me here is Julia’s keeping with its emphasis on performance: Julia integers are not “big” by default (contrast with Python 3). And the rich spectrum of arithmetic bit widths is meaningful: Float16 and Float32 could be useful for GPU computing, while wide Int-types could work with SSE/AVX/etc instructions and/or support efficient interfacing with native code. (I say “could” because I have no idea yet if that’s really the case.)
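A quick check of what “not big by default” means in practice; this is just a small illustration using standard functions (typemax, typemin, big, sizeof):

```julia
# Fixed-width integers wrap around silently on overflow...
wrapped = typemax(Int64) + 1
println(wrapped == typemin(Int64))       # true

# ...whereas BigInt grows as needed (at a cost in speed and memory).
grown = big(typemax(Int64)) + 1
println(grown)                           # 9223372036854775808

# Bit widths are visible through sizeof (in bytes).
println(sizeof(Float16), " ", sizeof(Float32), " ", sizeof(Int128))  # 2 4 16
```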

Primitive types

It turns out that all of these arithmetic types are not what you’d call “built into the compiler” but are rather defined in the language itself, as standard primitive types [1]:

primitive type Bool  <: Integer   8 end
primitive type Int64 <: Signed   64 end

The syntax is

primitive type «name» <: «supertype» «bits» end

where the supertype is optional (and defaults to Any). I can apparently define my own primitive type, a 24-bit integer [2]:

julia> primitive type Int24 <: Signed 24 end

julia> subtypetree(Number)
Number
    Complex
    Real
        ...
        Integer
            Bool
            Signed
                ...
                Int24
                ...

(This is great… but how do I construct an Int24 or define arithmetic? Julia docs are not clear on that – I think I know how to proceed but that is outside of today’s scope.)
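For what it’s worth, here is one possible direction, sketched under assumptions rather than taken from the docs: reinterpret() can move bits between primitive types of the same size, so a custom type whose width matches a built-in integer can borrow that integer’s arithmetic. I use a hypothetical 32-bit type Tick32 (not Int24) because there is no built-in 24-bit type to reinterpret from; rawvalue is likewise a made-up helper:

```julia
# Hypothetical 32-bit primitive type; not part of Julia.
primitive type Tick32 32 end

# Construct by reinterpreting the bits of an Int32 (same size in bits).
Tick32(x::Integer) = reinterpret(Tick32, Int32(x))

# Convert back to a built-in integer to borrow its arithmetic.
rawvalue(t::Tick32) = reinterpret(Int32, t)
Base.:+(a::Tick32, b::Tick32) = Tick32(rawvalue(a) + rawvalue(b))
Base.show(io::IO, t::Tick32) = print(io, "Tick32(", rawvalue(t), ")")

println(Tick32(2) + Tick32(40))
```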

Concrete vs abstract types

I’ve already used a struct in the knapsack benchmark to represent knapsack items as instances of this type:

struct Item
    value   ::Int64
    weight  ::Int64
end

These are straightforward: Items contain fields (the type is composite) and can be instantiated (the type is concrete). They can also be examined at runtime using Julia’s reflection facilities:

julia> sizeof(Item)
16

julia> fieldcount(Item)
2

julia> fieldnames(Item)
(:value, :weight)

julia> fieldtypes(Item)
(Int64, Int64)

julia> function showfields(T)
           for i in 1 : fieldcount(T)
               println(fieldoffset(T, i), '\t', fieldname(T, i), "\t::", fieldtype(T, i))
           end
       end
showfields (generic function with 1 method)

julia> showfields(Item)
0   value   ::Int64
8   weight  ::Int64

but a few things here are less obvious:

  1. structs default to being immutable unless explicitly marked as mutable.
  2. they are always final, i.e. cannot be further inherited from [3]. (This is also true of primitive and, in fact, any non-abstract types.)

Defaulting to immutability is not an arbitrary choice. It can be exploited by the compiler to optimize performance and memory usage by “interning” values that are indistinguishable if they are equal. Compare

julia> i1 = Item(1, 2)
Item(1, 2)

julia> i2 = Item(1, 2)
Item(1, 2)

julia> i1 == i2  # equal
true

julia> i1 === i2 # actually, the same "interned" object
true

julia> i1 == Item(1, 3)
false

with

julia> mutable struct MutableItem # mutable version of 'Item'
           value   ::Int64
           weight  ::Int64
       end

julia> i1 = MutableItem(1, 2)
MutableItem(1, 2)

julia> i2 = MutableItem(1, 2)
MutableItem(1, 2)

julia> i1 == i2  # not equal!
false # <- surprised? looks like '==' needs to be defined for custom mutable types...
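The generic fallback for == is ===, which for mutable objects means object identity. A minimal sketch of opting into field-by-field equality for MutableItem follows; defining hash alongside == is my addition, so that Dict and Set stay consistent:

```julia
mutable struct MutableItem
    value  ::Int64
    weight ::Int64
end

# Field-by-field equality; without this, == falls back to === (object identity).
Base.:(==)(a::MutableItem, b::MutableItem) =
    a.value == b.value && a.weight == b.weight

# Keep hash consistent with ==, so Dict and Set treat equal items alike.
Base.hash(a::MutableItem, h::UInt) = hash((a.value, a.weight), h)

i1 = MutableItem(1, 2)
i2 = MutableItem(1, 2)
println(i1 == i2)    # true
println(i1 === i2)   # false: still two distinct objects
```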

Julia also has abstract types which can’t be instantiated, but are instead used to organize the type graph via shared parent nodes. You could also say they act as “marker” or “trait” base classes, like Number or Signed above. Since all non-abstract Julia types are final, any supertype is necessarily an abstract type (Any if not specified explicitly).

julia> abstract type MyInt <: Int32 end
ERROR: invalid subtyping in definition of MyInt
Stacktrace:
 [1] ...

julia> abstract type MyInt <: supertype(Int32) end

julia> subtypetree(Number)
Number
    Complex
    Real
        ...
        Integer
            Bool
            Signed
                ...
                Int24
                ...
                MyInt
        ...
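To make the abstract/concrete split a little more tangible, here is a small sketch with made-up types (Shape, Circle, Square and the area function are hypothetical, chosen only for illustration):

```julia
abstract type Shape end             # cannot be instantiated, only subtyped

struct Circle <: Shape
    radius ::Float64
end

struct Square <: Shape
    side ::Float64
end

area(c ::Circle) = π * c.radius^2
area(s ::Square) = s.side^2

println(isabstracttype(Shape), " ", isconcretetype(Circle))   # true true
println(Circle <: Shape)                                       # true
println(sum(area, Shape[Circle(1.0), Square(2.0)]))
```

The abstract parent serves purely as a shared node in the type graph: it carries no fields, yet lets a single array hold both kinds of shape.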

Type parameters

To various degrees, all of the three major categories of Julia types (primitive, composite, abstract) are available in other dynamic languages, either natively or via some libraries. Julia also offers something that gets it if not into the realm of uber-powerful (and uber-complicated) C++ metaprogramming, then definitely into the realm of Java generics: all three type categories can be further parameterized with other types and values. I’ll explore this in a future tutorial.

Type annotations

And now to the meat of this tutorial: type annotations. A type annotation in Julia looks like <thing>::<type> [4] – it is an in-place modifier to a <thing> introduced by the :: operator.

Typeasserts vs variable declarations

The way I read the documentation, Julia type annotations can be applied to two types of <thing>s:

  1. [typeassert] expressions computing a value (and recall that everything in Julia is an expression):

    x = y ::Float64 # promises to the runtime that at this point in the execution 'y' will be a Float64
    
  2. [variable type declaration] left-hand sides of assignments or declarations that introduce (local) variables:

    x ::Float64 = y # declares a new local 'x', marks it as always containing Float64 values, and initializes with 'y' converted to Float64
    

    This second case also covers typed fields of structs and named tuples:

    struct Point
      x ::Float64 # this field will always contain only Float64 values
      y           # this field can contain any Julia value
    end
    

The first case is a typeassert. The second kind of annotation marks the name/field to its left as constrained to values compatible with the given type, and also ensures that throughout the variable’s scope all subsequent initializations of and assignments to it are filtered through an implicit conversion.

As a consequence, there are differences in runtime behavior and the information communicated to the system:

  1. With a typeassert the compiler will create code that at runtime will check the annotated value for type compatibility and throw a TypeError if the check fails – but it will not attempt to coerce the computed value to the annotation type in any way. Type-asserted syntax <exp>::T is precisely equivalent to a call, possibly inlined, to typeassert(<exp>, T) followed by making use of the <exp> value.
  2. With a variable type declaration any assignment <lhs>::T = <rhs> will effectively translate into <lhs> = convert(T, <rhs>), i.e. contain a call, possibly inlined, to convert(T, <rhs>). And at runtime, every such assignment will attempt to coerce its right-hand side value to T, possibly resulting in a value that’s only an approximation. Should this conversion fail, an exception will be thrown:
    • if no such conversion exists at all, a MethodError is thrown;
    • if T is an Integer (sub)type and cannot represent the expression value, an InexactError is thrown.

To appreciate the difference, compare

julia> function foo()
         x ::Float64 = 1 # implies 'convert(Float64, 1)'
         x, typeof(x)
       end
foo (generic function with 1 method)

julia> foo()
(1.0, Float64)

with

julia> function foo()
         x = 1 ::Float64 # implies 'typeassert(1, Float64)'
         x, typeof(x)
       end
foo (generic function with 1 method)

julia> foo()
ERROR: TypeError: in typeassert, expected Float64, got Int64

Literal 1 is of a (machine-dependent) Int type:

julia> typeof(1)
Int64

and even though it can be converted to a Float64 without loss, such a conversion is not even attempted in the second version of foo().
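The three failure modes listed earlier (TypeError, MethodError, InexactError) can be observed side by side. This is a small sketch; declared, asserted and failure are hypothetical helper names of my own:

```julia
# Variable declaration: every assignment goes through convert(Int64, ...).
function declared(v)
    x ::Int64 = v
    return x
end

# Typeassert: the value must already be an Int64; no conversion is attempted.
function asserted(v)
    return v ::Int64
end

# Return the type of the exception thrown by f(), or `nothing` if it succeeds.
function failure(f)
    try
        f()
        return nothing
    catch err
        return typeof(err)
    end
end

println(declared(2.0))                      # 2  (lossless conversion)
println(failure(() -> declared(2.5)))       # InexactError
println(failure(() -> declared("2")))       # MethodError
println(failure(() -> asserted(2.0)))       # TypeError
```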

As I write this, Julia does not yet support type declarations for global variables (this restriction was lifted in Julia 1.8) – this is the reason I wrapped the above examples into functions.

What about function signatures?

Unsurprisingly, it is also possible to type-annotate function arguments and return types:

function bar(x ::Float64) ::Float32
    sin(2π * x)
end

Function arguments

If you’re coming from languages like C++ or Java where type conversions can happen as part of argument passing, you might think that x ::Float64 in bar() is like a local (typed) variable declaration, similar to the second case above, perhaps implying a call to something like convert(Float64, x) everywhere before bar() is invoked. That is not the case in Julia: no conversions ever take place as part of Julia function argument passing. In fact, Julia argument type annotations are actually more like those typeasserts: bar(x) will expect x to be a Float64 already.

There is a subtle difference from an in-place typeassert, however: with the above definition of bar() there is no need to generate an implicit call to typeassert() at all, because I will only be allowed to call it with Float64s. If I need sin(2π * x) for a Float64 input x, no problem. For any other type [5], say, Int64, I will get a flat rejection, not because a method call was tried and failed during Int64-to-Float64 input type conversion (TypeError) but because the requisite method (named “bar” and taking a single argument of type Int64) does not exist (MethodError). And since Float64 is a concrete type and, again, all concrete types are final in Julia, the universe of possible outcomes here shrinks dramatically:

julia> bar(0.75)
-1.0f0

julia> bar(1.)
-2.4492937f-16

julia> bar(1)
ERROR: MethodError: no method matching bar(::Int64)
Closest candidates are:
  bar(::Float64) at ...

This may seem a little draconian, but it is connected to how Julia’s multiple dispatch works and is further ameliorated by Julia’s system of promoting function arguments to a common type.
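If the strictness gets in the way, one option (a sketch of my own, not something the language does for you) is to add a second, wider method that converts explicitly and re-dispatches to the Float64 version:

```julia
function bar(x ::Float64) ::Float32
    sin(2π * x)
end

# Hypothetical fallback: accept any other Real, convert explicitly,
# then re-dispatch to the Float64 method above.
bar(x ::Real) = bar(Float64(x))

println(bar(1))               # now works: Int64 goes through the fallback
println(bar(1) === bar(1.0))  # true
```

The conversion is now visible in the code instead of being hidden in the calling convention, and Float64 inputs still go straight to the more specific method.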

Function return types

Specifying the return type of bar() to be Float32 is a way to ensure that the value being returned is passed through a convert(Float32, …). Whether this is desired depends on software design. I can imagine situations where it could be used as a way to safely return “special” values:

function sqrt_or_nothing(x ::Float64) ::Union{Float64, Nothing}
    x < 0.0 ? nothing : √x
end

julia> @show sqrt_or_nothing(2.0)
sqrt_or_nothing(2.0) = 1.4142135623730951
1.4142135623730951

julia> @show sqrt_or_nothing(-2.0)
sqrt_or_nothing(-2.0) = nothing

Alternatively, it might be easier to reason about your code behavior if most functions are strict about their return value types. Otherwise, it seems like it could be easy in Julia to accidentally return different types along different value return paths, which could cause inefficiencies or maybe even errors downstream:

function bar_clipped(x ::Float64)
    x < 0.0 ? 0 : sin(2π * x)
end

julia> typeof(bar_clipped(0.75))
Float64

julia> typeof(bar_clipped(-0.75))
Int64 # oops, use 0.0 literal instead of 0 above
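Besides fixing the literal, there are at least two other ways to make both return paths agree. Both variants below are sketches with hypothetical names (bar_clipped_strict, bar_clipped_zero):

```julia
# Declared return type: every return path is converted to Float64.
function bar_clipped_strict(x ::Float64) ::Float64
    x < 0.0 ? 0 : sin(2π * x)
end

# Alternative without an annotation: use a zero of the input's own type.
function bar_clipped_zero(x ::Float64)
    x < 0.0 ? zero(x) : sin(2π * x)
end

println(typeof(bar_clipped_strict(-0.75)))  # Float64
println(typeof(bar_clipped_zero(-0.75)))    # Float64
```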

When are implicit conversions done?

Most of the cases of implicit calls to convert(…) have already been mentioned. Julia documentation offers this complete list:

  • Assigning to an array converts to the array’s element type.
  • Assigning to a field of an object converts to the declared type of the field.
  • Constructing an object with new converts to the object’s declared field types.
  • Assigning to a variable with a declared type (e.g. local x::T) converts to that type.
  • A function with a declared return type converts its return value to that type.
  • Passing a value to ccall converts it to the corresponding argument type.
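A few of these cases are easy to try out directly. The snippet below is a small illustration; the Reading type is hypothetical:

```julia
mutable struct Reading            # hypothetical type, for illustration only
    value ::Float32
end

a = zeros(Float64, 2)
a[1] = 1                          # array assignment: stored as 1.0
r = Reading(1)                    # default constructor: stored as 1.0f0
r.value = 2                       # field assignment: stored as 2.0f0

println(a[1], " ", typeof(a[1]))            # 1.0 Float64
println(r.value, " ", typeof(r.value))      # 2.0 Float32

# A lossy conversion is refused rather than silently rounded:
ints = [1, 2, 3]
try
    ints[1] = 1.5
catch err
    println(typeof(err))                    # InexactError
end
```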

Case study: functional OOP

So far, Julia type annotations have appeared to be a potentially beneficial (for code maintenance, performance), yet somewhat optional, feature of the language. Let me now show a situation where annotations are truly necessary.

We saw how every function argument in Julia is always associated with a type. Is it possible for there to be multiple functions that all have the same name but different parameter types?

Not only is the answer “yes”, it is actually kind of like “yes, it is meant to happen a lot”: Julia thrives on maintaining multiple versions of the “same” function (called “methods”) and figuring out which version to invoke for a given set of inputs. These versions are distinguished by annotating parameters with different types. Enter “multiple dispatch”, a core paradigm of Julia programming [6]. The intuition is that Julia functions are essentially “always virtual”: unlike other languages where a class method needs to be marked in a special way to support “late binding” (method dispatch based on the runtime, not compile-time, type of an object), the Julia runtime system always dispatches all functions on the concrete runtime types of all their arguments. Because no argument position is “special” and the method does not “belong” to any particular parameter type, this form of polymorphism usually goes with a language design based on standalone functions, that is, functions that do not live inside any “classes”. Some people call such designs “functional OOP”. It makes a lot of sense for math-style coding due to symmetries in function parameters.
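Before narrowing down to single dispatch, here is a tiny sketch of dispatch on two arguments at once. All names (Thing, Asteroid, Ship, collide) are hypothetical and exist only for this illustration:

```julia
abstract type Thing end
struct Asteroid <: Thing end
struct Ship     <: Thing end

collide(a ::Thing,    b ::Thing)    = "generic collision"
collide(a ::Asteroid, b ::Ship)     = "ship takes damage"
collide(a ::Ship,     b ::Asteroid) = collide(b, a)      # symmetric case
collide(a ::Ship,     b ::Ship)     = "ships bounce"

# The declared element type is just Thing; dispatch uses the runtime types.
things = Thing[Asteroid(), Ship()]
println(collide(things[1], things[2]))   # ship takes damage
println(collide(things[2], things[1]))   # ship takes damage
println(collide(things[1], things[1]))   # generic collision
```

Neither argument “owns” collide(); the method is selected from the combination of both runtime types.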

Now, one way to ease into Julia multiple dispatch is to consider its simplest edge case: single dispatch.

Serializing Julia objects to JSON

I am going to implement a simple serializer of Julia objects to JSON. This will be a toy example, useless in any kind of production setting. My starting point is the “interface” method to_JSON() calling visit(), which in turn starts out as a single fallback that always fails:

function to_JSON(io ::IO, obj)
    visit(obj, io)
end

function visit(obj, io ::IO)
    error("default visit() called for obj type: ", typeof(obj))
end

I am going to anchor my design in the “virtual nature” of visit(obj, io), with execution routed based on the runtime type of obj. Looking at my Number type trees above, I can see that this overload can cover JSON numbers and booleans:

function visit(obj ::Real, io ::IO)
    print(io, obj)
end

This overload will kick in for any obj that belongs to a type derived from Real – because that is more specific than Any and because Julia’s dispatch algorithm will always choose the most specific method signature to call in every situation.

Strings are equally easy, but the method body needs to quote them [7], so I need a new method overload:
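The “most specific method wins” rule is easy to see in isolation with a throwaway function (kind is a hypothetical name, unrelated to the serializer):

```julia
kind(x)            = "something else"
kind(x ::Real)     = "real number"
kind(x ::Integer)  = "integer"
kind(x ::Bool)     = "boolean"

println(kind(true))     # boolean
println(kind(42))       # integer
println(kind(2.5))      # real number
println(kind("text"))   # something else
```

The order in which the methods are defined does not matter; only how closely each signature matches the argument’s type.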

function visit(obj ::AbstractString, io ::IO)
    print(io, '"')
    print(io, obj)
    print(io, '"')
end
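If you did want minimal escaping, something along these lines could be called from the string method instead of printing obj as-is. It is a deliberately partial sketch: json_quote is a hypothetical helper, and it handles only backslashes, double quotes and newlines:

```julia
# Hypothetical helper: quote a string for JSON, escaping only the backslash,
# the double quote, and the newline. Other control characters and Unicode
# escapes are deliberately left out of this sketch.
function json_quote(s ::AbstractString)
    escaped = replace(s, "\\" => "\\\\")        # backslashes first
    escaped = replace(escaped, "\"" => "\\\"")
    escaped = replace(escaped, "\n" => "\\n")
    return "\"" * escaped * "\""
end

println(json_quote("say \"hi\""))   # "say \"hi\""
```

Backslashes have to be handled first; otherwise the backslashes introduced by the later replacements would themselves get doubled.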

So far, I have told Julia to dispatch execution to either Real (and its subtypes, which include ints, floats, and booleans) or AbstractString (and its subtypes, including String and everything that is string-like).

By the way, the actual dispatch decision is based on (obj, io) but io happens to be the same across all visit()s, so the dispatch is effectively on a single argument obj.

Why do I suggest thinking of visit() as “virtual”? Think of visit(obj,…) as equivalent to obj.visit(…) in a language like Python, Java, C++, where the version of visit() to use depends on the runtime type of obj.

Also note how abstract parent types come in handy: I don’t need to code explicit visit(Float32,…), visit(Float64,…), visit(BigFloat,…) for all possible (and future!) leaves of the type tree because I can handle things at the level of abstract parent nodes.

By this point, my implementation roadmap should be apparent: I am going to keep adding more visit() overloads with obj parameter types chosen so as to partition the type universe into subtrees that correctly “carve out” each supported type of anything I expect to find inside my input. Taking the next step, for objects that can contain other objects the virtual nature of visit() becomes critical:

function visit(obj ::AbstractArray, io ::IO)
    print(io, '[')
    # enumerate() avoids assuming 1-based indexing, which not every AbstractArray has
    for (i, x) in enumerate(obj)
        i > 1 && print(io, ", ")
        visit(x, io)
    end
    print(io, ']')
end

function visit(obj ::AbstractDict, io ::IO)
    print(io, '{')
    first = true
    for (k, v) in obj
        first ? first = false : print(io, ", ")
        visit(k ::AbstractString, io) # assert that key is a string
        print(io, " : ")
        visit(v, io)
    end
    print(io, '}')
end

The nested visit()s are already virtual, nothing else needs to be done to pick up a particular overload. There are also no “if-obj-type-is-…” condition checks – everything is as clean as in “pure” textbook OOP.

Just a handful of lines of code so far, and yet they already work on a variety of inputs:

julia> to_JSON(stdout, [1, false, zeros(2), [1.2345, 12, Dict("a" => true, "b" => 2.3, "c" => [1, 2, 3.4])]])
[1, false, [0.0, 0.0], [1.2345, 12, {"c" : [1.0, 2.0, 3.4], "b" : 2.3, "a" : true}]]

Not bad, looks like valid JSON to me.

Now, suppose that I would like to extend the set of supported “JSON-compatible” Julia types to also include tuples. I would like to output them as JSON arrays. All I need to do is add another overload:

function visit(obj ::Tuple, io ::IO)
    print(io, '[')
    for i in 1 : length(obj)
        i > 1 && print(io, ", ")
        visit(obj[i], io)
    end
    print(io, ']')
end

julia> to_JSON(stdout, [("tuple", (1, "nested", "tuple")), 1, false, zeros(2), [1.2345, 12, Dict("a" => true, "b" => 2.3, "c" => [1, 2, 3.4])]])
[["tuple", [1, "nested", "tuple"]], 1, false, [0.0, 0.0], [1.2345, 12, {"c" : [1.0, 2.0, 3.4], "b" : 2.3, "a" : true}]]

Observe how no earlier visit()s needed to be modified to become “aware” of this new support for tuples. In other words, a method added later hooks into a set of mutual method invocations coded earlier, with no ostensible “recompilation”.
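The same trick extends to other values. As a further sketch, the snippet below maps Julia’s nothing to JSON null (my assumption about a sensible mapping) and uses sprint() to capture the output as a String. The first few definitions are compact copies of earlier methods, repeated only so that the snippet runs on its own:

```julia
# Minimal copies of earlier methods, repeated so this snippet runs on its own.
to_JSON(io ::IO, obj) = visit(obj, io)
visit(obj ::Real, io ::IO) = print(io, obj)
function visit(obj ::AbstractArray, io ::IO)
    print(io, '[')
    for (i, x) in enumerate(obj)
        i > 1 && print(io, ", ")
        visit(x, io)
    end
    print(io, ']')
end

# New: map Julia's `nothing` to JSON null (an assumption of this sketch).
visit(obj ::Nothing, io ::IO) = print(io, "null")

# sprint calls to_JSON(io, obj) on an in-memory buffer and returns the text.
println(sprint(to_JSON, [1, nothing, [2.5, nothing]]))   # [1, null, [2.5, null]]
```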

If you’re interested in playing with various design alternatives yourself, you can find this entire example here.

Summary

In summary, I would like to call out some things about Julia typing that make Julia feel different from many languages:

  • Julia supports user-definable primitive types which do not need to be “boxed”.
  • All supertypes are abstract and all concrete types are final.
  • Mutability is part of type definition, not of argument/variable/field.
  • Multiple dispatch is a primary paradigm (“all methods are virtual”). This design is consistent with lack of classic “objects” that forcefully bundle state with behavior.
  • Duck-typing also works in Julia, perhaps even without any performance loss. But certain core Julia features necessitate some static typing.

  1. This is what the public documentation says. Examining boot.jl of my Julia install shows that many of these types are actually implemented in C.
  2. As I write this, bit widths must be multiples of 8.
  3. Just like classes marked with the final keyword in C++ or Java.
  4. I use an extra blank before :: but you don’t have to.
  5. To simplify the narrative, I am glossing over possibilities for promotion of x to another type.
  6. From what I can tell so far, that is.
  7. Note that this toy serializer doesn’t bother with backslash escapes, Unicode, etc.