Abstract and Parametric Types in Julia

Introduction

When I first started using Julia, I thought I should write my code as generically as possible, using abstract types wherever I could, so that users could call my functions with floating-point numbers or with exact rationals and obtain the corresponding result.

Later I realized that abstract types in field declarations are not simply replaced by the corresponding concrete types at compile time, which causes HUGE overheads in computation time (type-unstable code). Let’s look at an example.

[su_box title=”Example” style=”glass” box_color=”#08ef8d” radius=”8″]
I have a type C which, for the sake of this example, really just holds one value. I want it to be instantiable with any kind of real number (especially Float64 and Rational{Int64}):

julia> type C
         val::Real
       end

Furthermore, the function f should return the val of any given instance of C, incremented by one:

julia> function f(c::C)
         return c.val + 1
       end

[/su_box]

This code works, but it incurs exactly the kind of abstract-type overhead described above. One can see this by calling

julia> @code_warntype f(C(1.))
#> Variables:
#> c::C
#> 
#> Body:
#> begin # none, line 2:
#> return (top(getfield))(c::C,:val)::Real + 1::Any
#> end::Any

(Note: this requires at least Julia v0.4.0.)

We can see that although the value stored in the C we pass here is definitely a Float64, the function f cannot know this and thus has to deal with the generic Real type (causing a lot of boxing/unboxing in the LLVM code).
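The same mismatch can be checked without reading compiler output. The following is a minimal sketch, assuming Julia 1.x, where the `type` keyword has been replaced by `struct`. The names `AbstractFieldC` and `g` are made up for this illustration and stand in for the C and f above.

```julia
# Assumes Julia 1.x syntax (`struct` instead of the older `type` keyword).
# `AbstractFieldC` and `g` are hypothetical stand-ins for `C` and `f` above.
struct AbstractFieldC
    val::Real          # abstract field type, as in the example above
end

g(c::AbstractFieldC) = c.val + 1

c = AbstractFieldC(1.0)

# The value that is actually stored is a Float64 ...
typeof(c.val)                                     # Float64

# ... but the declared field type is abstract, and that is all the
# compiler can rely on when it compiles g for an AbstractFieldC argument.
fieldtype(AbstractFieldC, :val)                   # Real
isconcretetype(fieldtype(AbstractFieldC, :val))   # false

# The result is still correct, only slower to compute.
g(c)                                              # 2.0
```

On such a definition, `@code_warntype g(c)` should still flag the field access as non-concrete, although the exact printout differs between Julia versions.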

[su_box title=”HIGHLIGHT” style=”glass” box_color=”#e22426″ radius=”8″]

This raises two questions:

  1. Why can’t Julia just produce one version of the code for Float64, one for Rational{Int64}, etc., and how can we circumvent this problem?
  2. What are abstract types even good for, if they cannot be used like this and force us to write explicit code for Float64 and Rational{Int64} instead of just one version for Real?

[/su_box]

 

Answer to the first question

The type definition for C doesn’t allow us to know from the outside what is stored in its field. And, as stated in the manual,

[…] the compiler uses the types of objects, not their values, to determine how to build code.

Thus, to make the calling function aware of the type of the value inside C, there has to be a specialized version of C for each type of its value. This is where parametric types come in. A parametric type expands to such specializations, so that the type of the field becomes part of the type of C itself.

julia> type C{T<:Real}
         val::T
       end

When we now call f like

julia> f(C{Int64}(5))

this just works, with good performance, because we have made the type of val in C explicit. The especially cool thing is that, because the type parameter T of our type C can be inferred from the types of the arguments to the constructor, we can even leave out the explicit declaration:

julia> f(C(5))

without loss of performance.
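To make the inference of the type parameter visible, here is a small sketch, again assuming Julia 1.x `struct` syntax. `ParamC` and `h` are hypothetical names used only for this illustration; they mirror the parametric C and the function f.

```julia
# Assumes Julia 1.x syntax; `ParamC` and `h` are hypothetical stand-ins
# for the parametric `C` and the function `f` from the text.
struct ParamC{T<:Real}
    val::T
end

h(c::ParamC) = c.val + 1

# The constructor infers T from its argument, so each call below
# produces a different concrete type.
typeof(ParamC(5))       # ParamC{Int}
typeof(ParamC(1.0))     # ParamC{Float64}
typeof(ParamC(1//2))    # ParamC{Rational{Int}}

# The field type is now concrete for every specialization.
isconcretetype(fieldtype(ParamC{Float64}, :val))   # true

# One generic definition of h serves all of them.
h(ParamC(5))            # 6
h(ParamC(1.0))          # 2.0
h(ParamC(1//2))         # 3//2

# Inferred return type for a Float64 payload (Base.return_types is an
# unexported helper, used here only for inspection).
Base.return_types(h, (ParamC{Float64},))   # [Float64]
```

The explicit form `ParamC{Int64}(5)` and the inferred form `ParamC(5)` build the same type on a 64-bit machine, so the choice between them is purely a matter of readability.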

Answer to the second question

Reconsider the function f above. Suppose that instead of always adding 1, we add an arbitrary constant of type Real.

julia> function f(c::C,d::Real)
         return c.val + d
       end

No matter what kind of d we choose when calling this function, we always get performant code, because e.g. 3.3 is always a Float64, and thus the types at the call site are all concrete. Any function can be written this generically, as long as it contains only operations that are defined for all the corresponding types (e.g. here + is defined for all Numbers).
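As a sketch of how a single abstractly typed signature serves several concrete call sites, consider the following. It assumes Julia 1.x syntax, and `ValueC` and `addconst` are hypothetical names mirroring the parametric C and the two-argument f.

```julia
# Assumes Julia 1.x syntax; `ValueC` and `addconst` are hypothetical
# stand-ins for the parametric `C` and the two-argument `f`.
struct ValueC{T<:Real}
    val::T
end

# The signature only mentions abstract types, yet every call below
# has fully concrete argument types.
addconst(c::ValueC, d::Real) = c.val + d

typeof(addconst(ValueC(1.0), 3.3))   # Float64
addconst(ValueC(1//2), 2)            # 5//2
addconst(ValueC(2), 0.5)             # 2.5

# Inferred return type for one concrete combination of argument types.
Base.return_types(addconst, (ValueC{Float64}, Float64))   # [Float64]
```

The annotation `d::Real` restricts which arguments are accepted; it does not make the compiled code generic, since the compiler specializes on the concrete types it sees at each call.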

Summary

Functions written for abstract types are compiled into specialized versions for concrete types whenever the concrete types of the arguments are known from the type information at the call site. They may call operations that are just as generic. At the very bottom of the call chain there must therefore be methods defined for the concrete types by overloading (e.g. addition in Base is explicitly defined for +(a::Int64,b::Int64), +(a::Float64,b::Float64), …). To make this type information available for composite types, these must be made parametric.

Final note: +(a::Int64,b::Float64) is not defined explicitly. But since +(x::Number, y::Number) = +(promote(x,y)...), the call to promote implicitly invokes convert(Float64, a), which converts the Int64 to a Float64 before the addition.
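The promotion step can be tried directly at the REPL. This short sketch uses only functions from Base; the literal values are arbitrary.

```julia
# promote brings both arguments to a common type before the addition.
promote(1, 2.5)                  # (1.0, 2.5)
typeof(promote(1, 2.5))          # Tuple{Float64, Float64}

# The common type is determined by promote_type ...
promote_type(Int64, Float64)     # Float64

# ... and the conversion itself is done by convert(TargetType, value).
convert(Float64, 1)              # 1.0

# Mixed-type addition gives the same result as adding the promoted values.
1 + 2.5 == +(promote(1, 2.5)...)   # true

# The same mechanism covers other combinations, e.g. Rational and Float64.
promote(1//2, 0.25)              # (0.5, 0.25)
```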