Julia Nomenclature

These are some terms that get thrown around a lot by Julia programmers. This is a brief write-up of a few of them.

Closures

A closure is created when a function (normally returned from another function) references some variable in the enclosing scope. We say that it closes over those variables.

Simple

This closure closes over count

Input:

function call_counter()
    count = 0
    return function()
        count+=1
        return count
    end
end

Output:

call_counter (generic function with 1 method)

Input:

counter_1 = call_counter()

Output:

#7 (generic function with 1 method)

Input:

counter_1()

Output:

1

Input:

counter_1()

Output:

2

Input:

counter_2 = call_counter()
counter_2()

Output:

1

Useful

I use this to control early stopping when training neural networks. This one closes over best_loss and remaining_patience.

Input:

function make_earlystopping(T=Float64; patience=0)
    best_loss::T = typemax(T)
    remaining_patience = patience
    function should_stop!(latest_loss)
        if latest_loss < best_loss
            remaining_patience = patience
            @info "Improved" remaining_patience

            best_loss = latest_loss::T
            return false
        else
            remaining_patience -= 1
            @info "Got worse" remaining_patience
            if remaining_patience < 0
                @info "!!Stopping!!"
                return true
            else
                return false
            end
        end
    end
    return should_stop!
end

Output:

make_earlystopping (generic function with 2 methods)

Input:

should_stop! = make_earlystopping(Int; patience=3)

Output:

should_stop! (generic function with 1 method)

Input:

should_stop!(99)
should_stop!(97)
should_stop!(100)
should_stop!(100)
should_stop!(101)
should_stop!(102)

Output:

┌ Info: Improved
│   remaining_patience = 3
└ @ Main In[6]:7
┌ Info: Improved
│   remaining_patience = 3
└ @ Main In[6]:7
┌ Info: Got worse
│   remaining_patience = 2
└ @ Main In[6]:13
┌ Info: Got worse
│   remaining_patience = 1
└ @ Main In[6]:13
┌ Info: Got worse
│   remaining_patience = 0
└ @ Main In[6]:13
┌ Info: Got worse
│   remaining_patience = -1
└ @ Main In[6]:13
┌ Info: !!Stopping!!
└ @ Main In[6]:15
true

You may be using closures without realising it.

For example, the following closes over model:


using Distributed  # provides pmap

function runall(dates)
    model = Model()
    pmap(dates) do the_day  # the do-block is an anonymous function that closes over model
        simulate(model, the_day)
    end
end

Parallelism

There are three types:

Multiprocessing / Distributed: multiple Julia processes, each with its own memory, communicating by message passing (the Distributed standard library).

Multithreading / Shared Memory: multiple threads within a single process, all sharing the same memory (Threads.@threads, Threads.@spawn).

Asynchronous / Coroutines: cooperative tasks interleaved on a single thread, mainly useful for overlapping I/O (@async, Channels).
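As a small illustration of the asynchronous style, here is a minimal sketch (the timings are arbitrary): tasks started with @async run cooperatively on one thread, yielding to each other whenever they block, e.g. in sleep.

```julia
# Minimal sketch of asynchronous tasks: each task yields whenever it blocks,
# so the three tasks are interleaved on a single thread.
results = Int[]
@sync for i in 1:3        # @sync waits for all enclosed @async tasks
    @async begin
        sleep(0.01 * i)   # blocking call: control passes to the other tasks
        push!(results, i)
    end
end
results  # all three tasks have finished once @sync returns
```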


Dynamic Dispatch vs Static Dispatch

Static dispatch is when the compiler can work out which method will be called from the inferred argument types alone; dynamic dispatch is when the method has to be looked up at runtime, which is slower.

Input:

foo(x::Int) = x
foo(x::Float64) = 2*x
foo(x::Char) = 3*x

function dynamic_dispatch()
    total = 0
    for ii in 1:1000
        x = rand() > 2.0 ?  1 : 1.0  # inferred as Union{Int64, Float64}, so foo(x) is dispatched at runtime
        total += foo(x)
    end
    total
end

Output:

dynamic_dispatch (generic function with 1 method)

Input:

using BenchmarkTools

@btime dynamic_dispatch()

Output:

  2.377 μs (0 allocations: 0 bytes)
2000.0

Input:

function static_dispatch()
    total = 0
    for ii in 1:1000
        x = rand() > 10 ?  1 : 1.0
        total += foo(1.0)  # argument type is known at compile time, so the method is resolved statically
    end
    total
end


@btime static_dispatch()

Output:

  1.716 μs (0 allocations: 0 bytes)
2000.0

Type Stability

Closely related to dynamic vs static dispatch: a function is type stable if the compiler can infer its return type from the types of its arguments alone.

Input:

function type_unstable(x)
    if x < 50.0
        return 1
    else
        return 2.0
    end
end

function demo_type_unstable()
    sum(type_unstable, 1.0:100.0)
end

Output:

demo_type_unstable (generic function with 1 method)

Input:

@btime demo_type_unstable();

Output:

  334.707 ns (0 allocations: 0 bytes)

Input:

function type_stable(x)
    if x < 50.0
        return 1.0
    else
        return 2.0
    end
end

function demo_type_stable()
    sum(type_stable, 1.0:100.0)
end

Output:

demo_type_stable (generic function with 1 method)

Input:

@btime demo_type_stable();

Output:

  235.601 ns (0 allocations: 0 bytes)

Input:

@code_warntype type_unstable(1)

Output:

Body::Union{Float64, Int64}
1 ─ %1 = (Base.sitofp)(Float64, x)::Float64
│   %2 = (Base.lt_float)(%1, 50.0)::Bool
│   %3 = (Base.eq_float)(%1, 50.0)::Bool
│   %4 = (Base.eq_float)(%1, 9.223372036854776e18)::Bool
│   %5 = (Base.fptosi)(Int64, %1)::Int64
│   %6 = (Base.slt_int)(x, %5)::Bool
│   %7 = (Base.or_int)(%4, %6)::Bool
│   %8 = (Base.and_int)(%3, %7)::Bool
│   %9 = (Base.or_int)(%2, %8)::Bool
└──      goto #3 if not %9
2 ─      return 1
3 ─      return 2.0

Input:

@code_warntype type_stable(1)

Output:

Body::Float64
1 ─ %1 = (Base.sitofp)(Float64, x)::Float64
│   %2 = (Base.lt_float)(%1, 50.0)::Bool
│   %3 = (Base.eq_float)(%1, 50.0)::Bool
│   %4 = (Base.eq_float)(%1, 9.223372036854776e18)::Bool
│   %5 = (Base.fptosi)(Int64, %1)::Int64
│   %6 = (Base.slt_int)(x, %5)::Bool
│   %7 = (Base.or_int)(%4, %6)::Bool
│   %8 = (Base.and_int)(%3, %7)::Bool
│   %9 = (Base.or_int)(%2, %8)::Bool
└──      goto #3 if not %9
2 ─      return 1.0
3 ─      return 2.0

Type Piracy

If your package defined neither the function nor at least one of the types it is being dispatched on, then adding that method is type piracy, and it is a bad thing.
By committing type piracy you can break code in other modules, even if they don't import your definitions.

Input:

methods(mapreduce)

Output:

# 4 methods for generic function "mapreduce":
[1] mapreduce(f, op, a::Number) in Base at reduce.jl:324
[2] mapreduce(f, op, itr::Base.SkipMissing{#s623} where #s623<:AbstractArray) in Base at missing.jl:202
[3] mapreduce(f, op, A::AbstractArray; dims, kw...) in Base at reducedim.jl:304
[4] mapreduce(f, op, itr; kw...) in Base at reduce.jl:205

Let's define a new method to reduce the magnitude of the first element by the first argument, and of the second element by the second argument.

We are going to call it mapreduce because it is kind of mapping this reduction in magnitude, and because this is a slightly forced example.

Input:

# Type piracy: we own neither the function (Base.mapreduce) nor the type (Array)
function Base.mapreduce(y1, y2, xs::Array)
    ys = [y1, y2]
    return sign.(xs) .* (abs.(xs) .- abs.(ys))
end

Input:

mapreduce(5, 8, [10, -10])

Output:

2-element Array{Int64,1}:
  5
 -2

Let's sum some numbers:

Input:

sum([1,2,3])

Output:

DimensionMismatch("arrays could not be broadcast to a common size")

Stacktrace:
 [1] _bcs1 at ./broadcast.jl:438 [inlined]
 [2] _bcs at ./broadcast.jl:432 [inlined]
 [3] broadcast_shape at ./broadcast.jl:426 [inlined]
 [4] combine_axes at ./broadcast.jl:421 [inlined]
 [5] _axes at ./broadcast.jl:208 [inlined]
 [6] axes at ./broadcast.jl:206 [inlined]
 [7] combine_axes at ./broadcast.jl:422 [inlined]
 [8] combine_axes at ./broadcast.jl:421 [inlined]
 [9] instantiate at ./broadcast.jl:255 [inlined]
 [10] materialize(::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(*),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(sign),Tuple{Array{Int64,1}}},Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(-),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(abs),Tuple{Array{Int64,1}}},Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(abs),Tuple{Array{Function,1}}}}}}}) at ./broadcast.jl:753
 [11] mapreduce(::Function, ::Function, ::Array{Int64,1}) at ./In[19]:3
 [12] _sum at ./reducedim.jl:653 [inlined]
 [13] _sum at ./reducedim.jl:652 [inlined]
 [14] #sum#550 at ./reducedim.jl:648 [inlined]
 [15] sum(::Array{Int64,1}) at ./reducedim.jl:648
 [16] top-level scope at In[21]:1

Glue Packages

Sometimes, to make two packages work together, you have to make them aware of each other's types.

For example, to implement

convert(::Type{DataFrame}, axisarray::AxisArray)

where DataFrame is defined in DataFrames.jl and AxisArray is defined in AxisArrays.jl, the only way to avoid type piracy is to define the method in either DataFrames.jl or AxisArrays.jl. But that would require one package to depend on the other, which isn't great.

So instead we have a glue package, e.g. DataFrameAxisArrayBuddies.jl, that depends on both and adds this method. It is piracy, but it is fairly safe, since it adds behaviour to a combination of types that would otherwise just be a MethodError. Misdemeanor type piracy.
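The shape of a glue package can be sketched with toy modules standing in for the two real packages (all names below are invented for illustration):

```julia
# Two "packages", each owning one type:
module PkgA
struct TableA
    rows::Vector{Int}
end
end

module PkgB
struct SeriesB
    data::Vector{Int}
end
end

# The "glue package": depends on both, owns neither the function nor the types.
module GlueAB
using ..PkgA, ..PkgB
# Without this method, convert(PkgA.TableA, ::SeriesB) would be a MethodError,
# so this piracy is relatively safe ("misdemeanor" piracy).
Base.convert(::Type{PkgA.TableA}, s::PkgB.SeriesB) = PkgA.TableA(s.data)
end

convert(PkgA.TableA, PkgB.SeriesB([1, 2, 3]))  # now works
```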

Wrapper Types and Delegation Pattern

I would argue that this is a core part of polymorphism via composition.

In the following example, we construct SamplesVector, a vector-like type with fast access to the running total, so that it can quickly calculate the mean. It wraps the Vector type and delegates several methods to it.

Even though it overloads Statistics.mean, and push!, size and getindex from Base, we do not commit type piracy, as we always own one of the types involved: the SamplesVector.

Input:

using Statistics

mutable struct SamplesVector{T} <: AbstractVector{T}
    backing::Vector{T}
    total::T
end
SamplesVector(xs) = SamplesVector(xs, +(xs...))  # can't use `sum` as we broke that

Output:

SamplesVector

Input:

Statistics.mean(sv::SamplesVector) = sv.total/length(sv.backing)

Input:

function Base.push!(sv::SamplesVector, x)
    sv.total += x
    push!(sv.backing, x)
    return sv
end

delegate size and getindex

Input:

Base.size(sv::SamplesVector) = size(sv.backing)
Base.getindex(sv::SamplesVector, ii) = getindex(sv.backing, ii)

Demo

Input:

samples = SamplesVector([1,2,3,4,5,6])
mean(samples)

Output:

3.5

Input:

push!(samples,7)
mean(samples)

Output:

4.0

Input:

samples[1]

Output:

1

Input:

samples[3:end]

Output:

5-element Array{Int64,1}:
 3
 4
 5
 6
 7

Input:

push!(samples,7)
mean(samples)

Output:

4.375

Input:

samples

Output:

8-element SamplesVector{Int64}:
 1
 2
 3
 4
 5
 6
 7
 7

Tim Holy Traits

Traits are something that falls naturally out of having functions that can be evaluated on types at compile time, together with multiple dispatch. See the previous post for details.
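As a minimal sketch of the pattern (all names here are invented), a trait function maps a type to a trait value at compile time, and the entry-point method re-dispatches on that trait:

```julia
# Trait values are instances of singleton types:
abstract type Wetness end
struct Wet <: Wetness end
struct Dry <: Wetness end

# The trait function: a compile-time lookup from type to trait.
wetness(::Type{<:AbstractFloat}) = Wet()
wetness(::Type{<:Integer}) = Dry()

# The entry point re-dispatches on the trait of the argument's type:
describe(x::T) where {T} = describe(wetness(T), x)
describe(::Wet, x) = "wet: $x"
describe(::Dry, x) = "dry: $x"

describe(1.5)  # → "wet: 1.5"
describe(3)    # → "dry: 3"
```

Because wetness(T) is resolved from the type alone, the compiler can pick the right describe method statically; no runtime lookup is needed.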