Introduction to Julia

System information (for reproducibility)

In [1]:
versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 4

What's Julia?

Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments.

  • History:

    • Project started in 2009. First public release in 2012
    • Creators: Jeff Bezanson, Alan Edelman, Stefan Karpinski, Viral Shah
    • First major release v1.0 was released on Aug 8, 2018
    • Current stable release is v1.4.0
  • Aims to solve the notorious two-language problem: prototype code goes into high-level languages like R/Python, production code goes into low-level languages like C/C++.

    In short, Julia:

    Walks like Python. Runs like C.

See https://julialang.org/benchmarks/ for benchmark details.

  • Write high-level, abstract code that closely resembles mathematical formulas

    • yet produces fast, low-level machine code that has traditionally only been generated by static languages.
  • Julia is more than just "Fast R" or "Fast Matlab"

    • Performance comes from features that work well together.
    • You can't just take the magic dust that makes Julia fast and sprinkle it on [language of choice]

Learning resources

  1. The (free) online course Introduction to Julia, by Jane Herriman.

  2. Cheat sheet: The Fast Track to Julia.

  3. Browse the Julia documentation.

  4. For Matlab users, read Noteworthy Differences From Matlab.

    For R users, read Noteworthy Differences From R.

    For Python users, read Noteworthy Differences From Python.

  5. The Learning page on Julia's website has pointers to many other learning resources.

Julia REPL (Read-Eval-Print Loop)

The Julia REPL, or Julia shell, has at least five modes.

  1. Default mode is the Julian prompt julia>. Type backspace in other modes to enter default mode.

  2. Help mode help?>. Type ? to enter help mode. ?search_term does a fuzzy search for search_term.

  3. Shell mode shell>. Type ; to enter shell mode.

  4. Package mode (@v1.4) pkg>. Type ] to enter package mode for managing Julia packages (install, uninstall, update, ...).

  5. Search mode (reverse-i-search). Press ctrl+R to enter search mode.

  6. With the RCall.jl package installed, we can enter R mode by typing $ (shift+4) at the Julia REPL.
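For example, a quick sketch of mode switching (the trigger character is typed at an empty julia> prompt and is consumed as the prompt changes; prompts shown are from Julia 1.4):

help?> println        # typed ? then println: look up documentation
shell> pwd            # typed ; then pwd: run a shell command
(@v1.4) pkg> status   # typed ] then status: list installed packages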

Some survival commands in Julia REPL:

  1. exit() or Ctrl+D: exit Julia.

  2. Ctrl+C: interrupt execution.

  3. Ctrl+L: clear screen.

  4. Append ; (semicolon) to suppress displaying the output of a command, same as in Matlab.

  5. include("filename.jl") to source a Julia code file.

Seek help

Which IDE?

  • Julia homepage lists many choices: Juno, VS Code, Vim, ...

  • Unfortunately, there is no mature RStudio- or Matlab-like IDE for Julia yet.

  • For dynamic documents, e.g., homework, I recommend Jupyter Notebook or JupyterLab.

  • For extensive Julia coding, I have been happily using the editor VS Code with the Julia and VS Code Jupyter Notebook Previewer extensions installed.

Julia package system

  • Each Julia package is a Git repository. Each Julia package name ends with .jl. E.g., Distributions.jl package lives at https://github.com/JuliaStats/Distributions.jl.
    Google search with PackageName.jl usually leads to the package on github.com.

  • The package ecosystem is rapidly maturing; a complete list of registered packages (which are required to have a certain level of testing and documentation) is at http://pkg.julialang.org/.

  • For example, the package called Distributions.jl is added with

    # in Pkg mode
    (@v1.4) pkg> add Distributions
    

    and "removed" (although not completely deleted) with

    # in Pkg mode
    (@v1.4) pkg> rm Distributions
    
  • The package manager provides a dependency solver that determines which packages are actually required to be installed.

  • Non-registered packages are added by cloning the relevant Git repository. E.g.,

    # in Pkg mode
    (@v1.4) pkg> add https://github.com/OpenMendel/SnpArrays.jl
    
  • A package only needs to be added once, at which point it is downloaded into the .julia/packages directory in your home directory.

In [2]:
readdir(Sys.islinux() ? ENV["JULIA_PATH"] * "/pkg/packages" : ENV["HOME"] * "/.julia/packages")
Out[2]:
418-element Array{String,1}:
 ".DS_Store"
 "AMD"
 "AbstractFFTs"
 "AbstractMCMC"
 "AbstractTrees"
 "AccurateArithmetic"
 "Adapt"
 "AlgebraicMultigrid"
 "AlgorithmsFromTheBook"
 "ArgCheck"
 "ArnoldiMethod"
 "Arpack"
 "Arpack_jll"
 ⋮
 "ZMQ"
 "ZeroMQ_jll"
 "ZipFile"
 "Zlib_jll"
 "Zstd_jll"
 "Zygote"
 "ZygoteRules"
 "libass_jll"
 "libfdk_aac_jll"
 "libvorbis_jll"
 "x264_jll"
 "x265_jll"
  • The directory of a specific package can be queried by pathof():
In [3]:
using Distributions

pathof(Distributions)
Out[3]:
"/Users/huazhou/.julia/packages/Distributions/dTXqn/src/Distributions.jl"
  • If you start having problems with packages that seem to be unsolvable, you may try just deleting your .julia directory and reinstalling all your packages.

  • Periodically, one should run update in Pkg mode, which checks for, downloads and installs updated versions of all the packages you currently have installed.

  • status lists the status of all installed packages.

  • Using functions in a package.

    using Distributions
    

    This pulls all of the exported functions in the module into your local namespace, as you can check using the varinfo() command. An alternative is

    import Distributions
    

    Now, the functions from the Distributions package are available only using

    Distributions.<FUNNAME>
    

    All functions, not only exported functions, are always available like this.
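    For example, a minimal sketch using the normal distribution (Normal is a type exported by Distributions.jl):

    import Distributions
    d = Distributions.Normal(0, 1)  # qualified name needed after import

    using Distributions
    d = Normal(0, 1)                # exported names available directly after using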

Calling R from Julia

  • The RCall.jl package allows us to embed R code inside of Julia.

  • There are also PyCall.jl, MATLAB.jl, JavaCall.jl, CxxWrap.jl packages for interfacing with other languages.

In [4]:
using RCall

x = randn(1000)
R"""
hist($x, main="I'm plotting a Julia vector")
"""
Out[4]:
RObject{VecSxp}
$breaks
 [1] -4.0 -3.5 -3.0 -2.5 -2.0 -1.5 -1.0 -0.5  0.0  0.5  1.0  1.5  2.0  2.5  3.0
[16]  3.5

$counts
 [1]   1   2   5  21  55  85 134 195 179 163  90  46  15   8   1

$density
 [1] 0.002 0.004 0.010 0.042 0.110 0.170 0.268 0.390 0.358 0.326 0.180 0.092
[13] 0.030 0.016 0.002

$mids
 [1] -3.75 -3.25 -2.75 -2.25 -1.75 -1.25 -0.75 -0.25  0.25  0.75  1.25  1.75
[13]  2.25  2.75  3.25

$xname
[1] "`#JL`$x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"
In [5]:
R"""
library(ggplot2)
qplot($x)
"""
Out[5]:
RObject{VecSxp}
┌ Warning: RCall.jl: `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
└ @ RCall /Users/huazhou/.julia/packages/RCall/g7dhB/src/io.jl:113
In [6]:
x = R"""
rnorm(10)
"""
Out[6]:
RObject{RealSxp}
 [1]  2.113069911 -1.682524682 -0.001947635 -0.188793705 -0.015248555
 [6] -1.076648574  1.077374336  1.274972800 -0.380181292  0.039603064
In [7]:
# collect R variable into Julia workspace
y = collect(x)
Out[7]:
10-element Array{Float64,1}:
  2.113069910576006
 -1.6825246817294395
 -0.0019476347548470132
 -0.18879370539048243
 -0.015248555015719784
 -1.0766485735456024
  1.07737433553574
  1.2749727997552491
 -0.3801812920432254
  0.039603063609110235
  • Access Julia variables in R REPL mode:

    julia> x = rand(5) # Julia variable
    R> y <- $x
    
  • Pass Julia expression in R REPL mode:

    R> y <- $(rand(5))
    
  • Put Julia variable into R environment:

    julia> @rput x
    R> x
    
  • Get R variable into Julia environment:

    R> r <- 2
    julia> @rget r
    
  • If you want to call Julia within R, check out the XRJulia package by John Chambers.

Some basic Julia code

In [8]:
# an integer, same as int in R
y = 1
typeof(y)
Out[8]:
Int64
In [9]:
# a Float64 number, same as double in R
y = 1.0
typeof(y)
Out[9]:
Float64
In [10]:
# Greek letters:  `\pi<tab>`
π
Out[10]:
π = 3.1415926535897...
In [11]:
typeof(π)
Out[11]:
Irrational{:π}
In [12]:
# Greek letters:  `\theta<tab>`
θ = y + π
Out[12]:
4.141592653589793
In [13]:
# emoji! `\:kissing_cat:<tab>`
😽 = 5.0
Out[13]:
5.0
In [14]:
# `\alpha<tab>\hat<tab>`
α̂ = π
Out[14]:
π = 3.1415926535897...
In [15]:
# vector of Float64 0s
x = zeros(5)
Out[15]:
5-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
In [16]:
# vector of Int64 0s
x = zeros(Int, 5)
Out[16]:
5-element Array{Int64,1}:
 0
 0
 0
 0
 0
In [17]:
# matrix of Float64 0s
x = zeros(5, 3)
Out[17]:
5×3 Array{Float64,2}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
In [18]:
# matrix of Float64 1s
x = ones(5, 3)
Out[18]:
5×3 Array{Float64,2}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
In [19]:
# define array without initialization
x = Matrix{Float64}(undef, 5, 3)
Out[19]:
5×3 Array{Float64,2}:
 5.0e-324      5.0e-324  0.0
 6.0e-323      0.0       0.0
 1.0e-323      0.0       0.0
 2.26291e-314  0.0       0.0
 0.0           0.0       0.0
In [20]:
# fill a matrix by 0s
fill!(x, 0)
Out[20]:
5×3 Array{Float64,2}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
In [21]:
# initialize an array to be constant 2.5
fill(2.5, (5, 3))
Out[21]:
5×3 Array{Float64,2}:
 2.5  2.5  2.5
 2.5  2.5  2.5
 2.5  2.5  2.5
 2.5  2.5  2.5
 2.5  2.5  2.5
In [22]:
# rational number
a = 3//5
Out[22]:
3//5
In [23]:
typeof(a)
Out[23]:
Rational{Int64}
In [24]:
b = 3//7
Out[24]:
3//7
In [25]:
a + b
Out[25]:
36//35
In [26]:
# uniform [0, 1) random numbers
x = rand(5, 3)
Out[26]:
5×3 Array{Float64,2}:
 0.507598  0.186048   0.628075
 0.989433  0.870866   0.22553
 0.776182  0.189382   0.174717
 0.883318  0.0313765  0.0643843
 0.599743  0.234695   0.42736
In [27]:
# uniform random numbers (in half precision)
x = rand(Float16, 5, 3)
Out[27]:
5×3 Array{Float16,2}:
 0.06445  0.698     0.6523
 0.251    0.3008    0.7363
 0.621    0.3223    0.99
 0.5      0.000977  0.7666
 0.1201   0.542     0.11914
In [28]:
# random numbers from {1,...,5}
x = rand(1:5, 5, 3)
Out[28]:
5×3 Array{Int64,2}:
 3  3  1
 5  3  2
 3  4  2
 1  4  5
 1  5  1
In [29]:
# standard normal random numbers
x = randn(5, 3)
Out[29]:
5×3 Array{Float64,2}:
  0.868721    1.07412    1.14506
  0.352555   -1.0598     0.132082
  0.51563     0.517356  -0.936765
 -0.0663545   0.162099   1.48518
 -0.0847069   1.88382    0.888104
In [30]:
# range
1:10
Out[30]:
1:10
In [31]:
typeof(1:10)
Out[31]:
UnitRange{Int64}
In [32]:
1:2:10
Out[32]:
1:2:9
In [33]:
typeof(1:2:10)
Out[33]:
StepRange{Int64,Int64}
In [34]:
# integers 1-10
x = collect(1:10)
Out[34]:
10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
In [35]:
# or equivalently
[1:10...]
Out[35]:
10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
In [36]:
# Float64 numbers 1-10
x = collect(1.0:10)
Out[36]:
10-element Array{Float64,1}:
  1.0
  2.0
  3.0
  4.0
  5.0
  6.0
  7.0
  8.0
  9.0
 10.0
In [37]:
# convert to a specific type
convert(Vector{Float64}, 1:10)
Out[37]:
10-element Array{Float64,1}:
  1.0
  2.0
  3.0
  4.0
  5.0
  6.0
  7.0
  8.0
  9.0
 10.0

Matrices and vectors

Dimensions

In [38]:
x = randn(5, 3)
Out[38]:
5×3 Array{Float64,2}:
 -0.0679719  -0.499419     0.464304
 -0.51311     0.391731    -0.276571
  1.51131    -1.37069      0.378619
  1.25128     1.97676     -0.699069
  0.0620681   0.00853839   1.42408
In [39]:
size(x)
Out[39]:
(5, 3)
In [40]:
size(x, 1) # nrow() in R
Out[40]:
5
In [41]:
size(x, 2) # ncol() in R
Out[41]:
3
In [42]:
# total number of elements
length(x)
Out[42]:
15

Indexing

In [43]:
# 5 × 5 matrix of random Normal(0, 1)
x = randn(5, 5)
Out[43]:
5×5 Array{Float64,2}:
 -0.680602   0.612964    0.0529561   0.332952  -0.907096
 -2.17088   -0.713974    0.104878    1.60923   -1.02306
  0.901428  -0.526219    0.29907    -0.254599  -1.16023
  0.108891  -2.5805     -1.36034     0.462398  -0.584862
  1.42368    0.0481646   0.544287    0.178462  -0.278758
In [44]:
# first column
x[:, 1]
Out[44]:
5-element Array{Float64,1}:
 -0.6806017015112786
 -2.170878023180469
  0.9014281956035037
  0.10889141932478424
  1.423675222437071
In [45]:
# first row
x[1, :]
Out[45]:
5-element Array{Float64,1}:
 -0.6806017015112786
  0.6129643293403908
  0.05295607589387902
  0.33295206179339176
 -0.907095749058918
In [46]:
# sub-array
x[1:2, 2:3]
Out[46]:
2×2 Array{Float64,2}:
  0.612964  0.0529561
 -0.713974  0.104878
In [47]:
# getting a subset of a matrix creates a copy, but you can also create "views"
z = view(x, 1:2, 2:3)
Out[47]:
2×2 view(::Array{Float64,2}, 1:2, 2:3) with eltype Float64:
  0.612964  0.0529561
 -0.713974  0.104878
In [48]:
# same as
@views z = x[1:2, 2:3]
Out[48]:
2×2 view(::Array{Float64,2}, 1:2, 2:3) with eltype Float64:
  0.612964  0.0529561
 -0.713974  0.104878
In [49]:
# change in z (view) changes x as well
z[2, 2] = 0.0
x
Out[49]:
5×5 Array{Float64,2}:
 -0.680602   0.612964    0.0529561   0.332952  -0.907096
 -2.17088   -0.713974    0.0         1.60923   -1.02306
  0.901428  -0.526219    0.29907    -0.254599  -1.16023
  0.108891  -2.5805     -1.36034     0.462398  -0.584862
  1.42368    0.0481646   0.544287    0.178462  -0.278758
In [50]:
# y points to same data as x
y = x
Out[50]:
5×5 Array{Float64,2}:
 -0.680602   0.612964    0.0529561   0.332952  -0.907096
 -2.17088   -0.713974    0.0         1.60923   -1.02306
  0.901428  -0.526219    0.29907    -0.254599  -1.16023
  0.108891  -2.5805     -1.36034     0.462398  -0.584862
  1.42368    0.0481646   0.544287    0.178462  -0.278758
In [51]:
# x and y point to same data
pointer(x), pointer(y)
Out[51]:
(Ptr{Float64} @0x00000001165dd490, Ptr{Float64} @0x00000001165dd490)
In [52]:
# changing y also changes x
y[:, 1] .= 0
x
Out[52]:
5×5 Array{Float64,2}:
 0.0   0.612964    0.0529561   0.332952  -0.907096
 0.0  -0.713974    0.0         1.60923   -1.02306
 0.0  -0.526219    0.29907    -0.254599  -1.16023
 0.0  -2.5805     -1.36034     0.462398  -0.584862
 0.0   0.0481646   0.544287    0.178462  -0.278758
In [53]:
# create a new copy of data
z = copy(x)
Out[53]:
5×5 Array{Float64,2}:
 0.0   0.612964    0.0529561   0.332952  -0.907096
 0.0  -0.713974    0.0         1.60923   -1.02306
 0.0  -0.526219    0.29907    -0.254599  -1.16023
 0.0  -2.5805     -1.36034     0.462398  -0.584862
 0.0   0.0481646   0.544287    0.178462  -0.278758
In [54]:
pointer(x), pointer(z)
Out[54]:
(Ptr{Float64} @0x00000001165dd490, Ptr{Float64} @0x000000011617e210)

Concatenate matrices

In [55]:
# 3-by-1 vector
[1, 2, 3]
Out[55]:
3-element Array{Int64,1}:
 1
 2
 3
In [56]:
# 1-by-3 array
[1 2 3]
Out[56]:
1×3 Array{Int64,2}:
 1  2  3
In [57]:
# multiple assignment by tuple
x, y, z = randn(5, 3), randn(5, 2), randn(3, 5)
Out[57]:
([-0.07946595327478304 0.5366206902515758 0.1307347273879236; 1.7279696671313747 1.5241267746650093 1.8724583017639604; … ; -0.12094504677379282 0.14661049014819597 0.34087683471834973; 0.19686837885575847 -0.8183917613580928 -0.30594493713762694], [1.5025239642692927 -0.23889787694491535; -0.285007379207401 -0.2892545052687034; … ; -0.7192051824276394 -0.6163762786059073; 0.0014003052026974325 -1.494586453184664], [-1.01305622020185 -0.4749736708134258 … 0.369403295977617 1.465847037042864; -0.7013788710067068 -0.0674974235178728 … 1.718854144356147 -0.28177382391727085; 0.892424382198883 -0.2568741330038262 … -0.3219874137495171 1.6817461231108182])
In [58]:
[x y] # 5-by-5 matrix
Out[58]:
5×5 Array{Float64,2}:
 -0.079466   0.536621   0.130735   1.50252     -0.238898
  1.72797    1.52413    1.87246   -0.285007    -0.289255
  0.793023   1.44182    0.2899    -0.167062    -0.133781
 -0.120945   0.14661    0.340877  -0.719205    -0.616376
  0.196868  -0.818392  -0.305945   0.00140031  -1.49459
In [59]:
[x y; z] # 8-by-5 matrix
Out[59]:
8×5 Array{Float64,2}:
 -0.079466   0.536621    0.130735    1.50252     -0.238898
  1.72797    1.52413     1.87246    -0.285007    -0.289255
  0.793023   1.44182     0.2899     -0.167062    -0.133781
 -0.120945   0.14661     0.340877   -0.719205    -0.616376
  0.196868  -0.818392   -0.305945    0.00140031  -1.49459
 -1.01306   -0.474974    0.329336    0.369403     1.46585
 -0.701379  -0.0674974  -0.0956562   1.71885     -0.281774
  0.892424  -0.256874   -0.131193   -0.321987     1.68175

Dot operation

A dot operation in Julia is an elementwise operation, similar to Matlab.

In [60]:
x = randn(5, 3)
Out[60]:
5×3 Array{Float64,2}:
 -1.29425  -0.26645    1.58016
 -2.10411   1.11568   -0.520928
  1.38334  -1.25722   -0.933339
 -0.64551  -1.33595   -1.02398
 -1.35096   0.170744  -0.30874
In [61]:
y = ones(5, 3)
Out[61]:
5×3 Array{Float64,2}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
In [62]:
x .* y # same as x * y in R
Out[62]:
5×3 Array{Float64,2}:
 -1.29425  -0.26645    1.58016
 -2.10411   1.11568   -0.520928
  1.38334  -1.25722   -0.933339
 -0.64551  -1.33595   -1.02398
 -1.35096   0.170744  -0.30874
In [63]:
x .^ (-2)
Out[63]:
5×3 Array{Float64,2}:
 0.596983  14.0854     0.400493
 0.225872   0.803376   3.68506
 0.522571   0.632673   1.14795
 2.39991    0.560295   0.953705
 0.547914  34.3011    10.4909
In [64]:
sin.(x)
Out[64]:
5×3 Array{Float64,2}:
 -0.962005  -0.263309   0.999956
 -0.861126   0.898211  -0.497685
  0.982481  -0.951236  -0.803612
 -0.601606  -0.972551  -0.854186
 -0.975934   0.169916  -0.303858

Basic linear algebra

In [65]:
x = randn(5)
Out[65]:
5-element Array{Float64,1}:
  0.22752087078790206
  0.06040469380904837
  1.6253138729701064
 -0.5616584061757285
 -0.3163419010248666
In [66]:
using LinearAlgebra
# vector L2 norm
norm(x)
Out[66]:
1.7642539564697661
In [67]:
# same as
sqrt(sum(abs2, x))
Out[67]:
1.7642539564697661
In [68]:
y = randn(5) # another vector
# dot product
dot(x, y) # x' * y
Out[68]:
2.3343380467299286
In [69]:
# same as
x'y
Out[69]:
2.3343380467299286
In [70]:
x, y = randn(5, 3), randn(3, 2)
# matrix multiplication, same as %*% in R
x * y
Out[70]:
5×2 Array{Float64,2}:
 -0.557807   1.71908
 -0.1451     1.63133
 -0.376415  -1.28496
 -0.952232  -0.634984
 -0.181337   2.31377
In [71]:
x = randn(3, 3)
Out[71]:
3×3 Array{Float64,2}:
  0.604551  -0.663884  -0.427566
  0.171046   0.706153   1.02315
 -0.609924   0.250137   1.49812
In [72]:
# conjugate transpose
x'
Out[72]:
3×3 Adjoint{Float64,Array{Float64,2}}:
  0.604551  0.171046  -0.609924
 -0.663884  0.706153   0.250137
 -0.427566  1.02315    1.49812
In [73]:
b = rand(3)
x'b # same as x' * b
Out[73]:
3-element Array{Float64,1}:
  0.0977277195001287
 -0.20812044826774032
  0.7389752569455617
In [74]:
# trace
tr(x)
Out[74]:
2.8088199654743127
In [75]:
det(x)
Out[75]:
0.8667979988514273
In [76]:
rank(x)
Out[76]:
3

Sparse matrices

In [77]:
using SparseArrays

# 10-by-10 sparse matrix with sparsity 0.1
X = sprandn(10, 10, .1)
Out[77]:
10×10 SparseMatrixCSC{Float64,Int64} with 2 stored entries:
  [4, 4]  =  -0.664695
  [7, 9]  =  -0.50595
In [78]:
# dump() in Julia is like str() in R
dump(X)
SparseMatrixCSC{Float64,Int64}
  m: Int64 10
  n: Int64 10
  colptr: Array{Int64}((11,)) [1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3]
  rowval: Array{Int64}((2,)) [4, 7]
  nzval: Array{Float64}((2,)) [-0.6646953796712529, -0.5059500670834203]
In [79]:
# convert to dense matrix; be cautious when dealing with big data
Xfull = convert(Matrix{Float64}, X)
Out[79]:
10×10 Array{Float64,2}:
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0   0.0      0.0
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0   0.0      0.0
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0   0.0      0.0
 0.0  0.0  0.0  -0.664695  0.0  0.0  0.0  0.0   0.0      0.0
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0   0.0      0.0
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0   0.0      0.0
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0  -0.50595  0.0
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0   0.0      0.0
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0   0.0      0.0
 0.0  0.0  0.0   0.0       0.0  0.0  0.0  0.0   0.0      0.0
In [80]:
# convert a dense matrix to sparse matrix
sparse(Xfull)
Out[80]:
10×10 SparseMatrixCSC{Float64,Int64} with 2 stored entries:
  [4, 4]  =  -0.664695
  [7, 9]  =  -0.50595
In [81]:
# syntax for sparse linear algebra is same as dense linear algebra
β = ones(10)
X * β
Out[81]:
10-element Array{Float64,1}:
  0.0
  0.0
  0.0
 -0.6646953796712529
  0.0
  0.0
 -0.5059500670834203
  0.0
  0.0
  0.0
In [82]:
# many functions apply to sparse matrices as well
sum(X)
Out[82]:
-1.1706454467546732
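The number of stored entries can be queried with nnz(); comparing it to length() highlights the storage savings. A short sketch:

# stored (structurally nonzero) entries vs total number of entries
nnz(X), length(X)   # here (2, 100)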

Control flow and loops

  • if-elseif-else-end
if condition1
    # do something
elseif condition2
    # do something
else
    # do something
end
  • for loop
for i in 1:10
    println(i)
end
  • Nested for loop:
for i in 1:10
    for j in 1:5
        println(i * j)
    end
end

Same as

for i in 1:10, j in 1:5
    println(i * j)
end
  • Exit loop:
for i in 1:10
    # do something
    if condition1
        break # exit the loop entirely
    end
end
  • Exit iteration:
for i in 1:10
    # do something
    if condition1
        continue # skip to next iteration
    end
    # do something
end
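A small sketch combining these constructs:

for i in 1:10
    if isodd(i)
        continue    # skip odd numbers
    elseif i > 8
        break       # exit the loop entirely
    end
    println(i)      # prints 2, 4, 6, 8
end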

Functions

  • In Julia, all arguments to functions are passed by reference, in contrast to R and Matlab.

  • A function name ending with ! indicates that the function mutates at least one of its arguments, typically the first.

    sort!(x) # vs sort(x)
    
  • Function definition

function func(req1, req2; key1=dflt1, key2=dflt2)
    # do stuff
    return out1, out2, out3
end

  • Required arguments are separated with a comma and use positional notation.
  • Optional arguments need a default value in the signature.
  • The semicolon is not required in the function call.
  • The return statement is optional.
  • Multiple outputs can be returned as a tuple, e.g., return out1, out2, out3.
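A minimal sketch following this pattern (the standardize function and its arguments are made up for illustration):

# hypothetical example: center and scale a vector
function standardize(x; center = true, scale = true)
    m = center ? sum(x) / length(x) : zero(eltype(x))
    s = scale ? sqrt(sum(abs2, x .- m) / (length(x) - 1)) : one(eltype(x))
    return (x .- m) ./ s, m, s
end

z, m, s = standardize(randn(10))                 # keyword arguments take their defaults
z, m, s = standardize(randn(10), scale = false)  # note: no semicolon needed in the call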

  • Anonymous functions, e.g., x -> x^2, are commonly used in collection functions or list comprehensions.

    map(x -> x^2, y) # square each element in y
    
  • Functions can be nested:

function outerfunction()
    # do some outer stuff
    function innerfunction()
        # do inner stuff
        # can access prior outer definitions
    end
    # do more outer stuff
end
  • Functions can be vectorized using the Dot syntax:
In [83]:
# defined for scalar
function myfunc(x)
    return sin(x^2)
end

x = randn(5, 3)
myfunc.(x)
Out[83]:
5×3 Array{Float64,2}:
 0.166046    0.301768   0.0391553
 0.889782    0.0197388  0.95851
 0.333145    0.203833   0.377999
 0.465292    0.102677   0.0260136
 0.0223636  -0.451376   0.899379
  • Collection functions (think of these as the apply family of functions in R).

    Apply a function to each element of a collection:

map(f, coll) # or
map(coll) do elem
    # do stuff with elem
    # the last expression is the return value
end
In [84]:
map(x -> sin(x^2), x)
Out[84]:
5×3 Array{Float64,2}:
 0.166046    0.301768   0.0391553
 0.889782    0.0197388  0.95851
 0.333145    0.203833   0.377999
 0.465292    0.102677   0.0260136
 0.0223636  -0.451376   0.899379
In [85]:
map(x) do elem
    elem = elem^2
    return sin(elem)
end
Out[85]:
5×3 Array{Float64,2}:
 0.166046    0.301768   0.0391553
 0.889782    0.0197388  0.95851
 0.333145    0.203833   0.377999
 0.465292    0.102677   0.0260136
 0.0223636  -0.451376   0.899379
In [86]:
# Mapreduce
mapreduce(x -> sin(x^2), +, x)
Out[86]:
4.354325467105569
In [87]:
# same as
sum(x -> sin(x^2), x)
Out[87]:
4.354325467105569
  • List comprehension
In [88]:
[sin(2i + j) for i in 1:5, j in 1:3] # similar to Python
Out[88]:
5×3 Array{Float64,2}:
  0.14112   -0.756802  -0.958924
 -0.958924  -0.279415   0.656987
  0.656987   0.989358   0.412118
  0.412118  -0.544021  -0.99999
 -0.99999   -0.536573   0.420167

Type system

  • Every variable in Julia has a type.

  • When thinking about types, think about sets.

  • Everything is a subtype of the abstract type Any.

  • An abstract type defines a set of types

    • Consider types in Julia that are a Number:

  • We can explore type hierarchy with typeof(), supertype(), and subtypes().
In [89]:
typeof(1.0), typeof(1)
Out[89]:
(Float64, Int64)
In [90]:
supertype(Float64)
Out[90]:
AbstractFloat
In [91]:
subtypes(AbstractFloat)
Out[91]:
4-element Array{Any,1}:
 BigFloat
 Float16
 Float32
 Float64
In [92]:
# Is Float64 a subtype of AbstractFloat?
Float64 <: AbstractFloat
Out[92]:
true
In [93]:
# On 64bit machine, Int == Int64
Int == Int64
Out[93]:
true
In [94]:
# convert to Float64
convert(Float64, 1)
Out[94]:
1.0
In [95]:
# same as
Float64(1)
Out[95]:
1.0
In [96]:
# Float32 vector
x = randn(Float32, 5)
Out[96]:
5-element Array{Float32,1}:
 -0.69296366
  0.25062925
 -1.6592916
  0.022112986
  0.71937364
In [97]:
# convert to Float64
convert(Vector{Float64}, x)
Out[97]:
5-element Array{Float64,1}:
 -0.6929636597633362
  0.2506292462348938
 -1.6592916250228882
  0.02211298607289791
  0.7193736433982849
In [98]:
# same as
Float64.(x)
Out[98]:
5-element Array{Float64,1}:
 -0.6929636597633362
  0.2506292462348938
 -1.6592916250228882
  0.02211298607289791
  0.7193736433982849
In [99]:
# convert Float64 to Int64
convert(Int, 1.0)
Out[99]:
1
In [100]:
convert(Int, 1.5) # throws an InexactError; use round(Int, 1.5) instead
InexactError: Int64(1.5)

Stacktrace:
 [1] Int64 at ./float.jl:710 [inlined]
 [2] convert(::Type{Int64}, ::Float64) at ./number.jl:7
 [3] top-level scope at In[100]:1
In [101]:
round(Int, 1.5)
Out[101]:
2
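The floor, ceil, and trunc functions provide the other rounding conversions:

floor(Int, 1.5)   # 1
ceil(Int, 1.5)    # 2
trunc(Int, -1.5)  # -1 (rounds toward zero)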

Multiple dispatch

  • Multiple dispatch lies in the core of Julia design. It allows built-in and user-defined functions to be overloaded for different combinations of argument types.

  • Let's consider a simple "doubling" function:

In [102]:
g(x) = x + x
Out[102]:
g (generic function with 1 method)
In [103]:
g(1.5)
Out[103]:
3.0

This definition is too broad, since some things, e.g., strings, can't be added.

In [104]:
g("hello world")
MethodError: no method matching +(::String, ::String)
Closest candidates are:
  +(::Any, ::Any, !Matched::Any, !Matched::Any...) at operators.jl:529

Stacktrace:
 [1] g(::String) at ./In[102]:1
 [2] top-level scope at In[104]:1
  • This definition is correct but too restrictive, since any Number can be added.
In [105]:
g(x::Float64) = x + x
Out[105]:
g (generic function with 2 methods)
  • This definition will automatically work on the entire type tree above!
In [106]:
g(x::Number) = x + x
Out[106]:
g (generic function with 3 methods)

This is a lot nicer than

function g(x)
    if isa(x, Number)
        return x + x
    else
        throw(ArgumentError("x should be a number"))
    end
end
  • The methods(func) function displays all methods defined for func.
In [107]:
methods(g)
Out[107]:
# 3 methods for generic function g:
  • g(x::Float64) in Main at In[105]:1
  • g(x::Number) in Main at In[106]:1
  • g(x) in Main at In[102]:1
  • When calling a function with multiple definitions, Julia dispatches to the most specific method that matches the argument types.

  • The @which func(x) macro tells which method is being used for the argument signature of x.

In [108]:
# an Int64 input
@which g(1)
Out[108]:
g(x::Number) in Main at In[106]:1
In [109]:
# a Vector{Float64} input
@which g(randn(5))
Out[109]:
g(x) in Main at In[102]:1

Just-in-time compilation (JIT)

The following figures are taken from Arch D. Robison's slides Introduction to Writing High Performance Julia.

(Figure: the Julia toolchain)
  • Julia's efficiency results from its capability to infer the types of all variables within a function and then call LLVM to generate optimized machine code at run-time.

Consider the g (doubling) function defined earlier. This function will work on any type which has a method for +.

In [110]:
g(2), g(2.0)
Out[110]:
(4, 4.0)

Step 1: Parse Julia code into abstract syntax tree (AST).

In [111]:
@code_lowered g(2)
Out[111]:
CodeInfo(
1 ─ %1 = x + x
└──      return %1
)

Step 2: Type inference according to input type.

In [112]:
@code_warntype g(2)
Variables
  #self#::Core.Compiler.Const(g, false)
  x::Int64

Body::Int64
1 ─ %1 = (x + x)::Int64
└──      return %1
In [113]:
@code_warntype g(2.0)
Variables
  #self#::Core.Compiler.Const(g, false)
  x::Float64

Body::Float64
1 ─ %1 = (x + x)::Float64
└──      return %1

Step 3: Compile into LLVM bitcode (roughly the equivalent of the R bytecode generated by the compiler package).

In [114]:
@code_llvm g(2)
;  @ In[106]:1 within `g'
define i64 @julia_g_19487(i64) {
top:
; ┌ @ int.jl:53 within `+'
   %1 = shl i64 %0, 1
; └
  ret i64 %1
}
In [115]:
@code_llvm g(2.0)
;  @ In[105]:1 within `g'
define double @julia_g_19488(double) {
top:
; ┌ @ float.jl:401 within `+'
   %1 = fadd double %0, %0
; └
  ret double %1
}

We didn't provide a type annotation. But different LLVM code gets generated depending on the argument type!

In R or Python, g(2) and g(2.0) would run the same code.

In Julia, g(2) and g(2.0) dispatch to code optimized for Int64 and Float64, respectively.

For an integer input x, the LLVM compiler is smart enough to know that x + x is simply a left shift of x by 1 bit, which is cheaper than addition.

Step 4: The lowest level is the assembly code, which is machine dependent.
In [116]:
@code_native g(2)
	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ In[106]:1 within `g'
; │┌ @ In[106]:1 within `+'
	leaq	(%rdi,%rdi), %rax
; │└
	retq
	nopw	%cs:(%rax,%rax)
	nop
; └
In [117]:
@code_native g(2.0)
	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ In[105]:1 within `g'
; │┌ @ In[105]:1 within `+'
	vaddsd	%xmm0, %xmm0, %xmm0
; │└
	retq
	nopw	%cs:(%rax,%rax)
	nop
; └

Profiling Julia code

Julia has several built-in tools for profiling. The @time macro reports the run time and heap allocations.

In [118]:
# a simple function that sums the elements of an array
function tally(x::Array)
    s = zero(eltype(x))
    for v in x
        s += v
    end
    s
end

using Random
Random.seed!(123)
a = rand(20_000_000)
@time tally(a) # first run includes compilation time
  0.034518 seconds (5.43 k allocations: 251.211 KiB)
Out[118]:
1.0000233387279043e7
In [119]:
@time tally(a)
  0.025372 seconds (1 allocation: 16 bytes)
Out[119]:
1.0000233387279043e7

For more robust benchmarking, the BenchmarkTools.jl package is highly recommended.

In [120]:
using BenchmarkTools

@benchmark tally($a)
Out[120]:
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     22.857 ms (0.00% GC)
  median time:      24.148 ms (0.00% GC)
  mean time:        24.326 ms (0.00% GC)
  maximum time:     28.404 ms (0.00% GC)
  --------------
  samples:          206
  evals/sample:     1

The Profile module gives line-by-line profiling results.

In [121]:
using Profile

Profile.clear()
@profile tally(a)
Profile.print(format=:flat)
 Count  Overhead File                    Line Function
 =====  ======== ====                    ==== ========
  1450         0 In[118]                    5 tally(::Array{Float64,1})
     1         1 @Base/abstractdict.jl    638 _oidd_nextind
     1         0 @Base/abstractdict.jl    641 iterate
     2         0 @Base/array.jl           394 getindex
  1450         0 @Base/array.jl           763 iterate
     1         0 @Base/array.jl           914 push!
     1         1 @Base/array.jl           825 setindex!
     2         2 @Base/boot.jl            405 Array
     2         0 @Base/boot.jl            424 Array
  1594       138 @Base/boot.jl            331 eval
     1         0 .../inferenceresult.jl    12 InferenceResult
     1         1 .../inferenceresult.jl    64 matching_cache_argtypes(::Core....
     1         0 ...r/inferencestate.jl   106 Core.Compiler.InferenceState(::...
     1         0 ...r/inferencestate.jl   116 Core.Compiler.InferenceState(::...
     3         0 ...ompiler/optimize.jl   169 optimize(::Core.Compiler.Optimi...
     2         0 ...ler/ssair/driver.jl   107 just_construct_ssa(::Core.CodeI...
     2         0 ...ler/ssair/driver.jl   112 run_passes(::Core.CodeInfo, ::I...
     1         0 ...ler/ssair/driver.jl   121 run_passes(::Core.CodeInfo, ::I...
     1         0 ...ler/ssair/passes.jl   725 getfield_elim_pass!(::Core.Comp...
     1         0 ...r/ssair/slot2ssa.jl   595 construct_ssa!(::Core.CodeInfo,...
     1         0 ...r/ssair/slot2ssa.jl   868 construct_ssa!(::Core.CodeInfo,...
     1         0 ...r/ssair/slot2ssa.jl   378 domsort_ssa!(::Core.Compiler.IR...
     1         1 ...mpiler/typeinfer.jl   126 cache_result(::Core.Compiler.In...
     3         0 ...mpiler/typeinfer.jl    33 typeinf(::Core.Compiler.Inferen...
     1         0 ...mpiler/typeinfer.jl    67 typeinf(::Core.Compiler.Inferen...
     2         0 ...mpiler/typeinfer.jl   572 typeinf_ext(::Core.MethodInstan...
     4         0 ...mpiler/typeinfer.jl   574 typeinf_ext(::Core.MethodInstan...
     6         0 ...mpiler/typeinfer.jl   605 typeinf_ext(::Core.MethodInstan...
  1594         0 @Base/essentials.jl      712 #invokelatest#1
  1594         0 @Base/essentials.jl      711 invokelatest
  1450      1450 @Base/int.jl             409 <
  1450         0 @Base/int.jl             416 <
  1594         0 @Base/task.jl            358 (::IJulia.var"#15#18")()
  1594         0 ...ia/src/eventloop.jl     8 eventloop(::ZMQ.Socket)
  1594         0 .../execute_request.jl    67 execute_request(::ZMQ.Socket, :...
  1594         0 .../SoftGlobalScope.jl   218 softscope_include_string(::Modu...
Total snapshots: 6409

One can use the ProfileView.jl package for better visualization of profile data:

using ProfileView

ProfileView.view()
In [122]:
# check type stability
@code_warntype tally(a)
Variables
  #self#::Core.Compiler.Const(tally, false)
  x::Array{Float64,1}
  s::Float64
  @_4::Union{Nothing, Tuple{Float64,Int64}}
  v::Float64

Body::Float64
1 ─ %1  = Main.eltype(x)::Core.Compiler.Const(Float64, false)
       (s = Main.zero(%1))
 %3  = x::Array{Float64,1}
       (@_4 = Base.iterate(%3))
 %5  = (@_4 === nothing)::Bool
 %6  = Base.not_int(%5)::Bool
└──       goto #4 if not %6
2 ┄ %8  = @_4::Tuple{Float64,Int64}::Tuple{Float64,Int64}
       (v = Core.getfield(%8, 1))
 %10 = Core.getfield(%8, 2)::Int64
       (s = s + v)
       (@_4 = Base.iterate(%3, %10))
 %13 = (@_4 === nothing)::Bool
 %14 = Base.not_int(%13)::Bool
└──       goto #4 if not %14
3 ─       goto #2
4 ┄       return s
In [123]:
# check LLVM bitcode
@code_llvm tally(a)
;  @ In[118]:3 within `tally'
define double @julia_tally_19666(%jl_value_t addrspace(10)* nonnull align 16 dereferenceable(40)) {
top:
;  @ In[118]:4 within `tally'
; ┌ @ array.jl:763 within `iterate' @ array.jl:763
; │┌ @ array.jl:221 within `length'
    %1 = addrspacecast %jl_value_t addrspace(10)* %0 to %jl_value_t addrspace(11)*
    %2 = bitcast %jl_value_t addrspace(11)* %1 to %jl_array_t addrspace(11)*
    %3 = getelementptr inbounds %jl_array_t, %jl_array_t addrspace(11)* %2, i64 0, i32 1
    %4 = load i64, i64 addrspace(11)* %3, align 8
; │└
   %5 = icmp slt i64 %4, 1
   br i1 %5, label %L43, label %L18

L18:                                              ; preds = %top
; │┌ @ array.jl:787 within `getindex'
    %6 = bitcast %jl_value_t addrspace(11)* %1 to double addrspace(13)* addrspace(11)*
    %7 = load double addrspace(13)*, double addrspace(13)* addrspace(11)* %6, align 8
; └└
  %value_phi413 = load double, double addrspace(13)* %7, align 8
;  @ In[118]:5 within `tally'
; ┌ @ float.jl:401 within `+'
   %8 = fadd double %value_phi413, 0.000000e+00
; └
; ┌ @ array.jl:763 within `iterate'
; │┌ @ int.jl:416 within `<' @ int.jl:409
    %9 = icmp eq i64 %4, 1
; │└
   br i1 %9, label %L43, label %L37

L37:                                              ; preds = %L18, %L37
   %10 = phi i64 [ %value_phi514, %L37 ], [ 1, %L18 ]
   %11 = phi double [ %14, %L37 ], [ %8, %L18 ]
   %value_phi514 = phi i64 [ %13, %L37 ], [ 2, %L18 ]
; │┌ @ array.jl:787 within `getindex'
    %12 = getelementptr inbounds double, double addrspace(13)* %7, i64 %10
; │└
; │┌ @ int.jl:53 within `+'
    %13 = add i64 %value_phi514, 1
; └└
  %value_phi4 = load double, double addrspace(13)* %12, align 8
;  @ In[118]:5 within `tally'
; ┌ @ float.jl:401 within `+'
   %14 = fadd double %11, %value_phi4
; └
; ┌ @ array.jl:763 within `iterate'
; │┌ @ int.jl:416 within `<' @ int.jl:409
    %15 = icmp ult i64 %value_phi514, %4
; │└
   br i1 %15, label %L37, label %L43

L43:                                              ; preds = %L37, %L18, %top
   %value_phi9 = phi double [ 0.000000e+00, %top ], [ %8, %L18 ], [ %14, %L37 ]
; └
;  @ In[118]:7 within `tally'
  ret double %value_phi9
}
In [124]:
@code_native tally(a)
	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ In[118]:4 within `tally'
; │┌ @ array.jl:763 within `iterate' @ array.jl:763
; ││┌ @ In[118]:3 within `length'
	movq	8(%rdi), %rax
; ││└
	testq	%rax, %rax
	jle	L47
; ││┌ @ array.jl:787 within `getindex'
	movq	(%rdi), %rcx
	vxorpd	%xmm0, %xmm0, %xmm0
; │└└
; │ @ In[118]:5 within `tally'
; │┌ @ float.jl:401 within `+'
	vaddsd	(%rcx), %xmm0, %xmm0
; │└
; │┌ @ array.jl:763 within `iterate'
; ││┌ @ int.jl:416 within `<' @ int.jl:409
	cmpq	$1, %rax
; ││└
	je	L46
; └└
; ┌ @ array.jl within `tally'
	movl	$1, %edx
	nop
; └
; ┌ @ In[118]:5 within `tally'
; │┌ @ float.jl:401 within `+'
L32:
	vaddsd	(%rcx,%rdx,8), %xmm0, %xmm0
; │└
; │┌ @ array.jl:763 within `iterate'
; ││┌ @ int.jl:416 within `<' @ int.jl:409
	addq	$1, %rdx
	cmpq	%rax, %rdx
; ││└
	jb	L32
; │└
; │ @ In[118]:7 within `tally'
L46:
	retq
L47:
	vxorps	%xmm0, %xmm0, %xmm0
; │ @ In[118]:7 within `tally'
	retq
	nopw	%cs:(%rax,%rax)
	nop
; └

Exercise: Annotate the loop in the tally function with @simd and look for the differences in the LLVM bitcode and machine code.
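A sketch of the annotated version (the name tally_simd is mine; @simd permits the compiler to reorder the floating-point summation, which enables vectorized instructions):

function tally_simd(x::Array)
    s = zero(eltype(x))
    @simd for v in x
        s += v
    end
    s
end

@code_llvm tally_simd(rand(1000))   # compare against @code_llvm tally(rand(1000))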

Memory profiling

Detailed memory profiling requires a detour. First let's write a script bar.jl, which contains the workload function tally and a wrapper for profiling.

In [125]:
;cat bar.jl
using Profile

function tally(x::Array)
    s = zero(eltype(x))
    for v in x
        s += v
    end
    s
end

# call workload from wrapper to avoid misattribution bug
function wrapper()
    y = rand(10000)
    # force compilation
    println(tally(y))
    # clear allocation counters
    Profile.clear_malloc_data()
    # run compiled workload
    println(tally(y))
end

wrapper()

Next, in a terminal, we run the script with the --track-allocation=user option.

In [126]:
#;julia --track-allocation=user bar.jl

The profiler outputs a file bar.jl.51116.mem.

In [127]:
;cat bar.jl.51116.mem
        - using Profile
        - 
        - function tally(x::Array)
        -     s = zero(eltype(x))
        -     for v in x
        -         s += v
        -     end
        -     s
        - end
        - 
        - # call workload from wrapper to avoid misattribution bug
        - function wrapper()
        0     y = rand(10000)
        -     # force compilation
        0     println(tally(y))
        -     # clear allocation counters
        0     Profile.clear_malloc_data()
        -     # run compiled workload
      528     println(tally(y))
        - end
        - 
        - wrapper()
        -