
Commit 97418ab

Complete the revision of README
1 parent 716d3c7 commit 97418ab

File tree

README.md
cabal.project.local.development
example/MnistCnnShaped2.hs

3 files changed: +82 -121 lines changed

README.md

Lines changed: 60 additions & 92 deletions
@@ -97,129 +97,102 @@ A shorthand that creates the symbolic derivative program, simplifies it and inte
 ```


-# WIP: The examples below are outdated and will be replaced soon using a new API
-
-
-## Computing Jacobians
-
--- TODO: we can have vector/matrix/tensor codomains, but not pair codomains
--- until #68 is done;
--- perhaps a vector codomain example, with a 1000x3 Jacobian, would make sense?
--- 2 years later: actually, we can now have TKProduct codomains.
-
-Now let's consider a function from `R^n` to `R^m`. We don't want the gradient, but instead the Jacobian.
-```hs
--- A function that goes from `R^3` to `R^2`.
-foo :: RealFloat a => (a,a,a) -> (a,a)
-foo (x,y,z) =
-  let w = x * sin y
-  in (atan2 z w, z * w)
-```
-TODO: show how the 2x3 Jacobian emerges from here
-
-
-
 ## Forall shapes and sizes

-An additional feature of this library is a type system for tensor shape arithmetic. The following code is a part of convolutional neural network definition, for which horde-ad computes the gradient of a shape determined by the shape of input data and initial parameters. The compiler is able to infer a lot of tensor shapes, deriving them both from dynamic dimension arguments (the first two lines of parameters to the function) and from static type-level hints. Look at this beauty.
+An additional feature of this library is a type system for tensor shape arithmetic. The following code is a part of convolutional neural network definition, for which horde-ad computes the gradient of a shape determined by the shape of input data and of initial parameters. The compiler is able to infer a lot of tensor shapes, deriving them both from dynamic dimension arguments (the first two lines of parameters to the function) and from static type-level hints. Look at this beauty.
 ```hs
 convMnistTwoS
   kh@SNat kw@SNat h@SNat w@SNat
-  c_in@SNat c_out@SNat _n_hidden@SNat batch_size@SNat
-      -- integer parameters denoting basic dimensions, with some notational noise
-  input              -- input images, shape (batch_size, c_in, h, w)
-  (ker1, bias1)      -- layer1 kernel, shape (c_out, c_in, kh+1, kw+1); and bias, shape (c_out)
-  (ker2, bias2)      -- layer2 kernel, shape (c_out, c_out, kh+1, kw+1); and bias, shape (c_out)
+  c_out@SNat _n_hidden@SNat batch_size@SNat
+      -- integer parameters denoting basic dimensions
+  input              -- input images, shape [batch_size, 1, h, w]
+  (ker1, bias1)      -- layer1 kernel, shape [c_out, 1, kh+1, kw+1]; and bias, shape [c_out]
+  (ker2, bias2)      -- layer2 kernel, shape [c_out, c_out, kh+1, kw+1]; and bias, shape [c_out]
   ( weightsDense     -- dense layer weights,
-                     --   shape (n_hidden, c_out * ((h+kh)/2 + kh)/2, ((w+kw)/2 + kw)/2)
-  , biasesDense )    -- dense layer biases, shape (n_hidden)
-  ( weightsReadout   -- readout layer weights, shape (10, n_hidden)
-  , biasesReadout )  -- readout layer biases (10)
-  =                  -- -> output classification, shape (10, batch_size)
-  let t1 = convMnistLayerS kh kw
-                           h w
-                           c_in c_out batch_size
-                           ker1 (constant input) bias1
-      t2 = convMnistLayerS kh kw
-                           (SNat @((h + kh) `Div` 2)) (SNat @((w + kw) `Div` 2))
+                     --   shape [n_hidden, c_out * (h/4) * (w/4)]
+  , biasesDense )    -- dense layer biases, shape [n_hidden]
+  ( weightsReadout   -- readout layer weights, shape [10, n_hidden]
+  , biasesReadout )  -- readout layer biases [10]
+  =                  -- -> output classification, shape [10, batch_size]
+  gcastWith (unsafeCoerceRefl :: Div (Div w 2) 2 :~: Div w 4) $
+  gcastWith (unsafeCoerceRefl :: Div (Div h 2) 2 :~: Div h 4) $
+  let t1 = convMnistLayerS kh kw h w
+                           (SNat @1) c_out batch_size
+                           ker1 (sfromPrimal input) bias1
+      t2 = convMnistLayerS kh kw (SNat @(h `Div` 2)) (SNat @(w `Div` 2))
                            c_out c_out batch_size
                            ker2 t1 bias2
-      m1 = mapOuterS reshapeS t2
-      m2 = transpose2S m1
-      denseLayer = weightsDense <>$ m2 + asColumnS biasesDense
-      denseRelu = relu denseLayer
-  in weightsReadout <>$ denseRelu + asColumnS biasesReadout
+      m1 = sreshape t2
+      denseLayer = weightsDense `smatmul2` str m1
+                   + str (sreplicate biasesDense)
+  in weightsReadout `smatmul2` reluS denseLayer
+     + str (sreplicate biasesReadout)
 ```
 But we don't just want the shapes in comments and in runtime expressions; we want them as a compiler-verified documentation in the form of the type signature of the function:
 ```hs
 convMnistTwoS
-  :: forall kh kw h w c_in c_out n_hidden batch_size d r.
-     ( 1 <= kh             -- kernel height is large enough
-     , 1 <= kw             -- kernel width is large enough
-     , ADModeAndNum d r )  -- differentiation mode and numeric type are known to the engine
-  => -- The two boilerplate lines below tie type parameters to the corresponding
-     -- value parameters (built with SNat) denoting basic dimensions.
-     SNat kh -> SNat kw -> SNat h -> SNat w
-  -> SNat c_in -> SNat c_out -> SNat n_hidden -> SNat batch_size
-  -> OS.Array '[batch_size, c_in, h, w] r
-  -> ( ADVal d (OS.Array '[c_out, c_in, kh + 1, kw + 1] r)
-     , ADVal d (OS.Array '[c_out] r ) )
-  -> ( ADVal d (OS.Array '[c_out, c_out, kh + 1, kw + 1] r)
-     , ADVal d (OS.Array '[c_out] r) )
-  -> ( ADVal d (OS.Array '[ n_hidden
-                          , c_out * (((h + kh) `Div` 2 + kh) `Div` 2)
-                                  * (((w + kw) `Div` 2 + kw) `Div` 2)
-                          ] r)
-     , ADVal d (OS.Array '[n_hidden] r) )
-  -> ( ADVal d (OS.Array '[10, n_hidden] r)
-     , ADVal d (OS.Array '[10] r) )
-  -> ADVal d (OS.Array '[10, batch_size] r)
+  :: forall kh kw h w c_out n_hidden batch_size target r.
+     ( 1 <= kh             -- kernel height is large enough
+     , 1 <= kw             -- kernel width is large enough
+     , ADReady target, GoodScalar r, Differentiable r )
+  => SNat kh -> SNat kw -> SNat h -> SNat w
+  -> SNat c_out -> SNat n_hidden -> SNat batch_size
+       -- ^ these boilerplate lines tie type parameters to the corresponding
+       --   SNat value parameters denoting basic dimensions
+  -> PrimalOf target (TKS '[batch_size, 1, h, w] r)
+  -> ( ( target (TKS '[c_out, 1, kh + 1, kw + 1] r)
+       , target (TKS '[c_out] r) )
+     , ( target (TKS '[c_out, c_out, kh + 1, kw + 1] r)
+       , target (TKS '[c_out] r) )
+     , ( target (TKS '[n_hidden, c_out * (h `Div` 4) * (w `Div` 4) ] r)
+       , target (TKS '[n_hidden] r) )
+     , ( target (TKS '[10, n_hidden] r)
+       , target (TKS '[10] r) ) )
+  -> target (TKS '[SizeMnistLabel, batch_size] r)
 ```

 The full neural network definition from which this function is taken can be found at

 https://github.com/Mikolaj/horde-ad/tree/master/example

-in file `MnistCnnShaped.hs` and the directory contains several other sample neural networks for MNIST digit classification. Among them are recurrent, convolutional and fully connected networks based on fully typed tensors (sizes of all dimensions are tracked in the types, as above) as well as weakly typed fully connected networks built with, respectively, matrices, vectors and raw scalars (working with scalars is the most flexible but slowest; all others have comparable performance on CPU).
+in file `MnistCnnShaped2.hs` and the directory contains several other sample neural networks for MNIST digit classification. Among them are recurrent, convolutional and fully connected networks based on fully typed tensors (sizes of all dimensions are tracked in the types, as above) as well as their weakly typed variants that track only the ranks of tensors. It's possible to mix the two typing styles within one function signature and even within one shape description.


 Compilation from source
 -----------------------

-Because we use [hmatrix] the OS needs libraries that on Ubuntu/Debian
-are called libgsl0-dev, liblapack-dev and libatlas-base-dev.
-See https://github.com/haskell-numerics/hmatrix/blob/master/INSTALL.md
-for information about other OSes.
-Other Haskell packages need their usual C library dependencies,
-as well, e.g., package zlib needs C library zlib1g-dev.
+The Haskell packages [we depend on](https://github.com/Mikolaj/horde-ad/blob/master/horde-ad.cabal) need their usual C library dependencies,
+e.g., package zlib needs the C library zlib1g-dev or an equivalent.
+At this time, we don't depend on any GPU hardware nor bindings.

 For development, copying the included `cabal.project.local.development`
 to `cabal.project.local` provides a sensible default to run `cabal build` with.
 For extensive testing, a command like

-    cabal test minimalTest --enable-optimization -f test_seq
+    cabal test minimalTest --enable-optimization

-ensures that the code is compiled with optimization and so executes the rather
-computation-intensive testsuites in reasonable time.
+ensures that the code is compiled with optimization and consequently
+executes the rather computation-intensive testsuites in reasonable time.


 Running tests
 -------------

-The test suite can run in parallel but, if so, the PP tests need to be disabled:
+The `parallelTest` test suite consists of large tests and runs in parallel

-    cabal test simplifiedOnlyTest --enable-optimization --test-options='-p "! /PP/"'
+    cabal test parallelTest --enable-optimization

-Parallel run may cause the extra printf messages coming from within the tests
-to be out of order. To keep your screen tidy, simply redirect `stderr`,
-e.g. via: `2>/dev/null`:
+which is likely to cause the extra printf messages coming from within
+the tests to be out of order. To keep your screen tidy, simply redirect
+`stderr`, e.g.,

-    cabal test simplifiedOnlyTest --enable-optimization --test-options='-p "! /PP/"' 2>/dev/null
+    cabal test parallelTest --enable-optimization 2>/dev/null

-You can also run the test suite sequentially and then all tests can be included
-and the extra printf messages are displayed fine most of the time:
+The remainder of the test suite is set up to run sequentially to simplify
+automatic verification of results that may vary slightly depending on
+execution order

-    cabal test simplifiedOnlyTest --enable-optimization -f test_seq
+    cabal test CAFlessTest --enable-optimization


 Coding style
@@ -233,8 +206,8 @@ Spaces around arithmetic operators encouraged.
 Generally, relax and try to stick to the style apparent in a file
 you are editing. Put big formatting changes in separate commits.

-Haddocks should be provided for all module headers and for all functions
-and types, or at least main sections, from the most important modules.
+Haddocks should be provided for all module headers and for the main
+functions and types from the most important modules.
 Apart of that, only particularly significant functions and types
 are distinguished by having a haddock. If minor ones have comments,
 they should not be haddocks and they are permitted to describe
@@ -245,11 +218,6 @@ of comments, unless too verbose.
 Copyright
 ---------

-Copyright 2023 Mikolaj Konarski, Well-Typed LLP and others (see git history)
+Copyright 2023--2025 Mikolaj Konarski, Well-Typed LLP and others (see git history)

 License: BSD-3-Clause (see file LICENSE)
-
-
-
-[hmatrix]: https://hackage.haskell.org/package/hmatrix
-[orthotope]: https://hackage.haskell.org/package/orthotope
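The "Computing Jacobians" stub removed above ends with a TODO about showing how the 2x3 Jacobian of `foo` emerges. As a side note only, here is a minimal sketch of that idea using hand-rolled forward-mode dual numbers; it does not use horde-ad's API, and the helper names are made up for illustration. One forward sweep along each unit input direction produces one column of the 2x3 Jacobian.

```hs
-- A (value, directional derivative) pair, propagated forward through foo.
type Dual = (Double, Double)

mulD :: Dual -> Dual -> Dual
mulD (u, u') (v, v') = (u * v, u' * v + u * v')

sinD :: Dual -> Dual
sinD (u, u') = (sin u, cos u * u')

atan2D :: Dual -> Dual -> Dual
atan2D (y, y') (x, x') = (atan2 y x, (x * y' - y * x') / (x * x + y * y))

-- The same computation as `foo (x,y,z) = let w = x * sin y in (atan2 z w, z * w)`.
fooD :: Dual -> Dual -> Dual -> (Dual, Dual)
fooD x y z =
  let w = x `mulD` sinD y
  in (atan2D z w, z `mulD` w)

-- One forward sweep per input direction yields one column of the 2x3 Jacobian.
jacobianColumn :: (Double, Double, Double) -> (Double, Double, Double) -> (Double, Double)
jacobianColumn (x, y, z) (dx, dy, dz) =
  let ((_, d1), (_, d2)) = fooD (x, dx) (y, dy) (z, dz)
  in (d1, d2)

main :: IO ()
main = mapM_ (print . jacobianColumn (1.1, 2.2, 3.3))
             [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
```

Running `main` prints the three columns at a sample point; a reverse-mode tool would instead obtain the two rows, one sweep per output.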

cabal.project.local.development

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
-ignore-project: False
-tests: True
-benchmarks: True
+package horde-ad
+  tests: True
+  benchmarks: True
+
 test-show-details: direct
 optimization: False
-

example/MnistCnnShaped2.hs

Lines changed: 18 additions & 25 deletions
@@ -43,10 +43,8 @@ convMnistLayerS
      ( 1 <= kh
      , 1 <= kw  -- wrongly reported as redundant due to plugins
      , ADReady target, GoodScalar r, Differentiable r )
-  => SNat kh -> SNat kw
-  -> SNat h -> SNat w
-  -> SNat c_in -> SNat c_out
-  -> SNat batch_size
+  => SNat kh -> SNat kw -> SNat h -> SNat w
+  -> SNat c_in -> SNat c_out -> SNat batch_size
   -> target (TKS '[c_out, c_in, kh + 1, kw + 1] r)
   -> target (TKS '[batch_size, c_in, h, w] r)
   -> target (TKS '[c_out] r)
@@ -63,43 +61,38 @@ convMnistLayerS SNat SNat SNat SNat SNat SNat SNat

 convMnistTwoS
   :: forall kh kw h w c_out n_hidden batch_size target r.
-     -- @h@ and @w@ are fixed for MNIST, but may be different, e.g., in tests
-     ( 1 <= kh  -- kernel height is large enough
-     , 1 <= kw  -- kernel width is large enough
+     -- @h@ and @w@ are fixed with MNIST data, but not with test data
+     ( 1 <= kh  -- kernel height is large enough
+     , 1 <= kw  -- kernel width is large enough
      , ADReady target, GoodScalar r, Differentiable r )
-  => SNat kh -> SNat kw
-  -> SNat h -> SNat w
+  => SNat kh -> SNat kw -> SNat h -> SNat w
   -> SNat c_out -> SNat n_hidden -> SNat batch_size
        -- ^ these boilerplate lines tie type parameters to the corresponding
-       --   value parameters (@SNat@ below) denoting basic dimensions
+       --   SNat value parameters denoting basic dimensions
   -> PrimalOf target (TKS '[batch_size, 1, h, w] r)  -- ^ input images
   -> ADCnnMnistParametersShaped target h w kh kw c_out n_hidden r
-  -> target (TKS '[SizeMnistLabel, batch_size] r)  -- ^ classification
-convMnistTwoS kh@SNat kw@SNat
-              h@SNat w@SNat
+       -- ^ parameters
+  -> target (TKS '[SizeMnistLabel, batch_size] r)  -- ^ output classification
+convMnistTwoS kh@SNat kw@SNat h@SNat w@SNat
               c_out@SNat _n_hidden@SNat batch_size@SNat
               input
               ( (ker1, bias1), (ker2, bias2)
               , (weightsDense, biasesDense), (weightsReadout, biasesReadout) ) =
   gcastWith (unsafeCoerceRefl :: Div (Div w 2) 2 :~: Div w 4) $
   gcastWith (unsafeCoerceRefl :: Div (Div h 2) 2 :~: Div h 4) $
-  let t1 = convMnistLayerS kh kw
-                           h w
+  let t1 = convMnistLayerS kh kw h w
                            (SNat @1) c_out batch_size
                            ker1 (sfromPrimal input) bias1
-      t2 :: target (TKS '[batch_size, c_out, h `Div` 4, w `Div` 4] r)
-      t2 = convMnistLayerS kh kw
-                           (SNat @(h `Div` 2)) (SNat @(w `Div` 2))
+      -- t2 :: target (TKS '[batch_size, c_out, h `Div` 4, w `Div` 4] r)
+      t2 = convMnistLayerS kh kw (SNat @(h `Div` 2)) (SNat @(w `Div` 2))
                            c_out c_out batch_size
                            ker2 t1 bias2
-      m1 :: target (TKS '[batch_size, c_out * (h `Div` 4) * (w `Div` 4)] r)
+      -- m1 :: target (TKS '[batch_size, c_out * (h `Div` 4) * (w `Div` 4)] r)
       m1 = sreshape t2
-      m2 = str m1
-      denseLayer = weightsDense `smatmul2` m2
-                   + str (sreplicate {-@batch_size-} biasesDense)
-      denseRelu = reluS denseLayer
-  in weightsReadout `smatmul2` denseRelu
-     + str (sreplicate {-@batch_size-} biasesReadout)
+      denseLayer = weightsDense `smatmul2` str m1
+                   + str (sreplicate biasesDense)
+  in weightsReadout `smatmul2` reluS denseLayer
+     + str (sreplicate biasesReadout)

 convMnistLossFusedS
   :: forall kh kw h w c_out n_hidden batch_size target r.
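A note on the `gcastWith (unsafeCoerceRefl :: Div (Div w 2) 2 :~: Div w 4)` lines in the diff above: GHC's solver cannot prove such type-level `Div` identities for an abstract natural, so the equality is asserted by the programmer and brought into scope with `gcastWith`. Below is a minimal self-contained sketch of that idiom, assuming only base; `halveTwice` is a made-up example, not horde-ad code.

```hs
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables, TypeOperators #-}
import Data.Type.Equality
import GHC.TypeLits (Div, Nat)
import Unsafe.Coerce (unsafeCoerce)

-- Assert an equality between types that GHC cannot derive on its own;
-- the burden of correctness is entirely on the programmer.
unsafeCoerceRefl :: a :~: b
unsafeCoerceRefl = unsafeCoerce Refl

-- Use a value indexed by Div (Div n 2) 2 where one indexed by Div n 4 is expected,
-- mirroring the "halve twice equals divide by four" assertions in the commit.
halveTwice :: forall (n :: Nat) f r. f (Div (Div n 2) 2) -> (f (Div n 4) -> r) -> r
halveTwice x k =
  gcastWith (unsafeCoerceRefl :: Div (Div n 2) 2 :~: Div n 4) $
  k x

main :: IO ()
main = putStrLn "halveTwice type-checks: Div (Div n 2) 2 asserted equal to Div n 4"
```

Type-checker plugins can prove some such equalities automatically (the source comment about constraints "wrongly reported as redundant due to plugins" alludes to them); here the code uses an explicit unsafe assertion instead.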
