Commit 2223921

Update the simple examples in README

1 parent 425b97a commit 2223921
File tree: 4 files changed, +207 −62 lines changed

README.md

Lines changed: 49 additions & 26 deletions
@@ -13,58 +13,81 @@ The benchmarks at SOMEWHERE show that this library has performance highly compet
 It is hoped that the (well-typed) separation of AD logic and the tensor manipulation backend will enable similar speedups on numerical accelerators.
 
 
-# WIP: The examples below are outdated and will be replaced soon using a new API
-
-
 ## Computing the derivative of a simple function
 
 Here is an example of a Haskell function to be differentiated:
 
 ```hs
 -- A function that goes from R^3 to R.
-foo :: RealFloat a => (a,a,a) -> a
-foo (x,y,z) =
+foo :: RealFloat a => (a, a, a) -> a
+foo (x, y, z) =
   let w = x * sin y
   in atan2 z w + z * w -- note that w appears twice
 ```
 
-The gradient of `foo` is:
-<!--
-TODO: this may yet get simpler and the names not leaking implementation details
-("delta") so much, when the adaptor gets used at scale and redone.
-Alternatively, we could settle on Double already here.
--->
+The gradient of `foo` instantiated to `Double` is:
 ```hs
-grad_foo :: forall r. (HasDelta r, AdaptableScalar 'ADModeGradient r)
-         => (r, r, r) -> (r, r, r)
-grad_foo = rev @r foo
+gradFooDouble :: (Double, Double, Double) -> (Double, Double, Double)
+gradFooDouble = fromDValue . crev foo . fromValue
 ```
 
-As can be verified by computing the gradient at `(1.1, 2.2, 3.3)`:
+as can be verified by computing the gradient at `(1.1, 2.2, 3.3)`:
 ```hs
->>> grad_foo (1.1 :: Double, 2.2, 3.3)
+>>> gradFooDouble (1.1, 2.2, 3.3)
 (2.4396285219055063, -1.953374825727421, 0.9654825811012627)
 ```
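As a quick sanity check on the value above: with $w = x \sin y$, the chain rule gives

$$\frac{\partial}{\partial x}\bigl(\operatorname{atan2}(z, w) + z\,w\bigr) = \Bigl(z - \frac{z}{w^2 + z^2}\Bigr)\sin y,$$

which at `(1.1, 2.2, 3.3)` evaluates to approximately 2.4396285219, matching the first gradient component printed above.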
 
-As a side note, `w` is processed only once during gradient computation and this property of sharing preservation is guaranteed universally by horde-ad without any action required from the user. The property holds not only for scalar values, but for arbitrary tensors, e.g., those in further examples. We won't mention the property further.
+Instantiated to matrices, the gradient is:
+```hs
+gradFooMatrix :: (Differentiable r, GoodScalar r)
+              => (RepN (TKS '[2, 2] r), RepN (TKS '[2, 2] r), RepN (TKS '[2, 2] r))
+              -> (RepN (TKS '[2, 2] r), RepN (TKS '[2, 2] r), RepN (TKS '[2, 2] r))
+gradFooMatrix = crev foo
+```
 
-<!--
-Do we want yet another example here, before we reach Jacobians or shaped tensors? Perhaps one with the testing infrastructure, e.g., generating a single set of random tensors, or a full QuickCheck example or just a simple
+as can be verified by:
 ```hs
-assertEqualUpToEpsilon 1e-9
-  (6.221706565357043, -12.856908977773593, 6.043601532156671)
-  (rev bar (1.1, 2.2, 3.3))
+>>> gradFooMatrix (srepl 1.1, srepl 2.2, srepl (3.3 :: Double))
+(sfromListLinear [2.4396285219055063,2.4396285219055063,2.4396285219055063,2.4396285219055063],sfromListLinear [-1.953374825727421,-1.953374825727421,-1.953374825727421,-1.953374825727421],sfromListLinear [0.9654825811012627,0.9654825811012627,0.9654825811012627,0.9654825811012627])
 ```
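A brief note on why all four entries of each result matrix coincide: `srepl` above appears to build a constant-filled `[2, 2]` input from a scalar (an assumption based on its use here), and `foo` acts elementwise, so each entry's partial derivative reduces to the corresponding component of the scalar `gradFooDouble (1.1, 2.2, 3.3)` result.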
-? Or is there a risk the reader won't make it to the shaped example below if we tarry here? Or perhaps finish the shaped tensor example below with an invocation of `assertEqualUpToEpsilon`?
--->
+
+Note that `w` is processed only once during gradient computation; this sharing preservation is guaranteed universally by horde-ad for the `crev` tool, without any action required from the user. When computing symbolic derivative programs, however, the user has to explicitly mark values for sharing using `tlet`, with a more specific type for the objective function, as shown below.
+
+```hs
+fooLet :: (RealFloatH (target a), LetTensor target)
+       => (target a, target a, target a) -> target a
+fooLet (x, y, z) =
+  tlet (x * sin y) $ \w ->
+    atan2H z w + z * w
+```
+
+The symbolic derivative program (here presented with additional formatting) can be obtained using the `revArtifactAdapt` tool:
+```hs
+>>> let ftk = FTKS @'[2, 2] [2, 2] (FTKScalar @Double)
+    in printArtifactGradient
+         (fst $ revArtifactAdapt True fooLet (FTKProduct (FTKProduct ftk ftk) ftk))
+"\m6 m1 ->
+ let m3 = sin (tproject2 (tproject1 m1))
+     m4 = tproject1 (tproject1 m1) * m3
+     m5 = recip (tproject2 m1 * tproject2 m1 + m4 * m4)
+     m7 = (negate (tproject2 m1) * m5) * m6 + tproject2 m1 * m6
+ in tpair
+      ( tpair (m3 * m7, cos (tproject2 (tproject1 m1)) * (tproject1 (tproject1 m1) * m7))
+      , (m4 * m5) * m6 + m4 * m6)"
+```
+
+A quick inspection of the derivative program reveals that computations are not repeated, thanks to sharing. A concrete value of the symbolic derivative can be obtained by interpreting the derivative program in the context of the operations supplied by the horde-ad library; the value should be the same as when evaluating `fooLet` with `crev` on the concrete input, as before. A shorthand that creates the symbolic derivative program and evaluates it at a given input is called `rev`; it is used exactly like `crev`, but with potentially better performance.
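For instance, a minimal sketch, assuming `rev` really is a drop-in replacement for `crev` as the paragraph above states:

```hs
-- Hedged sketch: `rev` applied like `crev`, to the same input triple
-- as the matrix example earlier; the expected output is the same three
-- constant-filled matrices that `gradFooMatrix` produced.
>>> rev fooLet (srepl 1.1, srepl 2.2, srepl (3.3 :: Double))
```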
+
+
+# WIP: The examples below are outdated and will be replaced soon using a new API
 
 
-<!--
 ## Computing Jacobians
 
 -- TODO: we can have vector/matrix/tensor codomains, but not pair codomains
 -- until #68 is done;
 -- perhaps a vector codomain example, with a 1000x3 Jacobian, would make sense?
+-- 2 years later: actually, we can now have TKProduct codomains.
 
 Now let's consider a function from `R^n` to `R^m`. We don't want the gradient, but instead the Jacobian.
 ```hs
@@ -75,7 +98,7 @@ foo (x,y,z) =
   in (atan2 z w, z * w)
 ```
 TODO: show how the 2x3 Jacobian emerges from here
--->
+
 
 
 ## Forall shapes and sizes
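Regarding the TODO above, one way the 2x3 Jacobian could emerge, sketched under two assumptions: the pair-returning `foo` from this section is in scope, and tuple projections compose with the `crev` adaptor pipeline exactly as in the earlier gradient example (`jacobianRows` is a hypothetical name):

```hs
-- Each row of the Jacobian of an R^3 -> R^2 function is the gradient
-- of one scalar output component, obtainable by reverse mode.
jacobianRows :: (Double, Double, Double)
             -> ( (Double, Double, Double)   -- row for atan2 z w
                , (Double, Double, Double) ) -- row for z * w
jacobianRows inp =
  ( fromDValue (crev (fst . foo) (fromValue inp))
  , fromDValue (crev (snd . foo) (fromValue inp)) )
```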

src/HordeAd/Core/CarriersConcrete.hs

Lines changed: 99 additions & 1 deletion
@@ -114,7 +114,31 @@ instance (Nested.NumElt r, Nested.PrimElt r, Eq r, IntegralH r)
              , either (V.replicate (V.length x)) id y' )
        in V.zipWith
             (\a b -> if b == 0 then 0 else remH a b) x y)))
-    -- TODO: do better somehow
+    -- TODO: do better somehow'
+
+instance GoodScalar r
+         => Real (Nested.Ranked n r) where
+  toRational = error "horde-ad: operation not defined for tensor"
+
+instance GoodScalar r
+         => Real (Nested.Shaped sh r) where
+  toRational = error "horde-ad: operation not defined for tensor"
+
+instance GoodScalar r
+         => Real (Nested.Mixed sh r) where
+  toRational = error "horde-ad: operation not defined for tensor"
+
+instance (GoodScalar r, Nested.FloatElt r)
+         => RealFrac (Nested.Ranked n r) where
+  properFraction = error "horde-ad: operation not defined for tensor"
+
+instance (GoodScalar r, RealFrac r, Nested.FloatElt r)
+         => RealFrac (Nested.Shaped sh r) where
+  properFraction = error "horde-ad: operation not defined for tensor"
+
+instance (GoodScalar r, Nested.FloatElt r)
+         => RealFrac (Nested.Mixed sh r) where
+  properFraction = error "horde-ad: operation not defined for tensor"
 
 instance (Nested.NumElt r, Nested.PrimElt r, RealFloatH r, Nested.FloatElt r)
          => RealFloatH (Nested.Ranked n r) where
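These stub instances are presumably needed only because `Real` and `RealFrac` sit on the superclass path to `RealFloat`, whose `atan2` is the one method given a meaningful definition in the instances added below. For reference, the standard Prelude class heads:

```hs
class (Num a, Ord a) => Real a where
  toRational :: a -> Rational

class (Real a, Fractional a) => RealFrac a where
  properFraction :: Integral b => a -> (b, a)
  -- ... truncate, round, ceiling, floor

class (RealFrac a, Floating a) => RealFloat a where
  atan2 :: a -> a -> a
  -- ... floatRadix, floatDigits, decodeFloat, isNaN, etc.
```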
@@ -157,6 +181,77 @@ instance (Nested.NumElt r, Nested.PrimElt r, RealFloatH r, Nested.FloatElt r)
              , either (V.replicate (V.length x)) id y' )
        in V.zipWith atan2H x y))) -- TODO: do better somehow
 
+instance (GoodScalar r, Nested.PrimElt r, RealFloat r, Nested.FloatElt r)
+         => RealFloat (Nested.Ranked n r) where
+  atan2 = Nested.Internal.arithPromoteRanked2
+            (Nested.Internal.mliftNumElt2
+               (flip Nested.Internal.Arith.liftVEltwise2
+                  (\x' y' ->
+                     let (x, y) = case (x', y') of
+                           (Left x2, Left y2) ->
+                             (V.singleton x2, V.singleton y2)
+                           _ ->
+                             ( either (V.replicate (V.length y)) id x'
+                             , either (V.replicate (V.length x)) id y' )
+                     in V.zipWith atan2 x y))) -- TODO: do better somehow
+  floatRadix = error "horde-ad: operation not defined for tensor"
+  floatDigits = error "horde-ad: operation not defined for tensor"
+  floatRange = error "horde-ad: operation not defined for tensor"
+  decodeFloat = error "horde-ad: operation not defined for tensor"
+  encodeFloat = error "horde-ad: operation not defined for tensor"
+  isNaN = error "horde-ad: operation not defined for tensor"
+  isInfinite = error "horde-ad: operation not defined for tensor"
+  isDenormalized = error "horde-ad: operation not defined for tensor"
+  isNegativeZero = error "horde-ad: operation not defined for tensor"
+  isIEEE = error "horde-ad: operation not defined for tensor"
+
+instance (GoodScalar r, Nested.PrimElt r, RealFloat r, Nested.FloatElt r)
+         => RealFloat (Nested.Shaped sh r) where
+  atan2 = Nested.Internal.arithPromoteShaped2
+            (Nested.Internal.mliftNumElt2
+               (flip Nested.Internal.Arith.liftVEltwise2
+                  (\x' y' ->
+                     let (x, y) = case (x', y') of
+                           (Left x2, Left y2) ->
+                             (V.singleton x2, V.singleton y2)
+                           _ ->
+                             ( either (V.replicate (V.length y)) id x'
+                             , either (V.replicate (V.length x)) id y' )
+                     in V.zipWith atan2 x y))) -- TODO: do better somehow
+  floatRadix = error "horde-ad: operation not defined for tensor"
+  floatDigits = error "horde-ad: operation not defined for tensor"
+  floatRange = error "horde-ad: operation not defined for tensor"
+  decodeFloat = error "horde-ad: operation not defined for tensor"
+  encodeFloat = error "horde-ad: operation not defined for tensor"
+  isNaN = error "horde-ad: operation not defined for tensor"
+  isInfinite = error "horde-ad: operation not defined for tensor"
+  isDenormalized = error "horde-ad: operation not defined for tensor"
+  isNegativeZero = error "horde-ad: operation not defined for tensor"
+  isIEEE = error "horde-ad: operation not defined for tensor"
+
+instance (GoodScalar r, Nested.PrimElt r, RealFloat r, Nested.FloatElt r)
+         => RealFloat (Nested.Mixed sh r) where
+  atan2 = (Nested.Internal.mliftNumElt2
+             (flip Nested.Internal.Arith.liftVEltwise2
+                (\x' y' ->
+                   let (x, y) = case (x', y') of
+                         (Left x2, Left y2) ->
+                           (V.singleton x2, V.singleton y2)
+                         _ ->
+                           ( either (V.replicate (V.length y)) id x'
+                           , either (V.replicate (V.length x)) id y' )
+                   in V.zipWith atan2 x y))) -- TODO: do better somehow
+  floatRadix = error "horde-ad: operation not defined for tensor"
+  floatDigits = error "horde-ad: operation not defined for tensor"
+  floatRange = error "horde-ad: operation not defined for tensor"
+  decodeFloat = error "horde-ad: operation not defined for tensor"
+  encodeFloat = error "horde-ad: operation not defined for tensor"
+  isNaN = error "horde-ad: operation not defined for tensor"
+  isInfinite = error "horde-ad: operation not defined for tensor"
+  isDenormalized = error "horde-ad: operation not defined for tensor"
+  isNegativeZero = error "horde-ad: operation not defined for tensor"
+  isIEEE = error "horde-ad: operation not defined for tensor"
+
 
 -- * RepORArray and its operations
 
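The `Either`-based plumbing inside the three `atan2` definitions above is dense; here is a standalone, simplified model of just that broadcast convention (an editor's sketch with hypothetical names, not code from this commit): `Left` carries a scalar, `Right` a whole vector, and a scalar is replicated to the other operand's length.

```hs
import qualified Data.Vector.Storable as V

-- Model of the scalar/vector broadcast used by the lifted atan2:
-- two scalars yield a singleton; otherwise the scalar side (if any)
-- is replicated to the vector side's length before zipping.
broadcastZip :: (Double -> Double -> Double)
             -> Either Double (V.Vector Double)
             -> Either Double (V.Vector Double)
             -> V.Vector Double
broadcastZip f (Left x) (Left y) = V.singleton (f x y)
broadcastZip f x' y' =
  let x = either (V.replicate (V.length y)) id x'
      y = either (V.replicate (V.length x)) id y'
  in V.zipWith f x y

-- Example: broadcastZip atan2 (Left 3.3) (Right (V.fromList [0.5, 1.0]))
-- replicates 3.3 to length 2 and applies atan2 pointwise.
```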
@@ -286,9 +381,12 @@ deriving instance Eq (RepORArray y) => Eq (RepN y)
 deriving instance Ord (RepORArray y) => Ord (RepN y)
 deriving instance Num (RepORArray y) => Num (RepN y)
 deriving instance IntegralH (RepORArray y) => IntegralH (RepN y)
+deriving instance Real (RepORArray y) => Real (RepN y)
 deriving instance Fractional (RepORArray y) => Fractional (RepN y)
 deriving instance Floating (RepORArray y) => Floating (RepN y)
+deriving instance RealFrac (RepORArray y) => RealFrac (RepN y)
 deriving instance RealFloatH (RepORArray y) => RealFloatH (RepN y)
+deriving instance RealFloat (RepORArray y) => RealFloat (RepN y)
 
 rtoVector :: GoodScalar r => RepN (TKR n r) -> VS.Vector r
 rtoVector = Nested.rtoVector . unRepN
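With the three new `deriving` lines, concrete tensors pick up full `RealFloat` support; a minimal sketch of what now type-checks (`atan2Demo` is a hypothetical name, and this assumes `RepORArray (TKR n Double)` resolves to the `Nested.Ranked` type given its `RealFloat` instance above):

```hs
-- Elementwise atan2 on concrete ranked tensors, obtained for free
-- via `deriving instance RealFloat (RepORArray y) => RealFloat (RepN y)`.
atan2Demo :: RepN (TKR 1 Double) -> RepN (TKR 1 Double) -> RepN (TKR 1 Double)
atan2Demo = atan2
```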

src/HordeAd/Core/Types.hs

Lines changed: 1 addition & 1 deletion
@@ -147,7 +147,7 @@ class GoodScalarConstraint r => GoodScalar r
 instance GoodScalarConstraint r => GoodScalar r
 
 type Differentiable r =
-  (RealFloatH r, Nested.FloatElt r, RealFrac r, Random r)
+  (RealFloatH r, Nested.FloatElt r, RealFrac r, RealFloat r, Random r)
 
 -- We white-list all types on which we permit differentiation (e.g., SGD)
 -- to work. This is for technical typing purposes and imposes updates
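The practical effect of adding `RealFloat r` to the constraint list, as a hedged sketch (`angleDemo` is a hypothetical name): code polymorphic over any white-listed differentiable scalar can now call full `RealFloat` methods such as `atan2` directly.

```hs
-- Compiles for every scalar satisfying Differentiable, e.g. Double or Float.
angleDemo :: Differentiable r => r -> r -> r
angleDemo y x = atan2 y x
```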
