1 file changed: +9 -5 lines changed
@@ -40,11 +40,15 @@ NEONKernel12x4Depth2 kernel, which specifies its format as

The meaning of these terms is explained in the lengthy comment at the top of
internal/kernel.h. Here, they mean that this kernel handles at each iteration
- (along the depth dimension): - 3 'cells' of size 4x2 each of the lhs, so a total
- lhs block of size 12x2 - 1 'cell' of size 2x4 of the rhs. In other words, this
- kernel handles 12 rows of the lhs and 4 columns of the rhs, and handles two
- levels of depth at once. The 'cells' and `CellFormat` detail the layout of these
- 12x2 and 2x4 blocks.
+ (along the depth dimension):
+
+ - 3 'cells' of size 4x2 each of the lhs, so a total lhs block of size 12x2
+
+ - 1 'cell' of size 2x4 of the rhs.
+
+ In other words, this kernel handles 12 rows of the lhs and 4 columns of the
+ rhs, and handles two levels of depth at once. The 'cells' and `CellFormat`
+ detail the layout of these 12x2 and 2x4 blocks.

This kernel then loads these 12x2 and 2x4 blocks and computes the corresponding
12x4 GEMM; for ease of reference let us paste the critical comment and code
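
Note on the hunk above: the block arithmetic it describes (3 lhs cells of 4x2 giving a 12x2 block, one 2x4 rhs cell, a 12x4 result, two levels of depth per iteration) can be sketched in plain C++. This is only an illustrative scalar sketch, not the NEON kernel itself. All names (`LhsBlock`, `RhsBlock`, `Accum`, `AccumulateBlock`, the `k...` constants) are hypothetical, the 8-bit inputs and 32-bit accumulators are an assumption, and the simple row-major arrays do not reflect the packed in-memory layout that `CellFormat` specifies.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical constants for illustration; not identifiers from the library.
constexpr int kCellRows = 4;                  // each lhs cell is 4x2
constexpr int kLhsCells = 3;                  // 3 lhs cells per iteration
constexpr int kRows = kCellRows * kLhsCells;  // 12 rows of the lhs
constexpr int kCols = 4;                      // 4 columns of the rhs (one 2x4 cell)
constexpr int kDepth = 2;                     // two levels of depth at once

static_assert(kRows == 12, "3 cells of 4 rows form a 12x2 lhs block");

// Assumed element types: 8-bit inputs, 32-bit accumulators.
using LhsBlock = std::array<std::array<std::uint8_t, kDepth>, kRows>;  // 12x2
using RhsBlock = std::array<std::array<std::uint8_t, kCols>, kDepth>;  // 2x4
using Accum = std::array<std::array<std::int32_t, kCols>, kRows>;      // 12x4

// One iteration along the depth dimension: acc += lhs * rhs.
void AccumulateBlock(const LhsBlock& lhs, const RhsBlock& rhs, Accum& acc) {
  for (int cell = 0; cell < kLhsCells; ++cell) {
    for (int r = 0; r < kCellRows; ++r) {
      const int row = cell * kCellRows + r;  // row within the 12x2 lhs block
      assert(row < kRows);
      for (int c = 0; c < kCols; ++c) {
        for (int d = 0; d < kDepth; ++d) {
          acc[row][c] += static_cast<std::int32_t>(lhs[row][d]) *
                         static_cast<std::int32_t>(rhs[d][c]);
        }
      }
    }
  }
}
```

Calling `AccumulateBlock` repeatedly on successive 12x2 and 2x4 blocks, with the same accumulator, would walk the full depth dimension two levels at a time.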