Commit f7ed7fc

[Docs] Update deeprec2304 release images and notes in README.md & RELEASE.md. (DeepRec-AI#865)
Signed-off-by: Tongxuan Liu <[email protected]>
1 parent 8b90640 commit f7ed7fc

8 files changed: 115 additions and 11 deletions.

README.md

Lines changed: 3 additions & 3 deletions
@@ -4,7 +4,7 @@
 --------------------------------------------------------------------------------
 
 ## **Introduction**
-DeepRec is a high-performance recommendation deep learning framework based on [TensorFlow 1.15](https://www.tensorflow.org/), [Intel-TensorFlow](https://github.com/Intel-tensorflow/tensorflow) and [NVIDIA-TensorFlow](https://github.com/NVIDIA/tensorflow).
+DeepRec is a high-performance recommendation deep learning framework based on [TensorFlow 1.15](https://www.tensorflow.org/), [Intel-TensorFlow](https://github.com/Intel-tensorflow/tensorflow) and [NVIDIA-TensorFlow](https://github.com/NVIDIA/tensorflow). It is hosted in incubation in LF AI & Data Foundation.
 
 
 ### **Background**
@@ -95,13 +95,13 @@ $ pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+${version}-cp38-cp38m-linux
 #### Image for CPU
 
 ```
-alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu20.04
+alideeprec/deeprec-release:deeprec2304-cpu-py38-ubuntu20.04
 ```
 
 #### Image for GPU CUDA11.6
 
 ```
-alideeprec/deeprec-release:deeprec2302-gpu-py38-cu116-ubuntu20.04
+alideeprec/deeprec-release:deeprec2304-gpu-py38-cu116-ubuntu20.04
 ```
 
 ***

RELEASE.md

Lines changed: 104 additions & 0 deletions
@@ -1,3 +1,107 @@
+# Release r1.15.5-deeprec2304
+
+## **Major Features and Improvements**
+
+### **Embedding**
+
+- Support tf.int32 dtype using the feature_column API `tf.feature_column.categorical_column_with_embedding`.
+- Make the rules for exporting frequencies and versions the same as the rule for exporting keys.
+- Optimize CUDA kernel implementation in GroupEmbedding.
+- Support reading embedding files with mmap and madvise, and with direct IO.
+- Add double check in find_wait_free of lockless dense hashmap.
+- Change the init value of version in EV from 0 to -1.
+- Make the 'GetSnapshot()' interface backward compatible.
+- Implement CPU GroupEmbedding lookup sparse Op.
+- Make GroupEmbedding compatible with sequence feature_column interface.
+- Fix sp_weights indices calculation error in GroupEmbedding.
+- Add group_strategy to control parallelism of group_embedding.
+
+### **Graph & Grappler Optimization**
+
+- Support SparseTensor as placeholder in Sample-aware Graph Compression.
+- Add Dice fusion grappler and ops.
+- Enable MKL Matmul + Bias + LeakyRelu fusion.
+
+### **Runtime Optimization**
+
+- Avoid unnecessary polling in EventMgr.
+- Reduce lock cost and memory usage in EventMgr when using multi-stream.
+
+### **Ops & Hardware Acceleration**
+
+- Register GPU implementation of int64 type for Prod.
+- Register GPU implementation of string type for Shape, ShapeN and ExpandDims.
+- Optimize list of GPU SegmentReductionOps.
+- Optimize zeros_like_impl by reducing calls to convert_to_tensor.
+- Implement GPU version of SparseSlice Op.
+- Delay Reshape when rank > 2 in keras.layers.Dense so that post op can be fused with MatMul.
+- Implement setting max_num_threads hint to oneDNN at compile time.
+- Implement TensorPackTransH2DOp to improve SmartStage performance on GPU.
+
+### **IO**
+
+- Add tensor shape meta-data support for ParquetDataset.
+- Add arrow BINARY type support for ParquetDataset.
+
+### **Serving**
+
+- Add Dice fusion to inference mode.
+- Enable INFERENCE_MODE in processor.
+- Support TensorRT 8.x in Inference.
+- Add a configuration field to control whether TensorRT is enabled.
+- Add flag for device_placement_optimization.
+- Avoid clustering feature-column-related nodes when TensorRT is enabled.
+- Optimize inference latency when loading an incremental checkpoint.
+- Optimize performance by placing only TensorRT ops on the GPU device.
+
+### **Environment & Build**
+
+- Support CUDA 12.
+- Update DEFAULT_CUDA_VERSION and DEFAULT_CUDNN_VERSION in configure.py.
+- Move thirdparties from WORKSPACE to workspace.bzl.
+- Update urls corresponding to colm, ragel, aliyun-oss-sdk and uuid.
+
+### **BugFix**
+
+- Fix constant op placing bug for device placement optimization.
+- Fix NaN issue that occurred in the group_embedding API.
+- Fix the issue that SOK is not compatible with variable.
+- Fix memory leak when updating the full model in serving.
+- Fix the issue that 'cols_to_output_tensors' is not set in GroupEmbedding.
+- Fix core dump issue when saving GPU EmbeddingVariable.
+- Fix CUDA resource issue in KvResourceImportV3 kernel.
+- Fix the bug in loading signature_def with coo_sparse and add a unit test.
+- Fix the bug that the training ends early when the workqueue is enabled.
+- Fix the control edge connection issue in device placement optimization.
+
+### **ModelZoo**
+
+- Modify GroupEmbedding related function usage.
+- Update masknet example with layernorm.
+
+### **Tool & Documents**
+
+- Add tools for removing filtered features from a checkpoint.
+- Add Arm Compute Library (ACL) user documents.
+- Update Embedding Variable document to fix initializer config example.
+- Update GroupEmbedding document.
+- Update processor documents.
+- Add user documents for Intel AMX.
+- Add TensorRT usage documents.
+- Update documents for ParquetDataset.
+
+More details of features: [https://deeprec.readthedocs.io/zh/latest/](https://deeprec.readthedocs.io/zh/latest/)
+
+## **Release Images**
+
+### **CPU Image**
+
+`alideeprec/deeprec-release:deeprec2304-cpu-py38-ubuntu20.04`
+
+### **GPU Image**
+
+`alideeprec/deeprec-release:deeprec2304-gpu-py38-cu116-ubuntu20.04`
+
 # Release r1.15.5-deeprec2302
 
 ## **Major Features and Improvements**
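
The Embedding notes in RELEASE.md mention reading embedding files with mmap and madvise. As a rough illustration of that general technique only (not DeepRec's implementation), the Python sketch below maps a file read-only, passes an access-pattern hint to the kernel, and reads one row. The flat little-endian float32 layout and the name `read_embedding_row` are assumptions made for this example; the release notes do not describe DeepRec's on-disk format.

```python
import mmap
import os
import struct
import tempfile


def read_embedding_row(path, row, dim):
    """Return row `row` of a flat float32 file holding `dim` values per row.

    The file layout is hypothetical and chosen only for this sketch.
    """
    row_bytes = dim * 4  # 4 bytes per float32 value
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # madvise is only a hint and is missing on some platforms,
            # so guard the call instead of assuming it exists.
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_RANDOM"):
                mm.madvise(mmap.MADV_RANDOM)
            start = row * row_bytes
            return struct.unpack(f"<{dim}f", mm[start:start + row_bytes])


if __name__ == "__main__":
    # Write three rows of two values each, then read the middle row back.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(struct.pack("<6f", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0))
    print(read_embedding_row(tmp.name, 1, 2))
    os.remove(tmp.name)
```

Only the pages that back the requested row need to be faulted in, which is the usual motivation for mapping a large lookup table instead of reading it whole.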

docs/docs_en/DeepRec-Compile-And-Install.md

Lines changed: 2 additions & 2 deletions
@@ -112,7 +112,7 @@ pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+${version}-cp38-cp38m-linux_x
 
 x86_64:
 ```
-alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu20.04
+alideeprec/deeprec-release:deeprec2304-cpu-py38-ubuntu20.04
 ```
 
 arm64:
@@ -123,5 +123,5 @@ alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu22.04-arm64
 **GPU Image with CUDA 11.6**
 
 ```
-alideeprec/deeprec-release:deeprec2302-gpu-py38-cu116-ubuntu20.04
+alideeprec/deeprec-release:deeprec2304-gpu-py38-cu116-ubuntu20.04
 ```

docs/docs_en/Estimator-Compile-And-Install.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ DeepRec provide new distributed protocols such as grpc++ and star_server, which
 
 Source Code: [https://github.com/DeepRec-AI/estimator](https://github.com/DeepRec-AI/estimator)
 
-Develop Branch:master, Latest Release Branch: deeprec2302
+Develop Branch:master, Latest Release Branch: deeprec2304
 
 ## Estimator Build

docs/docs_en/TFServing-Compile-And-Install.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ We provide optimized TFServing which could highly improve performance in inferen
 
 Source Code: [https://github.com/DeepRec-AI/serving](https://github.com/DeepRec-AI/serving)
 
-Develop Branch: master, Latest Release Branch: deeprec2302
+Develop Branch: master, Latest Release Branch: deeprec2304
 
 ## TFServing Build

docs/docs_zh/DeepRec-Compile-And-Install.md

Lines changed: 2 additions & 2 deletions
@@ -111,7 +111,7 @@ pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+${version}-cp38-cp38m-linux_x
 
 x86_64:
 ```
-alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu20.04
+alideeprec/deeprec-release:deeprec2304-cpu-py38-ubuntu20.04
 ```
 
 arm64:
@@ -122,7 +122,7 @@ alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu22.04-arm64
 **GPU Image with CUDA 11.6**
 
 ```
-alideeprec/deeprec-release:deeprec2302-gpu-py38-cu116-ubuntu20.04
+alideeprec/deeprec-release:deeprec2304-gpu-py38-cu116-ubuntu20.04
 ```
 
 ## DeepRec Processor Build and Packaging

docs/docs_zh/Estimator-Compile-And-Install.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@
 
 Source Code: [https://github.com/DeepRec-AI/estimator](https://github.com/DeepRec-AI/estimator)
 
-Develop Branch: master, Latest Release Branch: deeprec2302
+Develop Branch: master, Latest Release Branch: deeprec2304
 
 ## Estimator Build

docs/docs_zh/TFServing-Compile-And-Install.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@
 
 Source Code: [https://github.com/DeepRec-AI/serving](https://github.com/DeepRec-AI/serving)
 
-Develop Branch: master, Latest Release Branch: deeprec2302
+Develop Branch: master, Latest Release Branch: deeprec2304
 
 ## TFServing Build & Packaging
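
Across the eight files, the commit mostly swaps the tag `deeprec2302` for `deeprec2304`, while the arm64 image line in the DeepRec-Compile-And-Install documents stays on `deeprec2302` (those files report only 2 additions and 2 deletions, and the hunk header still shows the old arm64 tag). A minimal Python sketch of such a bump, assuming a hypothetical helper named `bump_release_tag` that skips any line ending in `-arm64`, could look like this; it is an illustration, not the script used for this commit.

```python
import re


def bump_release_tag(text, old="deeprec2302", new="deeprec2304"):
    """Replace `old` with `new` on every line that does not end in -arm64."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    out = []
    for line in text.splitlines(keepends=True):
        # arm64 images were left untouched by the commit, so skip them here.
        if not line.rstrip().endswith("-arm64"):
            line = pattern.sub(new, line)
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    sample = (
        "alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu20.04\n"
        "alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu22.04-arm64\n"
        "Develop Branch: master, Latest Release Branch: deeprec2302\n"
    )
    print(bump_release_tag(sample), end="")
```

Skipping by line suffix is a deliberately simple rule; a real release script would more likely work from an explicit list of images that were published for the new tag.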
