Commit d65f685

[CI] Upgrade DeepRec docker to 2302 (#141)
Signed-off-by: yuanman.ym <yuanman.ym@alibaba-inc.com>
1 parent ad3fc62 commit d65f685

20 files changed, 169 additions and 190 deletions

.github/helm/values.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
- image: registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-tf1.15-py3.8-cu118-ubuntu20.04
+ image: registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-tf1.15-py3.8-cu121-ubuntu20.04
  port: 20000
  gpus: 2
  caps: ["SYS_ADMIN", "SYS_PTRACE"]

.github/workflows/cibuild.yaml

Lines changed: 5 additions & 5 deletions
@@ -57,7 +57,7 @@ jobs:
  gpu-cibuild:
  name: "Run Tests on GPU w/ TF1"
  runs-on: ubuntu-latest
- environment: tf1.15-py3.8-cu118-ubuntu20.04
+ environment: tf1.15-py3.8-cu121-ubuntu20.04
  concurrency:
  group: gpu-cibuild-${{ github.workflow }}-${{ github.head_ref }}
  cancel-in-progress: true
@@ -76,7 +76,7 @@ jobs:
  - name: Upload
  run: |-
  helm install ${JOBNAME}-gpu .github/helm/ \
- --set image=registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-tf1.15-py3.8-cu118-ubuntu20.04 \
+ --set image=registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-tf1.15-py3.8-cu121-ubuntu20.04 \
  --set build=rc${{ github.run_id }} \
  --set gpus=2 && \
  .github/helm/upload ${JOBNAME}-gpu-chief-0
@@ -103,9 +103,9 @@ jobs:
  run: |-
  helm uninstall ${JOBNAME}-gpu
  deeprec-cibuild:
- name: "Run Tests on GPU w/ DeepRec2212"
+ name: "Run Tests on GPU w/ DeepRec"
  runs-on: ubuntu-latest
- environment: deeprec2212-py3.6-cu114-ubuntu18.04
+ environment: deeprec-py3.6-cu114-ubuntu18.04
  concurrency:
  group: deeprec-cibuild-${{ github.workflow }}-${{ github.head_ref }}
  cancel-in-progress: true
@@ -124,7 +124,7 @@ jobs:
  - name: Upload
  run: |-
  helm install ${JOBNAME}-deeprec .github/helm/ \
- --set image=registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-deeprec2212-py3.6-cu114-ubuntu18.04 \
+ --set image=registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-deeprec-py3.6-cu114-ubuntu18.04 \
  --set build=rc${{ github.run_id }} \
  --set gpus=2 && \
  .github/helm/upload ${JOBNAME}-deeprec-chief-0

.github/workflows/gpu-nightly.yaml

Lines changed: 2 additions & 2 deletions
@@ -3,14 +3,14 @@ name: nightly deploy on gpu
  on: workflow_dispatch

  env:
- IMAGE: registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-tf1.15-py3.8-cu118-ubuntu20.04
+ IMAGE: registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-tf1.15-py3.8-cu121-ubuntu20.04
  JOBNAME: hbci-${{ github.run_id }}
  PODNAME: hbci-${{ github.run_id }}-chief-0

  jobs:
  deploy:
  runs-on: ubuntu-latest
- environment: tf1.15-py3.8-cu118-ubuntu20.04
+ environment: tf1.15-py3.8-cu121-ubuntu20.04
  steps:
  - name: Checkout Code
  uses: actions/checkout@v3

.github/workflows/gpu.yaml

Lines changed: 2 additions & 2 deletions
@@ -3,14 +3,14 @@ name: release deploy on gpu
  on: workflow_dispatch

  env:
- IMAGE: registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-tf1.15-py3.8-cu118-ubuntu20.04
+ IMAGE: registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:developer-tf1.15-py3.8-cu121-ubuntu20.04
  JOBNAME: hbci-${{ github.run_id }}
  PODNAME: hbci-${{ github.run_id }}-chief-0

  jobs:
  deploy:
  runs-on: ubuntu-latest
- environment: tf1.15-py3.8-cu118-ubuntu20.04
+ environment: tf1.15-py3.8-cu121-ubuntu20.04
  steps:
  - name: Checkout Code
  uses: actions/checkout@v3

BUILD.md

Lines changed: 9 additions & 22 deletions
@@ -75,13 +75,6 @@ ARROW_S3=ON \
  ./build.sh
  ```

- Build & install sparsehash:
-
- ```bash
- cd build/sparsehash
- ./build.sh
- ```
-
  Install TensorFlow and other requirements, see
  [Dockerfiles](build/dockerfiles/) for more detail.

@@ -120,13 +113,6 @@ ARROW_S3=ON \
  ./build.sh
  ```

- Build & install sparsehash:
-
- ```bash
- cd build/sparsehash
- ./build.sh
- ```
-
  Install TensorFlow and other requirements:

  ```bash
@@ -155,16 +141,17 @@ export HYBRIDBACKEND_USE_CXX11_ABI=0

  # Set path of thridparty libraries.
  export PYTHON=python3.7
- export PYTHON_HOME=/usr/local/opt/python@3.7/Frameworks/Python.framework/Versions/Current
+ export PYTHON_INCLUDE=/usr/local/opt/python@3.7/Frameworks/Python.framework/Versions/Current/include
+ export PYTHON_LIB=/usr/local/opt/python@3.7/Frameworks/Python.framework/Versions/Current/lib
  export PYTHON_IMPL=python3.7
  export PYTHON_IMPL_FLAG=m
- export SSL_HOME=/usr/local/opt/openssl@1.1
- export RE2_HOME=/usr/local/opt/re2
- export THRIFT_HOME=/usr/local/opt/thrift
- export UTF8PROC_HOME=/usr/local/opt/utf8proc
- export SNAPPY_HOME=/usr/local/opt/snappy
- export ZSTD_HOME=/usr/local/opt/zstd
- export ZLIB_HOME=/usr/local/opt/zlib
+ export SSL_LIB=/usr/local/opt/openssl@1.1/lib
+ export RE2_LIB=/usr/local/opt/re2/lib
+ export THRIFT_LIB=/usr/local/opt/thrift/lib
+ export UTF8PROC_LIB=/usr/local/opt/utf8proc/lib
+ export SNAPPY_LIB=/usr/local/opt/snappy/lib
+ export ZSTD_LIB=/usr/local/opt/zstd/lib
+ export ZLIB_LIB=/usr/local/opt/zlib/lib

  make -j$(nproc)
  ```

Makefile

Lines changed: 48 additions & 55 deletions
@@ -13,10 +13,10 @@ HYBRIDBACKEND_WITH_ARROW ?= ON
  HYBRIDBACKEND_WITH_ARROW_ZEROCOPY ?= ON
  HYBRIDBACKEND_WITH_ARROW_HDFS ?= ON
  HYBRIDBACKEND_WITH_ARROW_S3 ?= ON
- HYBRIDBACKEND_WITH_SPARSEHASH ?= ON
  HYBRIDBACKEND_WITH_TENSORFLOW ?= ON
  HYBRIDBACKEND_WITH_TENSORFLOW_ESTIMATOR ?= ON
  HYBRIDBACKEND_WITH_TENSORFLOW_HALF ?= ON
+ HYBRIDBACKEND_WITH_TENSORFLOW_DISTRO ?= 1015
  HYBRIDBACKEND_WITH_BUILDINFO ?= ON
  HYBRIDBACKEND_USE_CXX11_ABI ?= 0
  HYBRIDBACKEND_DEBUG ?= OFF
@@ -55,15 +55,15 @@ endif

  ifeq ($(OS),Darwin)
  OSX_TARGET ?= $(shell sw_vers -productVersion)
- PYTHON_HOME ?= /usr/local
+ PYTHON_INCLUDE ?= /usr/local/include
+ PYTHON_LIB ?= /usr/local/lib
  PYTHON_IMPL ?=
  PYTHON_IMPL_FLAG ?=
  CFLAGS := $(CFLAGS) \
- -isystem $(PYTHON_HOME)/include/$(PYTHON_IMPL)$(PYTHON_IMPL_FLAG) \
+ -isystem $(PYTHON_INCLUDE)/$(PYTHON_IMPL)$(PYTHON_IMPL_FLAG) \
  -mmacosx-version-min=$(OSX_TARGET)

- LDFLAGS := $(LDFLAGS) \
- -L$(PYTHON_HOME)/lib -l$(PYTHON_IMPL)
+ LDFLAGS := $(LDFLAGS) -L$(PYTHON_LIB) -l$(PYTHON_IMPL)
  endif

  ifeq ($(HYBRIDBACKEND_WITH_BUILDINFO),ON)
@@ -89,13 +89,13 @@ endif

  ifeq ($(HYBRIDBACKEND_WITH_CUDA),ON)
  NVCC ?= nvcc
- CUDA_HOME ?= /usr/local
+ CUDA_INCLUDE ?= /usr/local/cuda/include
+ CUDA_LIB ?= /usr/local/cuda/lib64
  HYBRIDBACKEND_CUDA_GENCODE := $(shell \
  echo "$(HYBRIDBACKEND_WITH_CUDA_GENCODE)" | tr ' ' ',' \
  2>/dev/null)
  CFLAGS := $(CFLAGS) \
- -isystem $(CUDA_HOME) \
- -isystem $(CUDA_HOME)/cuda/include \
+ -isystem $(CUDA_INCLUDE) \
  -DHYBRIDBACKEND_CUDA=1 \
  -DHYBRIDBACKEND_CUDA_GENCODE="\"$(HYBRIDBACKEND_CUDA_GENCODE)\""
  NVCC_CFLAGS := --std=c++11 \
@@ -106,20 +106,20 @@ NVCC_CFLAGS := --std=c++11 \
  $(foreach cc, $(HYBRIDBACKEND_WITH_CUDA_GENCODE),\
  -gencode arch=compute_$(cc),code=sm_$(cc))
  LDFLAGS := $(LDFLAGS) \
- -L$(CUDA_HOME)/cuda/lib64 \
+ -L$(CUDA_LIB) \
  -lcudart
  ifeq ($(HYBRIDBACKEND_WITH_NVTX),ON)
  CFLAGS := $(CFLAGS) -DHYBRIDBACKEND_NVTX=1
  LDFLAGS := $(LDFLAGS) -lnvToolsExt
  endif
  ifeq ($(HYBRIDBACKEND_WITH_NCCL),ON)
- NCCL_HOME ?= /usr/local
+ NCCL_INCLUDE ?= /usr/local/nccl/include
+ NCCL_LIB ?= /usr/local/nccl/lib
  CFLAGS := $(CFLAGS) \
  -DHYBRIDBACKEND_NCCL=1 \
- -isystem $(NCCL_HOME)/include
+ -isystem $(NCCL_INCLUDE)
  LDFLAGS := $(LDFLAGS) \
- -L$(NCCL_HOME)/lib64 \
- -L$(NCCL_HOME)/lib \
+ -L$(NCCL_LIB) \
  -lnccl
  endif
  endif
@@ -138,20 +138,22 @@ D_FILES := $(shell \
  endif

  THIRDPARTY_DEPS :=
- SSL_HOME ?= /usr/local
+ SSL_LIB ?= /usr/lib/x86_64-linux-gnu
+
  COMMON_LDFLAGS := $(COMMON_LDFLAGS) \
  -Bsymbolic \
- -L$(SSL_HOME)/lib \
+ -L$(SSL_LIB) \
  -lssl \
  -lcrypto \
  -lcurl
  ifeq ($(HYBRIDBACKEND_WITH_ARROW),ON)
- ARROW_HOME ?= build/arrow/dist
- ARROW_API_H := $(ARROW_HOME)/include/arrow/api.h
+ ARROW_INCLUDE ?= build/arrow/dist/include
+ ARROW_LIB ?= build/arrow/dist/lib
+ ARROW_API_H := $(ARROW_INCLUDE)/arrow/api.h
  THIRDPARTY_DEPS := $(THIRDPARTY_DEPS) $(ARROW_API_H)
  CFLAGS := $(CFLAGS) \
  -DHYBRIDBACKEND_ARROW=1 \
- -isystem $(ARROW_HOME)/include
+ -isystem $(ARROW_INCLUDE)
  ifeq ($(HYBRIDBACKEND_WITH_ARROW_ZEROCOPY),ON)
  CFLAGS := $(CFLAGS) -DHYBRIDBACKEND_ARROW_ZEROCOPY=1
  endif
@@ -164,7 +166,7 @@ endif
  ifeq ($(OS),Darwin)
  COMMON_LDFLAGS := \
  $(COMMON_LDFLAGS) \
- -L$(ARROW_HOME)/lib \
+ -L$(ARROW_LIB) \
  -larrow \
  -larrow_dataset \
  -larrow_bundled_dependencies \
@@ -173,54 +175,43 @@ else
  COMMON_LDFLAGS := \
  $(COMMON_LDFLAGS) \
  -Wl,--whole-archive \
- -L$(ARROW_HOME)/lib \
+ -L$(ARROW_LIB) \
  -larrow \
  -larrow_dataset \
  -larrow_bundled_dependencies \
  -lparquet \
  -Wl,--no-whole-archive
  endif
- LZ4_HOME ?=
- ifneq ($(strip $(LZ4_HOME)),)
- COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(LZ4_HOME)/lib -llz4
+ RE2_LIB ?=
+ ifneq ($(strip $(RE2_LIB)),)
+ COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(RE2_LIB) -lre2
  endif
- RE2_HOME ?=
- ifneq ($(strip $(RE2_HOME)),)
- COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(RE2_HOME)/lib -lre2
+ THRIFT_LIB ?=
+ ifneq ($(strip $(THRIFT_LIB)),)
+ COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(THRIFT_LIB) -lthrift
  endif
- THRIFT_HOME ?=
- ifneq ($(strip $(THRIFT_HOME)),)
- COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(THRIFT_HOME)/lib -lthrift
+ UTF8PROC_LIB ?=
+ ifneq ($(strip $(UTF8PROC_LIB)),)
+ COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(UTF8PROC_LIB) -lutf8proc
  endif
- UTF8PROC_HOME ?=
- ifneq ($(strip $(UTF8PROC_HOME)),)
- COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(UTF8PROC_HOME)/lib -lutf8proc
+ LZ4_LIB ?=
+ ifneq ($(strip $(LZ4_LIB)),)
+ COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(LZ4_LIB) -llz4
  endif
- SNAPPY_HOME ?=
- ifneq ($(strip $(SNAPPY_HOME)),)
- COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(SNAPPY_HOME)/lib -lsnappy
+ SNAPPY_LIB ?=
+ ifneq ($(strip $(SNAPPY_LIB)),)
+ COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(SNAPPY_LIB) -lsnappy
  endif
- ZSTD_HOME ?=
- ifneq ($(strip $(ZSTD_HOME)),)
- COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(ZSTD_HOME)/lib -lzstd
+ ZSTD_LIB ?=
+ ifneq ($(strip $(ZSTD_LIB)),)
+ COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(ZSTD_LIB) -lzstd
  endif
- ZLIB_HOME ?=
- ifneq ($(strip $(ZLIB_HOME)),)
- COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(ZLIB_HOME)/lib -lz
+ ZLIB_LIB ?=
+ ifneq ($(strip $(ZLIB_LIB)),)
+ COMMON_LDFLAGS := $(COMMON_LDFLAGS) -L$(ZLIB_LIB) -lz
  endif
  endif

- ifeq ($(HYBRIDBACKEND_WITH_SPARSEHASH),ON)
- SPARSEHASH_HOME ?= build/sparsehash/dist
- SPARSEHASH_DENSE_HASH_MAP := $(SPARSEHASH_HOME)/include/sparsehash/dense_hash_map
- THIRDPARTY_DEPS := $(THIRDPARTY_DEPS) $(SPARSEHASH_DENSE_HASH_MAP)
- CFLAGS := $(CFLAGS) \
- -DHYBRIDBACKEND_SPARSEHASH=1 \
- -isystem ${SPARSEHASH_HOME}/include
- LDFLAGS := $(LDFLAGS) \
- -lpthread
- endif
-
  COMMON_LIB := $(LIBNAME)/lib$(LIBNAME).so
  -include $(LIBNAME)/common/Makefile
  CORE_DEPS := $(COMMON_LIB)
@@ -233,11 +224,13 @@ CFLAGS := $(CFLAGS) -DHYBRIDBACKEND_TENSORFLOW=1
  ifeq ($(HYBRIDBACKEND_WITH_TENSORFLOW_HALF),ON)
  CFLAGS := $(CFLAGS) -DHYBRIDBACKEND_TENSORFLOW_HALF=1
  endif
- TENSORFLOW_HOME ?=
- ifneq ($(strip $(TENSORFLOW_HOME)),)
+ CFLAGS := $(CFLAGS) -DHYBRIDBACKEND_TENSORFLOW_DISTRO=$(HYBRIDBACKEND_WITH_TENSORFLOW_DISTRO)
+
+ TENSORFLOW_INCLUDE ?=
+ ifneq ($(strip $(TENSORFLOW_INCLUDE)),)
  CFLAGS := $(CFLAGS) \
  -DHYBRIDBACKEND_TENSORFLOW_INTERNAL=1 \
- -isystem $(TENSORFLOW_HOME)
+ -isystem $(TENSORFLOW_INCLUDE)
  endif
  endif
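
Migration note (a sketch, not part of this commit): the Makefile hunks above replace each `*_HOME` prefix with separate `*_INCLUDE` and `*_LIB` paths. For build scripts that still think in terms of a single prefix, something like the following could derive the new variables. The helper name `split_home` is hypothetical, and it assumes the conventional `<prefix>/include` and `<prefix>/lib` layout, which may not hold for every installation. The old `CUDA_HOME` pointed at the parent of `cuda/`, so the prefix passed here is `/usr/local/cuda`.

```shell
# Hypothetical helper (not part of this commit): derive NAME_INCLUDE and
# NAME_LIB from a legacy prefix, assuming <prefix>/include and <prefix>/<libdir>.
split_home() {
  name="$1"            # variable stem, e.g. ARROW
  home="$2"            # legacy *_HOME style prefix
  libdir="${3:-lib}"   # library subdirectory, defaults to "lib"
  export "${name}_INCLUDE=${home}/include"
  export "${name}_LIB=${home}/${libdir}"
}

# Prefixes chosen so the results equal the new Makefile defaults in the diff.
split_home ARROW build/arrow/dist
split_home CUDA /usr/local/cuda lib64
split_home NCCL /usr/local/nccl

printf '%s\n' \
  "ARROW_INCLUDE=$ARROW_INCLUDE" "ARROW_LIB=$ARROW_LIB" \
  "CUDA_INCLUDE=$CUDA_INCLUDE" "CUDA_LIB=$CUDA_LIB" \
  "NCCL_INCLUDE=$NCCL_INCLUDE" "NCCL_LIB=$NCCL_LIB"
```

Because the Makefile assigns these variables with `?=`, values exported in the environment take precedence over the defaults, so a `make` run from the same shell should pick them up. Variables the Makefile never reads (there is no `SSL_INCLUDE`, for example) are simply ignored.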

README.md

Lines changed: 4 additions & 6 deletions
@@ -43,19 +43,17 @@ more information.

  | `{PACKAGE}` | Dependency | Python | CUDA | GLIBC | Data Opt. | Embedding Opt. | Parallelism Opt. |
  | ----------------------------------------------------------------------------------------- | ----------------------------------------------------------------------- | ------ | ---- | ------ | --------- | -------------- | ---------------- |
- | [hybridbackend-tf115-cu118](https://pypi.org/project/hybridbackend-tf115-cu118/) | [TensorFlow 1.15](https://github.com/NVIDIA/tensorflow) `1` | 3.8 | 11.8 | >=2.31 | &check; | &check; | &check; |
+ | [hybridbackend-tf115-cu121](https://pypi.org/project/hybridbackend-tf115-cu121/) | [TensorFlow 1.15](https://github.com/NVIDIA/tensorflow) | 3.8 | 12.1 | >=2.31 | &check; | &check; | &check; |
  | [hybridbackend-tf115-cu100](https://pypi.org/project/hybridbackend-tf115-cu100/) | [TensorFlow 1.15](https://github.com/tensorflow/tensorflow/tree/r1.15) | 3.6 | 10.0 | >=2.27 | &check; | &check; | &cross; |
  | [hybridbackend-tf115-cpu](https://pypi.org/project/hybridbackend-tf115-cpu/) | [TensorFlow 1.15](https://github.com/tensorflow/tensorflow/tree/r1.15) | 3.6 | - | >=2.24 | &check; | &cross; | &cross; |
- | [hybridbackend-deeprec2212-cu114](https://pypi.org/project/hybridbackend-deeprec2212-cu114/) | [DeepRec 22.12](https://github.com/alibaba/DeepRec/tree/deeprec2212) `2` | 3.6 | 11.4 | >=2.27 | &check; | &check; | &check; |
-
- > `1`: Suggested docker image: `nvcr.io/nvidia/tensorflow:23.02-tf1-py3`
-
- > `2`: Suggested docker image: `registry.cn-shanghai.aliyuncs.com/pai-dlc/tensorflow-training:deeprec2212-gpu-py36-cu114-ubuntu18.04`

  ### Method 2: Build from source

  See [Building Instructions](https://github.com/alibaba/HybridBackend/blob/main/BUILD.md).

+ We also provide built docker images for latest [DeepRec](https://github.com/alibaba/DeepRec):
+ `registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:1.0.0-deeprec-py3.6-cu114-ubuntu18.04`
+
  ## License

  HybridBackend is licensed under the [Apache 2.0 License](LICENSE).
