ARROW-5958: [Python] Link zlib statically in the wheels
Author: Krisztián Szűcs <szucs.krisztian@gmail.com>
Closes #4886 from kszucs/wheel-static-zlib and squashes the following commits:
3cddfa30e <Krisztián Szűcs> remove unused get_so_version
98f0dbc8a <Krisztián Szűcs> update documentation
aae4fd7c4 <Krisztián Szűcs> static zlib in wheels
diff --git a/dev/tasks/python-wheels/osx-build.sh b/dev/tasks/python-wheels/osx-build.sh
index c1d001d..3dd3ccb 100755
--- a/dev/tasks/python-wheels/osx-build.sh
+++ b/dev/tasks/python-wheels/osx-build.sh
@@ -137,6 +137,7 @@
-DARROW_FLIGHT=ON \
-DgRPC_SOURCE=SYSTEM \
-Dc-ares_SOURCE=BUNDLED \
+ -Dzlib_SOURCE=BUNDLED \
-DARROW_PROTOBUF_USE_SHARED=OFF \
-DOPENSSL_USE_STATIC_LIBS=ON \
-DMAKE=make \
diff --git a/dev/tasks/python-wheels/travis.linux.yml b/dev/tasks/python-wheels/travis.linux.yml
index 858d24e..13121a4 100644
--- a/dev/tasks/python-wheels/travis.linux.yml
+++ b/dev/tasks/python-wheels/travis.linux.yml
@@ -61,7 +61,7 @@
# run auditwheel, it does always exit with 0 so it is mostly for debugging
# purposes
- - docker run -v `pwd`:/arrow quay.io/pypa/manylinux1_x86_64 /bin/bash -c
+ - docker run -v `pwd`:/arrow quay.io/pypa/{{ wheel_tag }}_x86_64 /bin/bash -c
"auditwheel show /arrow/python/{{ wheel_tag }}/dist/*.whl"
# test on multiple distributions
diff --git a/dev/tasks/python-wheels/win-build.bat b/dev/tasks/python-wheels/win-build.bat
index dbb4e47..14b5bb5 100644
--- a/dev/tasks/python-wheels/win-build.bat
+++ b/dev/tasks/python-wheels/win-build.bat
@@ -54,6 +54,7 @@
-DARROW_PARQUET=ON ^
-DARROW_GANDIVA=ON ^
-Duriparser_SOURCE=BUNDLED ^
+ -Dzlib_SOURCE=BUNDLED ^
.. || exit /B
cmake --build . --target install --config Release || exit /B
popd
diff --git a/docker-compose.yml b/docker-compose.yml
index ea00e72..3b9fbdd 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -439,7 +439,7 @@
# $ docker-compose pull python-manylinux1
# an then run:
# $ docker-compose run -e PYTHON_VERSION=3.7 python-manylinux1
- image: quay.io/ursa-labs/arrow_manylinux1_x86_64_base:latest
+ image: ursalab/arrow_manylinux1_x86_64_base:0.14.1-static-zlib
build:
context: python/manylinux1
dockerfile: Dockerfile-x86_64_base
@@ -461,7 +461,7 @@
# $ docker-compose pull python-manylinux2010
# an then run:
# $ docker-compose run -e PYTHON_VERSION=3.7 python-manylinux2010
- image: ursalab/arrow_manylinux2010_x86_64_base:latest
+ image: ursalab/arrow_manylinux2010_x86_64_base:0.14.1-static-zlib
build:
context: python/manylinux2010
dockerfile: Dockerfile-x86_64_base
diff --git a/python/CMakeLists.txt b/python/CMakeLists.txt
index 4432c2f..87d26d3 100644
--- a/python/CMakeLists.txt
+++ b/python/CMakeLists.txt
@@ -357,17 +357,9 @@
bundle_boost_lib(Boost_SYSTEM_LIBRARY)
endif()
- find_package(ZLIB)
if(MSVC)
bundle_arrow_implib(ARROW_SHARED_IMP_LIB)
bundle_arrow_implib(ARROW_PYTHON_SHARED_IMP_LIB)
- bundle_arrow_dependency(zlib zlib.dll)
- elseif(APPLE)
- # zlib is unavailable by default since mojave
- bundle_arrow_dependency(zlib "libz.${ZLIB_VERSION_MAJOR}.dylib")
- else()
- # neither manylinux1 nor manylinux2010 specification contain zlib
- bundle_arrow_dependency(zlib "libz.so.${ZLIB_VERSION_MAJOR}")
endif()
endif()
diff --git a/python/manylinux1/README.md b/python/manylinux1/README.md
index 948de20..731dbed 100644
--- a/python/manylinux1/README.md
+++ b/python/manylinux1/README.md
@@ -38,7 +38,7 @@
```bash
# Build the python packages
-docker run --env PYTHON_VERSION="2.7" --env UNICODE_WIDTH=16 --shm-size=2g --rm -t -i -v $PWD:/io -v $PWD/../../:/arrow quay.io/ursa-labs/arrow_manylinux1_x86_64_base:latest /io/build_arrow.sh
+docker-compose run -e PYTHON_VERSION="2.7" -e UNICODE_WIDTH=16 python-manylinux1
# Now the new packages are located in the dist/ folder
ls -l dist/
```
@@ -49,7 +49,7 @@
this image using
```bash
-docker build -t arrow_manylinux1_x86_64_base -f Dockerfile-x86_64_base .
+docker-compose build python-manylinux1
```
For each dependency, we have a bash script in the directory `scripts/` that
@@ -59,29 +59,46 @@
this image, you need to change the name of the docker image in the `docker run`
command.
+### Publishing a new build image
+
+If you have write access to the Docker Hub Ursa Labs account, you can directly
+publish a build image that you built locally.
+
+```bash
+$ docker-compose push python-manylinux1
+```
+
### Using quay.io to trigger and build the docker image
-1. Make the change in the build scripts (eg. to modify the boost build, update `scripts/boost.sh`).
+The images used by the docker-compose setup can be freely changed; currently
+the images are hosted on Docker Hub.
+
+1. Make the change in the build scripts (eg. to modify the boost build, update
+ `scripts/boost.sh`).
2. Setup an account on quay.io and link to your GitHub account
3. In quay.io, Add a new repository using :
1. Link to GitHub repository push
- 2. Trigger build on changes to a specific branch (eg. myquay) of the repo (eg. `pravindra/arrow`)
+ 2. Trigger build on changes to a specific branch (eg. myquay) of the repo
+ (eg. `pravindra/arrow`)
3. Set Dockerfile location to `/python/manylinux1/Dockerfile-x86_64_base`
4. Set Context location to `/python/manylinux1`
4. Push change (in step 1) to the branch specified in step 3.ii
- * This should trigger a build in quay.io, the build takes about 2 hrs to finish.
+ * This should trigger a build in quay.io, the build takes about 2 hrs to
+ finish.
-5. Add a tag `latest` to the build after step 4 finishes, save the build ID (eg. `quay.io/pravindra/arrow_manylinux1_x86_64_base:latest`)
+5. Add a tag `latest` to the build after step 4 finishes, save the build ID
+ (eg. `quay.io/pravindra/arrow_manylinux1_x86_64_base:latest`)
6. In your arrow PR,
* include the change from 1.
- * modify `travis_script_manylinux.sh` to switch to the location from step 5 for the docker image.
+ * modify the docker-compose.yml's python-manylinux1 entry to switch to
+ the location from step 5 for the docker image.
## TensorFlow compatible wheels for Arrow
diff --git a/python/manylinux1/build_arrow.sh b/python/manylinux1/build_arrow.sh
index ca5ab7c..9128b73 100755
--- a/python/manylinux1/build_arrow.sh
+++ b/python/manylinux1/build_arrow.sh
@@ -79,7 +79,6 @@
pushd "${ARROW_BUILD_DIR}"
cmake -DCMAKE_BUILD_TYPE=Release \
-DARROW_DEPENDENCY_SOURCE="SYSTEM" \
- -DZLIB_ROOT=/usr/local \
-DCMAKE_INSTALL_PREFIX=/arrow-dist \
-DCMAKE_INSTALL_LIBDIR=lib \
-DARROW_BUILD_TESTS=OFF \
@@ -103,6 +102,7 @@
-DOPENSSL_USE_STATIC_LIBS=ON \
-DORC_SOURCE=BUNDLED \
-GNinja /arrow/cpp
+ninja
ninja install
popd
diff --git a/python/manylinux1/scripts/build_zlib.sh b/python/manylinux1/scripts/build_zlib.sh
index 272b6c4..71968c1 100755
--- a/python/manylinux1/scripts/build_zlib.sh
+++ b/python/manylinux1/scripts/build_zlib.sh
@@ -19,7 +19,7 @@
curl -sL https://zlib.net/zlib-1.2.11.tar.gz -o /zlib-1.2.11.tar.gz
tar xf zlib-1.2.11.tar.gz
pushd zlib-1.2.11
-./configure
+CFLAGS=-fPIC ./configure --static
make -j8
make install
popd
diff --git a/python/manylinux2010/README.md b/python/manylinux2010/README.md
index c5acb27..fe2888e 100644
--- a/python/manylinux2010/README.md
+++ b/python/manylinux2010/README.md
@@ -42,7 +42,7 @@
```bash
# Build the python packages
-docker run --env PYTHON_VERSION="2.7" --env UNICODE_WIDTH=16 --shm-size=2g --rm -t -i -v $PWD:/io -v $PWD/../../:/arrow ursalab/arrow_manylinux2010_x86_64_base:latest /io/build_arrow.sh
+docker-compose run -e PYTHON_VERSION="2.7" -e UNICODE_WIDTH=16 python-manylinux2010
# Now the new packages are located in the dist/ folder
ls -l dist/
```
@@ -55,7 +55,7 @@
scripts stored under the `scripts` directory.
```bash
-docker build -t arrow_manylinux2010_x86_64_base -f Dockerfile-x86_64_base .
+docker-compose build python-manylinux2010
```
For each dependency, a bash script in the `scripts/` directory downloads the
@@ -69,8 +69,8 @@
publish a build image that you built locally.
```bash
-$ docker push ursalab/arrow_manylinux2010_x86_64_base
+$ docker-compose push python-manylinux2010
The push refers to repository [ursalab/arrow_manylinux2010_x86_64_base]
a1ab88d27acc: Pushing [==============> ] 492.5MB/1.645GB
[... etc. ...]
-```
\ No newline at end of file
+```
diff --git a/python/manylinux2010/scripts/build_zlib.sh b/python/manylinux2010/scripts/build_zlib.sh
index 272b6c4..71968c1 100755
--- a/python/manylinux2010/scripts/build_zlib.sh
+++ b/python/manylinux2010/scripts/build_zlib.sh
@@ -19,7 +19,7 @@
curl -sL https://zlib.net/zlib-1.2.11.tar.gz -o /zlib-1.2.11.tar.gz
tar xf zlib-1.2.11.tar.gz
pushd zlib-1.2.11
-./configure
+CFLAGS=-fPIC ./configure --static
make -j8
make install
popd
diff --git a/python/pyarrow/__init__.py b/python/pyarrow/__init__.py
index f3fd738..cea55cc 100644
--- a/python/pyarrow/__init__.py
+++ b/python/pyarrow/__init__.py
@@ -234,18 +234,6 @@
return out.rstrip().decode('utf8')
-def get_so_version():
- """
- Return the SO version for Arrow libraries.
- """
- if _sys.platform == 'win32':
- raise NotImplementedError("Cannot get SO version on Windows")
- if _has_pkg_config("arrow"):
- return _read_pkg_config_variable("arrow", ["--variable=so_version"])
- else:
- return "100" # XXX Find a way not to hardcode this?
-
-
def get_libraries():
"""
Return list of library names to include in the `libraries` argument for C
diff --git a/python/setup.py b/python/setup.py
index 18d0a56..ab95b29 100755
--- a/python/setup.py
+++ b/python/setup.py
@@ -384,10 +384,6 @@
"{}_regex".format(self.boost_namespace),
implib_required=False)
if sys.platform == 'win32':
- # zlib uses zlib.dll for Windows
- zlib_lib_name = 'zlib'
- move_shared_libs(build_prefix, build_lib, zlib_lib_name,
- implib_required=False)
if self.with_flight:
# DLL dependencies for gRPC / Flight
for lib_name in ['cares', 'libprotobuf',
@@ -395,10 +391,6 @@
'libssl-1_1-x64']:
move_shared_libs(build_prefix, build_lib, lib_name,
implib_required=False)
- else:
- zlib_lib_name = 'z'
- move_shared_libs(build_prefix, build_lib, zlib_lib_name,
- implib_required=False)
if self.with_plasma:
# Move the plasma store