[SPARK-52483][INFRA] Upgrade to Python 3.11 in doc image
### What changes were proposed in this pull request?
Upgrade to Python 3.11 in the doc image.

### Why are the changes needed?
Python 3.9 is reaching its EOL soon.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #51150 from zhengruifeng/infra_doc_311.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
Commit 243af2f (1 parent: 057d9fa)

5 files changed (+45, −12 lines)

**`.github/workflows/build_and_test.yml`** (31 additions, 2 deletions)

```diff
@@ -1000,8 +1000,12 @@ jobs:
           python3.9 -m pip install ipython_genutils # See SPARK-38517
           python3.9 -m pip install sphinx_plotly_directive 'numpy>=1.20.0' pyarrow pandas 'plotly<6.0.0'
           python3.9 -m pip install 'docutils<0.18.0' # See SPARK-39421
-    - name: List Python packages
+    - name: List Python packages for branch-3.5 and branch-4.0
+      if: inputs.branch == 'branch-3.5' || inputs.branch == 'branch-4.0'
       run: python3.9 -m pip list
+    - name: List Python packages
+      if: inputs.branch != 'branch-3.5' && inputs.branch != 'branch-4.0'
+      run: python3.11 -m pip list
     - name: Install dependencies for documentation generation
       run: |
         # Keep the version of Bundler here in sync with the following locations:
@@ -1010,7 +1014,8 @@ jobs:
         gem install bundler -v 2.4.22
         cd docs
         bundle install --retry=100
-    - name: Run documentation build
+    - name: Run documentation build for branch-3.5 and branch-4.0
+      if: inputs.branch == 'branch-3.5' || inputs.branch == 'branch-4.0'
       run: |
         # We need this link to make sure `python3` points to `python3.9` which contains the prerequisite packages.
         ln -s "$(which python3.9)" "/usr/local/bin/python3"
@@ -1031,6 +1036,30 @@ jobs:
         echo "SKIP_SQLDOC: $SKIP_SQLDOC"
         cd docs
         bundle exec jekyll build
+    - name: Run documentation build
+      if: inputs.branch != 'branch-3.5' && inputs.branch != 'branch-4.0'
+      run: |
+        # We need this link to make sure `python3` points to `python3.11` which contains the prerequisite packages.
+        ln -s "$(which python3.11)" "/usr/local/bin/python3"
+        # Build docs first with SKIP_API to ensure they are buildable without requiring any
+        # language docs to be built beforehand.
+        cd docs; SKIP_ERRORDOC=1 SKIP_API=1 bundle exec jekyll build; cd ..
+        if [ -f "./dev/is-changed.py" ]; then
+          # Skip PySpark and SparkR docs while keeping Scala/Java/SQL docs
+          pyspark_modules=`cd dev && python3.11 -c "import sparktestsupport.modules as m; print(','.join(m.name for m in m.all_modules if m.name.startswith('pyspark')))"`
+          if [ `./dev/is-changed.py -m $pyspark_modules` = false ]; then export SKIP_PYTHONDOC=1; fi
+          if [ `./dev/is-changed.py -m sparkr` = false ]; then export SKIP_RDOC=1; fi
+        fi
+        export PYSPARK_DRIVER_PYTHON=python3.11
+        export PYSPARK_PYTHON=python3.11
+        # Print the values of environment variables `SKIP_ERRORDOC`, `SKIP_SCALADOC`, `SKIP_PYTHONDOC`, `SKIP_RDOC` and `SKIP_SQLDOC`
+        echo "SKIP_ERRORDOC: $SKIP_ERRORDOC"
+        echo "SKIP_SCALADOC: $SKIP_SCALADOC"
+        echo "SKIP_PYTHONDOC: $SKIP_PYTHONDOC"
+        echo "SKIP_RDOC: $SKIP_RDOC"
+        echo "SKIP_SQLDOC: $SKIP_SQLDOC"
+        cd docs
+        bundle exec jekyll build
     - name: Tar documentation
       if: github.repository != 'apache/spark'
       run: tar cjf site.tar.bz2 docs/_site
```
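The documentation build step skips the PySpark and SparkR docs when their modules are unchanged, by exporting `SKIP_PYTHONDOC`/`SKIP_RDOC` based on the output of `./dev/is-changed.py`. A minimal Python sketch of that gating logic (the function name `doc_skip_env` and the `changed_modules` set are hypothetical stand-ins for the script invocations):

```python
# Sketch of the doc-skip gating in the workflow's build step.
# `changed_modules` stands in for what `./dev/is-changed.py` reports.
def doc_skip_env(all_modules, changed_modules):
    """Return the SKIP_* environment variables the workflow would export."""
    pyspark_modules = [m for m in all_modules if m.startswith("pyspark")]
    env = {}
    # Skip PySpark docs when no pyspark* module changed
    if not any(m in changed_modules for m in pyspark_modules):
        env["SKIP_PYTHONDOC"] = "1"
    # Skip SparkR docs when sparkr did not change
    if "sparkr" not in changed_modules:
        env["SKIP_RDOC"] = "1"
    return env

print(doc_skip_env(["pyspark-core", "sparkr", "sql"], {"sql"}))
```

Scala/Java/SQL docs are never skipped by this logic, matching the comment in the diff.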

**`dev/run-tests.py`** (1 addition, 1 deletion)

```diff
@@ -427,7 +427,7 @@ def parse_opts():
     parser.add_argument(
         "--python-executables",
         type=str,
-        default="python3.9",
+        default="python3.11",
         help="A comma-separated list of Python executables to test against (default: %(default)s)",
     )
     parser.add_argument(
```
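Since `--python-executables` is declared as a plain string, the new default flows through the same comma-splitting as any user-supplied value. A self-contained sketch of the option as it appears in the diff:

```python
import argparse

# Reconstruction of the option from the diff above; only the surrounding
# parser setup is assumed.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--python-executables",
    type=str,
    default="python3.11",
    help="A comma-separated list of Python executables to test against (default: %(default)s)",
)

opts = parser.parse_args([])
execs = opts.python_executables.split(",")
print(execs)  # ['python3.11']
```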

**`dev/spark-test-image/docs/Dockerfile`** (10 additions, 6 deletions)

```diff
@@ -24,7 +24,7 @@ LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image for Documentat
 # Overwrite this label to avoid exposing the underlying Ubuntu OS version label
 LABEL org.opencontainers.image.version=""

-ENV FULL_REFRESH_DATE=20241029
+ENV FULL_REFRESH_DATE=20250616

 ENV DEBIAN_FRONTEND=noninteractive
 ENV DEBCONF_NONINTERACTIVE_SEEN=true
@@ -56,6 +56,7 @@ RUN apt-get update && apt-get install -y \
     pandoc \
     pkg-config \
     qpdf \
+    tzdata \
    r-base \
    ruby \
    ruby-dev \
@@ -74,18 +75,21 @@ RUN Rscript -e "install.packages(c('devtools', 'knitr', 'markdown', 'rmarkdown',
 # See more in SPARK-39735
 ENV R_LIBS_SITE="/usr/local/lib/R/site-library:${R_LIBS_SITE}:/usr/lib/R/library"

-# Install Python 3.9
+# Install Python 3.11
 RUN add-apt-repository ppa:deadsnakes/ppa
-RUN apt-get update && apt-get install -y python3.9 python3.9-distutils \
+RUN apt-get update && apt-get install -y \
+    python3.11 \
+    && apt-get autoremove --purge -y \
+    && apt-get clean \
     && rm -rf /var/lib/apt/lists/*
-RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.9
+RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.11

 # Should unpin 'sphinxcontrib-*' after upgrading sphinx>5
 # See 'ipython_genutils' in SPARK-38517
 # See 'docutils<0.18.0' in SPARK-39421
-RUN python3.9 -m pip install 'sphinx==4.5.0' mkdocs 'pydata_sphinx_theme>=0.13' sphinx-copybutton nbsphinx numpydoc jinja2 markupsafe 'pyzmq<24.0.0' \
+RUN python3.11 -m pip install 'sphinx==4.5.0' mkdocs 'pydata_sphinx_theme>=0.13' sphinx-copybutton nbsphinx numpydoc jinja2 markupsafe 'pyzmq<24.0.0' \
     ipython ipython_genutils sphinx_plotly_directive 'numpy>=1.20.0' pyarrow pandas 'plotly>=4.8' 'docutils<0.18.0' \
     'flake8==3.9.0' 'mypy==1.8.0' 'pytest==7.1.3' 'pytest-mypy-plugins==1.9.3' 'black==23.12.1' \
     'pandas-stubs==1.2.0.53' 'grpcio==1.67.0' 'grpcio-status==1.67.0' 'protobuf==5.29.1' 'grpc-stubs==1.24.11' 'googleapis-common-protos-stubs==2.2.0' \
     'sphinxcontrib-applehelp==1.0.4' 'sphinxcontrib-devhelp==1.0.2' 'sphinxcontrib-htmlhelp==2.0.1' 'sphinxcontrib-qthelp==1.0.3' 'sphinxcontrib-serializinghtml==1.1.5' \
-    && python3.9 -m pip cache purge
+    && python3.11 -m pip cache purge
```

**`python/pyspark/pandas/supported_api_gen.py`** (1 addition, 1 deletion)

```diff
@@ -38,7 +38,7 @@
 MAX_MISSING_PARAMS_SIZE = 5
 COMMON_PARAMETER_SET = {"kwargs", "args", "cls"}
 MODULE_GROUP_MATCH = [(pd, ps), (pdw, psw), (pdg, psg)]
-PANDAS_LATEST_VERSION = "2.2.3"
+PANDAS_LATEST_VERSION = "2.3.0"

 RST_HEADER = """
 =====================
```
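The bump from `"2.2.3"` to `"2.3.0"` is a version-string change, and version strings must be compared numerically, not lexically. A minimal stdlib-only sketch (the `vtuple` helper is illustrative, not Spark's code):

```python
# Versions compare correctly as integer tuples; plain string comparison
# would order "2.10.0" before "2.3.0".
def vtuple(v):
    return tuple(int(p) for p in v.split("."))

assert vtuple("2.3.0") > vtuple("2.2.3")
assert vtuple("2.10.0") > vtuple("2.3.0")  # string compare gets this wrong
print(vtuple("2.3.0"))
```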

**`python/run-tests.py`** (2 additions, 2 deletions)

```diff
@@ -212,9 +212,9 @@ def run_individual_python_test(target_dir, test_name, pyspark_python, keep_test_


 def get_default_python_executables():
-    python_execs = [x for x in ["python3.9", "pypy3"] if which(x)]
+    python_execs = [x for x in ["python3.11", "pypy3"] if which(x)]

-    if "python3.9" not in python_execs:
+    if "python3.11" not in python_execs:
         p = which("python3")
         if not p:
             LOGGER.error("No python3 executable found. Exiting!")
```
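`get_default_python_executables` prefers the versioned interpreter and falls back to whatever plain `python3` resolves to. A simplified, self-contained sketch of that fallback (a hypothetical stand-in, with `which` injectable so the behavior can be exercised without real interpreters on `PATH`):

```python
import shutil

def default_python_execs(preferred="python3.11", which=shutil.which):
    """Simplified stand-in for get_default_python_executables():
    prefer `preferred` (plus pypy3), else fall back to `python3`."""
    execs = [x for x in [preferred, "pypy3"] if which(x)]
    if preferred not in execs:
        p = which("python3")
        if p:
            execs.insert(0, p)
    return execs

# Injecting a fake `which` shows the fallback path:
fake = {"python3": "/usr/bin/python3"}.get
print(default_python_execs(which=fake))  # ['/usr/bin/python3']
```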
