From c34d18dc7306660f753bf4fb86337fe642c4f788 Mon Sep 17 00:00:00 2001
From: Sudhanva Huruli
Date: Mon, 23 Mar 2026 09:01:23 -0700
Subject: [PATCH 1/2] Add quotes to fix erroneous pip install command

---
 docs/declarative-pipelines-programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/declarative-pipelines-programming-guide.md b/docs/declarative-pipelines-programming-guide.md
index c5d18a7cb71be..3d7c23c315255 100644
--- a/docs/declarative-pipelines-programming-guide.md
+++ b/docs/declarative-pipelines-programming-guide.md
@@ -40,7 +40,7 @@ The key advantage of SDP is its declarative approach - you define what tables sh
 A quick way to install SDP is with pip:
 
 ```
-pip install pyspark[pipelines]
+pip install "pyspark[pipelines]"
 ```
 
 See the [downloads page](//spark.apache.org/downloads.html) for more installation options.

From 16115214f1cad74d95d4b41b698da56dbbb43a16 Mon Sep 17 00:00:00 2001
From: Sudhanva Huruli
Date: Mon, 23 Mar 2026 20:19:53 -0700
Subject: [PATCH 2/2] Update other instances which could use quotes around
 square brackets

---
 python/docs/source/getting_started/install.rst   | 6 +++---
 python/docs/source/tutorial/sql/arrow_pandas.rst | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/python/docs/source/getting_started/install.rst b/python/docs/source/getting_started/install.rst
index 1b122e197c494..fbf95b018ea58 100644
--- a/python/docs/source/getting_started/install.rst
+++ b/python/docs/source/getting_started/install.rst
@@ -47,11 +47,11 @@ If you want to install extra dependencies for a specific component, you can inst
 .. code-block:: bash
 
     # Spark SQL
-    pip install pyspark[sql]
+    pip install "pyspark[sql]"
     # pandas API on Spark
-    pip install pyspark[pandas_on_spark] plotly  # to plot your data, you can install plotly together.
+    pip install "pyspark[pandas_on_spark]" plotly  # to plot your data, you can install plotly together.
     # Spark Connect
-    pip install pyspark[connect]
+    pip install "pyspark[connect]"
 
 See :ref:`optional-dependencies` for more detail about extra dependencies.

diff --git a/python/docs/source/tutorial/sql/arrow_pandas.rst b/python/docs/source/tutorial/sql/arrow_pandas.rst
index 608307266f1fd..a0b724e9a1de9 100644
--- a/python/docs/source/tutorial/sql/arrow_pandas.rst
+++ b/python/docs/source/tutorial/sql/arrow_pandas.rst
@@ -34,7 +34,7 @@ Ensure PyArrow Installed
 To use Apache Arrow in PySpark, `the recommended version of PyArrow `_
 should be installed.
 If you install PySpark using pip, then PyArrow can be brought in as an extra dependency of the
-SQL module with the command ``pip install pyspark[sql]``. Otherwise, you must ensure that PyArrow
+SQL module with the command ``pip install "pyspark[sql]"``. Otherwise, you must ensure that PyArrow
 is installed and available on all cluster nodes.
 You can install it using pip or conda from the conda-forge channel. See PyArrow
 `installation `_ for details.
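
The quoting added by these patches matters because some shells, notably zsh, treat unquoted square brackets as glob patterns: if no file named `pyspark[sql]` exists, zsh aborts with `zsh: no matches found` before pip even runs, while bash happens to pass the token through unchanged. A minimal sketch of the behavior the quotes guard against (using `echo` rather than `pip` so it is safe to run anywhere):

```shell
# Quoted: the brackets reach the command as literal characters in every shell.
echo "pyspark[sql]"      # -> pyspark[sql]

# Unquoted, zsh would try to expand pyspark[sql] as a glob and fail if
# nothing matches. Escaping each bracket is an equivalent workaround:
echo pyspark\[sql\]      # -> pyspark[sql]
```

Either form keeps the extras specifier intact for pip, which is why the docs standardize on the double-quoted spelling.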