diff --git a/docs/declarative-pipelines-programming-guide.md b/docs/declarative-pipelines-programming-guide.md
index c5d18a7cb71be..3d7c23c315255 100644
--- a/docs/declarative-pipelines-programming-guide.md
+++ b/docs/declarative-pipelines-programming-guide.md
@@ -40,7 +40,7 @@ The key advantage of SDP is its declarative approach - you define what tables sh
 A quick way to install SDP is with pip:
 
 ```
-pip install pyspark[pipelines]
+pip install "pyspark[pipelines]"
 ```
 
 See the [downloads page](//spark.apache.org/downloads.html) for more installation options.
diff --git a/python/docs/source/getting_started/install.rst b/python/docs/source/getting_started/install.rst
index 1b122e197c494..fbf95b018ea58 100644
--- a/python/docs/source/getting_started/install.rst
+++ b/python/docs/source/getting_started/install.rst
@@ -47,11 +47,11 @@ If you want to install extra dependencies for a specific component, you can inst
 .. code-block:: bash
 
     # Spark SQL
-    pip install pyspark[sql]
+    pip install "pyspark[sql]"
     # pandas API on Spark
-    pip install pyspark[pandas_on_spark] plotly  # to plot your data, you can install plotly together.
+    pip install "pyspark[pandas_on_spark]" plotly  # to plot your data, you can install plotly together.
     # Spark Connect
-    pip install pyspark[connect]
+    pip install "pyspark[connect]"
 
 See :ref:`optional-dependencies` for more detail about extra dependencies.
diff --git a/python/docs/source/tutorial/sql/arrow_pandas.rst b/python/docs/source/tutorial/sql/arrow_pandas.rst
index 608307266f1fd..a0b724e9a1de9 100644
--- a/python/docs/source/tutorial/sql/arrow_pandas.rst
+++ b/python/docs/source/tutorial/sql/arrow_pandas.rst
@@ -34,7 +34,7 @@ Ensure PyArrow Installed
 To use Apache Arrow in PySpark, `the recommended version of PyArrow <https://arrow.apache.org/docs/python/install.html>`_
 should be installed.
 If you install PySpark using pip, then PyArrow can be brought in as an extra dependency of the
-SQL module with the command ``pip install pyspark[sql]``. Otherwise, you must ensure that PyArrow
+SQL module with the command ``pip install "pyspark[sql]"``. Otherwise, you must ensure that PyArrow
 is installed and available on all cluster nodes.
 You can install it using pip or conda from the conda-forge channel. See PyArrow
 `installation <https://arrow.apache.org/docs/python/install.html>`_ for details.
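
For context on why every ``pip install pyspark[...]`` in the docs gains quotes (my reading of the change, not stated in the diff itself): in zsh, square brackets are globbing characters, so the unquoted form can abort with ``zsh: no matches found: pyspark[sql]`` before pip ever runs. Quoting makes the extras spec a literal argument in every common POSIX-style shell, so it is the safer form to recommend. A minimal sketch of what the shell hands to pip:

```shell
# Unquoted in zsh, pyspark[sql] is a character-class glob ("pyspars",
# "pyspaq", or "pyspal"); with no matching file, zsh errors out:
#   zsh: no matches found: pyspark[sql]
# Quoting suppresses globbing, so pip receives the requirement verbatim.
req="pyspark[sql]"

# This is the literal argv entry pip would see with the quoted form:
printf '%s\n' "$req"
```

In shells like bash, an unquoted glob with no match is passed through unchanged by default, so the quotes are harmless there; they only matter in shells (zsh, or bash with `failglob`) that treat a non-matching glob as an error.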