Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions dev/make-distribution.sh
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,8 @@ cp -r "$SPARK_HOME/data" "$DISTDIR"
if [ "$MAKE_PIP" == "true" ]; then
echo "Building python distribution package"
pushd "$SPARK_HOME/python" > /dev/null
# Delete the egg info file if it exists, this can cache older setup files.
rm -rf pyspark.egg-info || echo "No existing egg info file, skipping deletion"
python setup.py sdist
popd > /dev/null
else
Expand Down
2 changes: 2 additions & 0 deletions dev/pip-sanity-check.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@
from __future__ import print_function

from pyspark.sql import SparkSession
from pyspark.ml.param import Params
from pyspark.mllib.linalg import *
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to import pyspark.ml.linalg or both?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just checks one sub component from each, we could import each with rename I suppose but not sure it would do much?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. i think this should be enough.

import sys

if __name__ == "__main__":
Expand Down
1 change: 1 addition & 0 deletions dev/requirements.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
jira==1.0.3
PyGithub==1.26.0
Unidecode==0.04.19
pypandoc==1.3.3
7 changes: 5 additions & 2 deletions dev/run-pip-tests
Original file line number Diff line number Diff line change
Expand Up @@ -78,11 +78,14 @@ for python in "${PYTHON_EXECS[@]}"; do
mkdir -p "$VIRTUALENV_PATH"
virtualenv --python=$python "$VIRTUALENV_PATH"
source "$VIRTUALENV_PATH"/bin/activate
# Upgrade pip
pip install --upgrade pip
# Upgrade pip & friends
pip install --upgrade pip pypandoc wheel
pip install numpy # Needed so we can verify mllib imports

echo "Creating pip installable source dist"
cd "$FWDIR"/python
# Delete the egg info file if it exists, this can cache the setup file.
rm -rf pyspark.egg-info || echo "No existing egg info file, skipping deletion"
$python setup.py sdist


Expand Down
5 changes: 5 additions & 0 deletions python/setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -162,7 +162,12 @@ def _supports_symlinks():
url='https://github.com/apache/spark/tree/master/python',
packages=['pyspark',
'pyspark.mllib',
'pyspark.mllib.linalg',
'pyspark.mllib.stat',
'pyspark.ml',
'pyspark.ml.linalg',
'pyspark.ml.param',
'pyspark.ml.stat',
'pyspark.sql',
'pyspark.streaming',
'pyspark.bin',
Expand Down