= Tips, Recipes, and Resources =

To gain access to various useful SparkContext functions, you need a reference to the context that encloses your session. Spark users conventionally call this reference <code>sc</code>; after you do

<syntaxhighlight lang="python">
spark = SparkSession.builder.getOrCreate()
</syntaxhighlight>

add a line like

<syntaxhighlight lang="python">
sc = spark.sparkContext
</syntaxhighlight>

and then you can use <code>sc</code> to access the functions described here: [http://spark.apache.org/docs/2.1.0/api/python/pyspark.html#pyspark.SparkContext].

==== To create an empty dataframe ====

One way to create an empty dataframe is to generate a schema as in the example script, and then pass the schema into <code>createDataFrame</code> with an empty RDD as the data:

<syntaxhighlight lang="python">
myAwesomeDataset = spark.createDataFrame(data=sc.emptyRDD(), schema=myGroovySchema)
</syntaxhighlight>

==== Pyspark string slicing seems to be non-pythonic ====

# Strings begin at 1, not 0: <code>[0:5]</code> will yield the same result as <code>[1:5]</code>.
# <code>[x:y]</code> means "give me y total characters, starting with the character in position x."

So, given a column in <code>articleDF</code> called <code>timestamp</code> with contents like <code>20150701000000</code>, you can access <code>2015</code> with

<syntaxhighlight lang="python">
articleDF.timestamp[1:4]
</syntaxhighlight>

and get <code>07</code> with

<syntaxhighlight lang="python">
articleDF.timestamp[5:2]
</syntaxhighlight>

==== When Reading In Multiple Files with Different Schemas ====

Make sure you re-instantiate your reader object, e.g.

<syntaxhighlight lang="python">
sparkReader = spark.read
</syntaxhighlight>

when changing to a new file. The reader may cache the schema of the previous file and fail to detect the new one. To make sure you have what you're expecting, try a

<syntaxhighlight lang="python">
if DEBUG:
    yourDataset.show()
</syntaxhighlight>

This gives you the same behavior as pandas' <code>print(yourDataset.head())</code>: 20 rows, nicely formatted on stdout. Note that calling <code>show()</code>, while it displays only 20 rows, still causes all steps of the job to execute and can take a long time.

==== Getting Help & Useful Links ====

Getting help with Spark in the usual fora, such as Stack Exchange or even a straight-up Google search, seems to be a less effective strategy for pyspark than it is for ordinary Python questions and errors. Specifying pyspark in your search terms helps in getting only Python answers, but debugging an error may require looking at Java documentation, and some online recipe blogs speak of writing code "in Spark", presumably meaning the interactive console language. The Apache Spark site at [https://spark.apache.org/ https://spark.apache.org/] is useful, but not all of the example code is available in Python.

==== Join help ====

There's some good info about joins here: [http://www.learnbymarketing.com/1100/pyspark-joins-by-example/ http://www.learnbymarketing.com/1100/pyspark-joins-by-example/]
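For quick reference, here is a minimal sketch of a pyspark join. The dataframes, column names, and values below are made up for illustration; see the link above for a fuller treatment of the join types.

<syntaxhighlight lang="python">
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Two tiny, made-up dataframes.
editsDF = spark.createDataFrame(
    [("alice", 10), ("bob", 3)], ["editor", "n_edits"])
groupsDF = spark.createDataFrame(
    [("alice", "admins")], ["editor", "group"])

# A left outer join keeps every row of editsDF; editors with no
# matching row in groupsDF get null in the group column.
joinedDF = editsDF.join(groupsDF, on="editor", how="left_outer")
joinedDF.show()
</syntaxhighlight>

Passing a column name (or list of names) as <code>on</code> gives you a single copy of the join column in the result; passing an explicit condition like <code>editsDF.editor == groupsDF.editor</code> keeps both columns.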
=== Java Errors and Responses ===

When I got

 spark java.io.IOException: No subfolder can be created in .

the University of Google told me it was a disk-space issue, but shutting down the cluster and restarting it solved the problem. Maybe it hadn't been shut down cleanly the last time someone used Spark.

=== Slurm kills my job! Encountering Memory Limits ===

In theory, Spark enables you to run computations on data of any size without memory limitations. In practice, memory management issues occur. We are trying to understand these issues and to learn how to write Spark scripts that don't overuse memory.

==== Things to try if you run out of memory ====

# The 'out of memory' may be ephemeral, or due to memory management issues in some layer other than your code: try starting up a new cluster and running the same job unchanged.
# Repartition your data: increasing the number of partitions should make it easier for the Spark scheduler to avoid exceeding its memory limits (a sketch appears at the bottom of this page).
# Increase the number of nodes. This can solve the problem essentially by giving Spark more RAM to work with.
# Be careful moving data out of distributed Spark objects into normal python objects. Work your way up from a small sample to a larger one.
# You might try tweaking the memory management options in <code>$SPARK_HOME/conf/spark-env.sh</code> and <code>$SPARK_HOME/conf/spark-defaults.conf</code>. Decreasing the number of executors and the total memory allocated to them should make Spark more resilient, at the cost of performance.

=== Errors While Starting Cluster ===

Sometimes I get errors like this:

 n0650: failed to launch: nice -n 0 /com/local/spark/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://n0649:18899

It usually seems to happen if I relinquish my spark cluster (whether I use the kill script or not) and then immediately start a new one. The error goes away if I shut down and wait a minute or two before re-launching; my assumption is that there's some cleanup work happening behind the scenes that the scheduler doesn't know about, and I need to let it finish.

And sometimes I get errors like this when trying to launch spark:

 scontrol: error: host list is empty

This means I'm in a session that doesn't know which nodes are assigned to the spark cluster, and it will launch a dysfunctional cluster. Run the stop-all script and then <code>start_spark_cluster</code> from the same session you landed in when you ran <code>get_spark_nodes</code>.

When I get errors about full logs, I go in and clean up the temporary-files folder they refer to.
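As a footnote to the memory tips above, here is a minimal sketch of the repartitioning and small-sample advice (items 2 and 4 of that list). It is not a recipe from the cluster documentation; the dataframe <code>wikiDF</code> and its columns are made up for illustration.

<syntaxhighlight lang="python">
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# A made-up dataframe standing in for your real data.
wikiDF = spark.createDataFrame(
    [(i, i % 7) for i in range(1000)], ["rev_id", "namespace"])

# Tip 2: spread the data over more partitions so that no single
# task has to hold too much of it in memory at once.
wikiDF = wikiDF.repartition(500)

# Tip 4: pull only a small sample back into ordinary python objects,
# then scale the fraction up once the small run works.
sampleRows = wikiDF.sample(False, 0.01, seed=1234).collect()
print(len(sampleRows))
</syntaxhighlight>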