Check the Java version; if Java is not installed, install the JDK
$ sudo apt-get -y update
$ sudo apt-get -y install openjdk-7-jdk
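The heading mentions checking for Java, but the commands above go straight to installing. A minimal sketch of the check follows; the helper name have_cmd is hypothetical and not part of the original post.

```shell
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    # Prints the installed Java version (java writes this to stderr).
    java -version
else
    echo "java not found; install the JDK with the apt-get commands above"
fi
```

If a Java version is printed, the apt-get install step can probably be skipped.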
Download pre-built Spark
$ wget http://d3kbcqa49mib13.cloudfront.net/spark-1.1.0-bin-hadoop1.tgz
$ tar -zxf spark-1.1.0-bin-hadoop1.tgz
Test Installation
$ cd spark-1.1.0-bin-hadoop1
$ ./bin/spark-shell
scala> val textFile = sc.textFile("README.md")
scala> textFile.count()
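textFile.count() returns the number of lines in README.md, so the result can be cross-checked from an ordinary shell in the same directory. The sketch below assumes README.md ends with a newline (otherwise wc may report one fewer line); count_lines is a hypothetical helper name.

```shell
# count_lines is a hypothetical helper: it prints the number of newline-terminated
# lines in the given file, with any padding from wc stripped.
count_lines() {
    wc -l < "$1" | tr -d ' '
}

# Run from the Spark directory; does nothing if README.md is absent.
if [ -f README.md ]; then
    count_lines README.md
fi
```

The printed number should match the value reported by the Spark shell.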
Installing Spark 2.0
sudo wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.0-bin-hadoop2.7.tgz
Unpack the tar file
sudo tar -xvzf spark-2.0.0-bin-hadoop2.7.tgz
Remove the tar file after it has been unpacked
sudo rm spark-2.0.0-bin-hadoop2.7.tgz
Change the ownership of the folder and its contents (this assumes a spark user and group already exist)
sudo chown -R spark:spark spark-2.0.0-bin-hadoop2.7
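The chown command fails if there is no spark account on the machine. A minimal sketch of a pre-check follows; user_exists is a hypothetical helper, and it only checks the user, not the group.

```shell
# user_exists is a hypothetical helper: it succeeds if the named user account exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

if user_exists spark; then
    echo "user spark exists; the chown command should work"
else
    echo "user spark not found; create it first or chown to an existing user"
fi
```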
Update system variables
Step into the Spark 2.0.0 directory and run pwd to get the full path
cd spark-2.0.0-bin-hadoop2.7
pwd
Update the system environment file by adding SPARK_HOME and appending SPARK_HOME/bin to the PATH
sudo vi /etc/environment
export SPARK_HOME=/usr/apache/spark-2.0.0-bin-hadoop2.7
At the end of PATH, add the following, separated from the existing entries by a colon
${SPARK_HOME}/bin
Reload the environment file in the current shell
source /etc/environment
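To confirm the change took effect, the current shell can be checked for SPARK_HOME and its bin directory on PATH. This is a minimal sketch; path_contains is a hypothetical helper, not something provided by Spark.

```shell
# path_contains is a hypothetical helper: it succeeds if directory $1 is one of
# the colon-separated entries in the PATH-style string $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -n "${SPARK_HOME:-}" ] && path_contains "$SPARK_HOME/bin" "$PATH"; then
    echo "SPARK_HOME/bin is on PATH"
else
    echo "SPARK_HOME/bin is not on PATH yet"
fi
```

If the second message appears, re-check the edits to /etc/environment, or log out and back in so that new sessions pick up the change.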