Saturday, July 16, 2016

Steps to Install Spark

Check the Java version; install the JDK if it is missing

$ sudo apt-get -y update
$ sudo apt-get -y install openjdk-7-jdk
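The check-then-install step above can be sketched as a single guard (a sketch assuming a Debian/Ubuntu host with apt-get; openjdk-7-jdk matches the era of this post):

```shell
# Install OpenJDK 7 only when no java binary is already on the PATH.
if ! command -v java >/dev/null 2>&1; then
    sudo apt-get -y update
    sudo apt-get -y install openjdk-7-jdk
fi
# Confirm which version will be picked up.
java -version
```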
Download pre-built spark
$ wget http://d3kbcqa49mib13.cloudfront.net/spark-1.1.0-bin-hadoop1.tgz
$ tar -zxf spark-1.1.0-bin-hadoop1.tgz
Test Installation

$ cd spark-1.1.0-bin-hadoop1
$ ./bin/spark-shell
scala> val textFile = sc.textFile("README.md")
scala> textFile.count()
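The same smoke test can also be run non-interactively, which is handy for scripting; this sketch assumes the unpacked directory from the step above and pipes a single expression into the REPL:

```shell
# Run one Scala expression in spark-shell without an interactive session.
cd spark-1.1.0-bin-hadoop1
echo 'println(sc.textFile("README.md").count())' | ./bin/spark-shell
```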

2 comments:

  2. Installing Spark 2.0

    sudo wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.0-bin-hadoop2.7.tgz

    Unpack the tar file
    sudo tar -xvzf spark-2.0.0-bin-hadoop2.7.tgz

    Remove the tar file after it has been unpacked
    sudo rm spark-2.0.0-bin-hadoop2.7.tgz

    Change the ownership of the folder and its contents
    sudo chown -R spark:spark spark-2.0.0-bin-hadoop2.7
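The chown above assumes a dedicated spark user and group already exist; if they do not, one way to create them first (a sketch with a hypothetical account layout, adjust to taste):

```shell
# Create a "spark" system user (with a matching group) if it is missing,
# then hand the unpacked directory over to it.
id spark >/dev/null 2>&1 || sudo useradd --system --create-home spark
sudo chown -R spark:spark spark-2.0.0-bin-hadoop2.7
```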

    Update system variables

    Step into the spark-2.0.0 directory and run pwd to get its full path
    cd spark-2.0.0-bin-hadoop2.7
    pwd


    Update the system environment file by adding SPARK_HOME and appending SPARK_HOME/bin to the PATH
    sudo vi /etc/environment
    export SPARK_HOME=/usr/apache/spark-2.0.0-bin-hadoop2.7

    At the end of PATH add
    ${SPARK_HOME}/bin
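Put together, the file might end up looking something like this (a sketch; the install path follows the commands above, and the shell-style export lines take effect here because the file is later sourced):

```shell
# /etc/environment (sketch)
export SPARK_HOME=/usr/apache/spark-2.0.0-bin-hadoop2.7
export PATH="${PATH}:${SPARK_HOME}/bin"
```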

    Refresh the environment variables in the current shell
    source /etc/environment
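A quick way to confirm the variables took effect (spark-submit ships in $SPARK_HOME/bin, so it should now resolve from the PATH):

```shell
# Verify the environment now points at the Spark install.
echo "SPARK_HOME=${SPARK_HOME}"
command -v spark-submit
spark-submit --version
```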
