Saturday, February 27, 2016

Installing a Spark Cluster on Windows

How to install a Spark cluster on Windows?

Installing a Spark cluster on Windows isn't the same as on Unix. In a Unix terminal you start a standalone master with the sbin/start-master.sh script.

But on Windows it's more complex. To get it right, without mistakes and without spending too much time, follow the steps below carefully, one at a time.


1. Open a Command Prompt (Win+R, then type cmd)

2. Go to the Spark installation directory

cd c:\spark-1.6.0

3. Start the master server by executing:

bin\spark-class org.apache.spark.deploy.master.Master

4. Note the HOST:PORT of the master URL:
On startup the master prints a URL of the form spark://HOST:PORT. This is the address that workers and applications use to connect to the master.

5. Start one or more worker processes and connect them to the master. The master keeps running in the first window, so open a new Command Prompt in the same directory for each worker:

bin\spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT

6. Connect an application to the cluster, again from a new Command Prompt:
bin\spark-shell --master spark://IP:PORT
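
Every later command depends on the master URL from step 4, so it can help to check its shape before passing it to a worker or to the shell. The following is only a minimal sketch in plain Scala, not part of Spark: parseMasterUrl is a hypothetical helper name, and the IP address and port are placeholder values.

```scala
import java.net.{URI, URISyntaxException}

// Hypothetical helper (not a Spark API): split a standalone master URL
// of the form spark://HOST:PORT into its host and port.
def parseMasterUrl(url: String): Option[(String, Int)] = {
  try {
    val uri = new URI(url)
    // getHost is null and getPort is -1 when that part is missing or malformed
    if (uri.getScheme == "spark" && uri.getHost != null && uri.getPort != -1)
      Some((uri.getHost, uri.getPort))
    else
      None
  } catch {
    case _: URISyntaxException => None
  }
}

parseMasterUrl("spark://192.168.1.10:7077")   // Some((192.168.1.10,7077))
parseMasterUrl("spark:://192.168.1.10:7077")  // None: doubled colon after the scheme
```

The helper can be pasted into any Scala REPL. A result of None usually points to a typing mistake in the URL, such as a missing port or an extra colon.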


Run a simple application and view the information

Type this code at the scala> prompt of the Spark shell and run it:
scala> val textFile = sc.textFile("c:\\spark-1.6.0\\") // path of the input to read

scala> textFile.count() // Number of items in this RDD

scala> textFile.first() // First item in this RDD

scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))

scala> textFile.filter(line => line.contains("Spark")).count() // How many lines contain "Spark"?
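
To see what each call returns without touching the cluster, the same operations can be tried on an ordinary Scala collection. This is a small sketch with made-up sample lines standing in for the file contents. The names total, firstLine and sparkCount are chosen here for illustration only.

```scala
// Made-up sample lines standing in for the contents of a text file
val lines = Seq(
  "# Apache Spark",
  "",
  "Spark is a fast and general cluster computing system.",
  "It provides high-level APIs in Scala, Java, and Python."
)

// Local counterparts of the RDD calls used in the shell session
val total = lines.size            // like textFile.count()
val firstLine = lines.head        // like textFile.first()
val linesWithSpark = lines.filter(line => line.contains("Spark"))
val sparkCount = linesWithSpark.size  // like the filtered count()

println(s"total=$total, first=$firstLine, withSpark=$sparkCount")
```

Inside spark-shell, sc.parallelize(lines) turns the same sample into an RDD, so count(), first() and filter() can be compared against these local values. One difference to keep in mind is that count() on an RDD returns a Long, while size on a Seq returns an Int.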

Video of the installation and the test application

We got the results shown in the screenshots below.

Spark Master:

Spark Jobs:

Spark Environment:

Spark Executors:



This topic isn't new for some readers, but many of us have never run Spark on Windows. So don't worry: practice on Windows carefully, step by step. Remember the commands and how to connect a Spark application to a cluster. The master web UI shows full information about processes, memory, environment, executors, etc.
Hope you have fun with Spark.



