To give an example of progress in society, I will show you how to compute pi using distributed computing with Spark and Raspberry Pis on your own little local computing cluster.
Install Raspbian on your Raspberry Pis.
Install Java on your Raspberry Pi thus:
sudo apt-get update && sudo apt-get install oracle-java7-jdk
Install ssh on your Raspberry Pi thus:
sudo apt-get install ssh
Fetch Apache Spark to each of your Raspberry Pis:
Also install Spark on your master machine, in my case my MacBook Pro. The current version of Spark (1.0.1) wants Spark to be installed in the same folder on every machine, so I put it in /usr/local/spark. To be precise, I copied the unpacked Spark folder to /usr/local and then made a symbolic link to it called “spark”.
This is done on all workers (Raspberry Pis) and on the master machine (MacBook Pro):
sudo mv spark-1.0.1-bin-hadoop2 /usr/local/
sudo ln -s /usr/local/spark-1.0.1-bin-hadoop2 /usr/local/spark
I then tell the master machine which IP address to export so that the workers can connect to it. This is done by:
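The exact command is not shown here. A minimal sketch, assuming the standalone launch scripts of Spark 1.0.x, which read the master's address from the SPARK_MASTER_IP variable in conf/spark-env.sh on the master. The address 10.0.1.10 is taken from the spark:// URL used in the commands further down; replace it with your own master's IP.

```shell
# Contents of conf/spark-env.sh on the master (sketch, not the author's exact file).
# Assumption: 10.0.1.10 is the master's address on the local network.
export SPARK_MASTER_IP=10.0.1.10
```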
I can now run the “Master Start” script on the master (MacBook Pro):
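The command itself is missing here. As a hedged sketch: Spark 1.0.x ships a start-master.sh script in its sbin directory, so with the /usr/local/spark layout described earlier the step might look like this. The SPARK_HOME default and the existence check are my additions for illustration.

```shell
# Assumption: Spark is installed at /usr/local/spark, as set up earlier; override SPARK_HOME if not.
SPARK_HOME="${SPARK_HOME:-/usr/local/spark}"

if [ -x "$SPARK_HOME/sbin/start-master.sh" ]; then
  # Starts the standalone master; by default it accepts workers on port 7077.
  "$SPARK_HOME/sbin/start-master.sh"
else
  echo "start-master.sh not found under $SPARK_HOME/sbin" >&2
fi
```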
and then start the workers on each of the Raspberry Pis:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://10.0.1.10:7077
I can now submit jobs (for instance, to calculate pi using Java) to the cluster from my master machine with:
./bin/spark-submit --master spark://10.0.1.10:7077 --class org.apache.spark.examples.JavaSparkPi lib/spark-examples-1.0.1-hadoop2.2.0.jar
Or using the Python Spark version:
./bin/spark-submit --master spark://10.0.1.10:7077 examples/src/main/python/pi.py 10
I can surf to:
to monitor my cluster.
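The address is not shown above. As a sketch, assuming Spark's defaults: the standalone master serves its web UI on port 8080, so the URL can be built from the master IP used in the commands above.

```shell
# Assumption: the master is at 10.0.1.10 and the web UI runs on Spark's default port 8080.
MASTER_IP=10.0.1.10
MASTER_UI_URL="http://${MASTER_IP}:8080"
echo "$MASTER_UI_URL"
```

On the MacBook, running open "$MASTER_UI_URL" should bring the page up in the default browser.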
We can now calculate pi to:
“Pi is roughly 3.131820”
and it only takes:
116.324641 seconds. Now that is progress!
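Why only "roughly"? Spark's bundled pi examples estimate pi by Monte Carlo sampling: they throw random points into a square and count how many land inside the inscribed circle, so the result wobbles from run to run. Here is a minimal single-machine sketch of the same idea, with no Spark involved. The function name estimate_pi, the sample count, and the seed are my own choices for illustration, not anything taken from the Spark examples.

```shell
# Monte Carlo estimate of pi on one machine (illustrative sketch, not Spark code).
# Usage: estimate_pi <number of samples> [seed]
estimate_pi() {
  awk -v n="$1" -v seed="${2:-1}" 'BEGIN {
    srand(seed)
    inside = 0
    for (i = 0; i < n; i++) {
      # Pick a random point in the square [-1, 1] x [-1, 1].
      x = 2 * rand() - 1
      y = 2 * rand() - 1
      # Count the points that fall inside the unit circle.
      if (x * x + y * y <= 1) inside++
    }
    # The circle covers pi/4 of the square, so scale the hit fraction by 4.
    printf "Pi is roughly %f\n", 4 * inside / n
  }'
}

estimate_pi 100000
```

The cluster version does the same counting, only with the samples split across the workers and the counts added up at the end.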