Spark Source Code 01: Submitting and Running Jobs

standalone mode

$ cd {SPARK_HOME}/libexec/sbin/

Start Master (web UI at 8080)

The Master is implemented by org.apache.spark.deploy.master.Master; its startup logic runs in onStart().

# spark command: java -Xms1g -Xmx1g org.apache.spark.deploy.master.Master
# --ip localhost --port 7077 --webui-port 8080
$ ./start-master.sh
Output Logs:
16/01/10 20:45:23 INFO Master: Registered signal handlers for [TERM, HUP, INT]
16/01/10 20:45:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/10 20:45:24 INFO SecurityManager: Changing view acls to: tony
16/01/10 20:45:24 INFO SecurityManager: Changing modify acls to: tony
16/01/10 20:45:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(tony); users with modify permissions: Set(tony)
16/01/10 20:45:24 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
16/01/10 20:45:24 INFO Master: Starting Spark master at spark://localhost:7077
16/01/10 20:45:24 INFO Master: Running Spark version 1.6.0
16/01/10 20:45:24 INFO Utils: Successfully started service 'MasterUI' on port 8080.
16/01/10 20:45:24 INFO MasterWebUI: Started MasterWebUI at http://192.168.0.112:8080
16/01/10 20:45:24 INFO Utils: Successfully started service on port 6066.
16/01/10 20:45:24 INFO StandaloneRestServer: Started REST server for submitting applications on port 6066
16/01/10 20:45:24 INFO Master: I have been elected leader! New state: ALIVE

Start Worker (web UI at 8081)

In org.apache.spark.deploy.worker.Worker, onStart() => registerWithMaster()

# spark command: java -Xms1g -Xmx1g org.apache.spark.deploy.worker.Worker
# --webui-port 8081 spark://localhost:7077
$ ./start-slave.sh spark://localhost:7077
Output Logs:
16/01/10 20:50:45 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
16/01/10 20:50:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/10 20:50:45 INFO SecurityManager: Changing view acls to: tony
16/01/10 20:50:45 INFO SecurityManager: Changing modify acls to: tony
16/01/10 20:50:45 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(tony); users with modify permissions: Set(tony)
16/01/10 20:50:46 INFO Utils: Successfully started service 'sparkWorker' on port 49576.
16/01/10 20:50:46 INFO Worker: Starting Spark worker 192.168.0.112:49576 with 4 cores, 7.0 GB RAM
16/01/10 20:50:46 INFO Worker: Running Spark version 1.6.0
16/01/10 20:50:46 INFO Worker: Spark home: /usr/local/Cellar/apache-spark/1.6.0/libexec
16/01/10 20:50:46 INFO Utils: Successfully started service 'WorkerUI' on port 8081.
16/01/10 20:50:46 INFO WorkerWebUI: Started WorkerWebUI at http://192.168.0.112:8081
16/01/10 20:50:46 INFO Worker: Connecting to master localhost:7077...
16/01/10 20:50:46 INFO Worker: Successfully registered with master spark://localhost:7077

Start spark-shell on the cluster (application UI at http://localhost:4040)

$ MASTER=spark://localhost:7077 spark-shell

scala> sc.textFile("README.md").filter(_.contains("Spark")).count
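To make explicit what this one-liner asks Spark to do, here is a minimal sketch that uses a plain Scala Iterator as a stand-in for an RDD. No Spark classes are involved, and the names LazyCountSketch and countMatching are hypothetical. The point it illustrates: filter only describes the work, and the terminal call (size here, count in Spark) is what actually pulls the data through, which in Spark is the moment a job is submitted.

```scala
object LazyCountSketch {
  // Counts the lines that contain `word`.
  // `filter` only wraps the iterator; `size` is the call that consumes it,
  // much as `count` is the action that makes Spark submit a job.
  def countMatching(lines: Iterator[String], word: String): Long =
    lines.filter(_.contains(word)).size.toLong

  def main(args: Array[String]): Unit = {
    // Made-up sample lines standing in for README.md.
    val sample = Seq("# Apache Spark", "Spark is a fast engine", "See the docs")
    println(countMatching(sample.iterator, "Spark"))
  }
}
```

With the three sample lines above, two contain "Spark", so the program prints 2.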


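sc.textFile("")

RDD objects: build the operator DAG

DAGScheduler: splits the DAG into stages; handles errors between stages

== TaskSet ==>

TaskScheduler: launches the tasks of each TaskSet; handles errors inside a stage

org.apache.spark.scheduler.TaskScheduler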
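
The division of labor in the flow above can be sketched with a toy model. This is not Spark's implementation: Step, splitIntoStages, runWithRetry and the maxFailures parameter are hypothetical names, and the model assumes a purely linear lineage in which each step either stays in the current stage or introduces a shuffle. The first function plays the DAGScheduler's part (cutting the lineage into stages at shuffle boundaries), the second the TaskScheduler's part (retrying a failed task inside a stage).

```scala
import scala.util.{Failure, Success, Try}

object SchedulerSketch {
  // One step in a linear lineage; shuffle = true marks a wide dependency
  // on the previous step.
  final case class Step(name: String, shuffle: Boolean)

  // DAGScheduler's role, simplified: start a new stage at every shuffle.
  def splitIntoStages(lineage: List[Step]): List[List[String]] =
    lineage.foldLeft(List.empty[List[String]]) {
      case (Nil, step)                    => List(List(step.name))
      case (stages, step) if step.shuffle => stages :+ List(step.name)
      case (stages, step)                 => stages.init :+ (stages.last :+ step.name)
    }

  // TaskScheduler's role, simplified: rerun one task until it succeeds or
  // has failed maxFailures times. The task receives its attempt number.
  def runWithRetry[A](maxFailures: Int)(task: Int => A): Either[String, A] = {
    @annotation.tailrec
    def loop(attempt: Int): Either[String, A] =
      Try(task(attempt)) match {
        case Success(result) => Right(result)
        case Failure(e) if attempt >= maxFailures =>
          Left(s"task failed $attempt times: ${e.getMessage}")
        case Failure(_) => loop(attempt + 1)
      }
    loop(1)
  }

  def main(args: Array[String]): Unit = {
    // The shell job above: textFile and filter need no shuffle, so one stage.
    val job = List(Step("textFile", shuffle = false), Step("filter", shuffle = false))
    println(splitIntoStages(job))

    // A task that fails twice, then succeeds on the third attempt.
    println(runWithRetry(4) { attempt =>
      if (attempt < 3) sys.error("executor lost") else "ok"
    })
  }
}
```

In real Spark the retry limit corresponds to the spark.task.maxFailures setting; once a task exhausts it, the failure is reported upward and the job is aborted, which is the boundary between "error inside stage" and "error between stages" drawn above.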
