**1. First, check that JDK 8 is installed.**

**2. Download and extract the installation package**

Direct download link: https://archive.apache.org/dist/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.6.tgz

a. For other historical versions, visit: https://archive.apache.org/dist/spark/.

b. Spark does not strictly require Hadoop, but if you already have a Hadoop cluster or HDFS, it is recommended to download the Spark build that matches your Hadoop version.

After downloading, extract the package to the /opt/install directory:

```shell
tar -zvxf spark-2.4.4-bin-hadoop2.6.tgz -C /opt/install
# Create a symlink (run inside /opt/install so it points at the right target)
cd /opt/install
ln -s spark-2.4.4-bin-hadoop2.6 spark
```

**3. Configure environment variables**

```shell
vim /etc/profile

# Append the following lines:
export SPARK_HOME=/opt/install/spark
export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
```

Make the environment variables take effect:

```shell
source /etc/profile
```

**4. Configuration files**

```shell
cd /opt/install/spark/conf
cp spark-env.sh.template spark-env.sh
vim spark-env.sh

# Append the following lines:
export JAVA_HOME=/opt/install/jdk
export HADOOP_HOME=/opt/install/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
# Host of the master
export SPARK_MASTER_HOST=hadoop101
# Port of the master
export SPARK_MASTER_PORT=7077
```

Edit the configuration file {spark_home}/conf/slaves (optional on a single node, or configure it as below):

```shell
# First rename slaves.template to slaves
mv slaves.template slaves
vim slaves

hadoop101
```

**5. Start the Spark cluster. In the spark directory, run:**

```shell
sbin/start-all.sh
```

Check that the Master and Worker processes are running:

```shell
[root@hadoop101 spark]# jps
3694 Worker
3599 Master
```

**6. Launch spark-shell to test the interactive Scala environment**

```shell
[root@hadoop101 spark]# spark-shell --master spark://hadoop101:7077
# or
[root@hadoop101 sbin]# spark-shell --master local[4]
21/01/04 10:05:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://hadoop101:4040
Spark context available as 'sc' (master = local[4], app id = local-1609725951951).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/

Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
```

The Master web UI can be accessed in a browser at http://hadoop101:8080/

![](https://img.kancloud.cn/e8/6b/e86b2faca084f98c210325e1df6c5416_1048x448.png)

**7. Test Spark on YARN**

```shell
# The YARN processes must be started in advance. Start YARN:
[root@hadoop101 sbin]# ./start-yarn.sh
# Spark on YARN
[root@hadoop101 sbin]# spark-shell --master yarn
21/01/04 14:23:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/01/04 14:23:58 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
21/01/04 14:23:59 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
Spark context Web UI available at http://hadoop101:4041
Spark context available as 'sc' (master = yarn, app id = application_1609741310740_0001).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/

Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
```
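Whichever master you attach to, a one-line job at the `scala>` prompt is a quick way to confirm that the executors are actually doing work. Below is a minimal sanity check, assuming only the `sc` SparkContext that spark-shell creates automatically (1 + 2 + ... + 100 = 5050):

```shell
scala> sc.parallelize(1 to 100).reduce(_ + _)   // sum 1..100 on the executors
res0: Int = 5050
```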
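For non-interactive jobs, spark-submit is the standard entry point. The following sketch submits the SparkPi example that ships with Spark; the jar path assumes the default layout of the spark-2.4.4-bin-hadoop2.6 package, so adjust it if your build differs:

```shell
# Submit the bundled SparkPi example to YARN in client mode;
# the trailing 100 is the number of partitions used for the estimate.
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode client \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.4.jar 100
```

If the submission succeeds, the driver output should contain a `Pi is roughly ...` line, and the application should also appear as finished in the YARN ResourceManager UI.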