# Hadoop Installation and Cluster Deployment #
## 1. Basic Environment Setup
* Linux server: CentOS 7
* Configuration: Aliyun YUM mirror
* Install basic tools: gcc, net-tools
```
sudo yum install -y gcc
# A minimal CentOS 7 install does not include netstat or ifconfig; net-tools provides both
sudo yum install -y net-tools
```
* Network configuration
* Preparation
* JDK package:
```
$HOME/downloads/jdk-8u65-linux-x64.tar.gz
```
* Hadoop package:
```
$HOME/downloads/hadoop-2.7.3.tar.gz
```
* User:
```
# Run as root
useradd centos
passwd centos    # prompts for the new password; these notes use 123456
```
* Software directory:
```
sudo mkdir /soft
sudo chown centos:centos /soft
tar -xzvf $HOME/downloads/jdk-8u65-linux-x64.tar.gz -C /soft/
```
* Install the JDK
* Check whether a JDK is already installed:
```
rpm -qa | grep -i java
```
* Verify:
```
cd /soft/jdk1.8.0_65/bin
./java -version
```
* Create a symbolic link:
```
ln -s /soft/jdk1.8.0_65 /soft/jdk
```
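The extract-then-symlink pattern used for the JDK here is repeated for Hadoop further down. As a rough sketch of that pattern, the helper below unpacks a `.tar.gz` into a target directory and points a stable link name at whatever top-level directory the archive contained. The function name `install_tarball` and its argument order are assumptions made for illustration, and the sketch assumes the archive has a single top-level directory.

```shell
#!/bin/sh
# install_tarball ARCHIVE DEST LINK  (hypothetical helper, not part of the original notes)
# Extracts ARCHIVE into DEST and creates DEST/LINK -> DEST/<top-level directory of ARCHIVE>.
install_tarball() {
    archive=$1
    dest=$2
    link=$3
    # The first entry of the listing names the archive's top-level directory.
    top=$(tar -tzf "$archive" | head -n 1 | cut -d/ -f1)
    # -C makes tar change into DEST before extracting.
    tar -xzf "$archive" -C "$dest" || return 1
    # -n replaces an existing link instead of descending into its target.
    ln -sfn "$dest/$top" "$dest/$link"
}
```

With the paths from these notes, a call might look like `install_tarball $HOME/downloads/jdk-8u65-linux-x64.tar.gz /soft jdk`. Because the link name stays fixed, upgrading later only requires repointing the link.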
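* Environment variables:
```
# Append the following two lines to /etc/profile:
export JAVA_HOME=/soft/jdk
export PATH=$PATH:$JAVA_HOME/bin
# Then reload the profile in the current shell:
source /etc/profile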
```
* From any directory, confirm that java is now on the PATH:
```
java -version
```
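Editing `/etc/profile` by hand works, but repeating the setup tends to leave duplicate `export` lines behind. The sketch below shows one way to append a line only when it is missing. The helper name `append_once` and the `PROFILE` variable are assumptions for illustration; `PROFILE` defaults to a scratch file so the sketch can be tried without root.

```shell
#!/bin/sh
# append_once FILE LINE  (hypothetical helper)
# Adds LINE to FILE unless an identical line is already present.
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x matches whole lines only, -F treats LINE as a fixed string rather than a regex.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Assumption: PROFILE is an override for experimenting; the notes edit /etc/profile.
PROFILE=${PROFILE:-$(mktemp)}

# Single quotes keep $PATH and $JAVA_HOME unexpanded, so the profile expands them at login.
append_once "$PROFILE" 'export JAVA_HOME=/soft/jdk'
append_once "$PROFILE" 'export PATH=$PATH:$JAVA_HOME/bin'
```

Running it as root with `PROFILE=/etc/profile` would apply the same two lines as the manual edit, and a second run leaves the file unchanged.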
* Install Hadoop
* Extract:
```
cd $HOME/downloads
tar -zxvf hadoop-2.7.3.tar.gz
mv $HOME/downloads/hadoop-2.7.3 /soft/
```
* Create a symbolic link
```
ln -s /soft/hadoop-2.7.3 /soft/hadoop
```
* Verify
```
cd /soft/hadoop/bin
./hadoop version
```
* Environment variables (append the exports to /etc/profile, then reload it)
```
export HADOOP_HOME=/soft/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
```
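After both sets of exports are in place, a quick check can confirm that the variables and the command-line tools resolve as expected. The following is a minimal sketch; the helper names `check_home` and `check_tool` are hypothetical, and the script only reports what it finds without changing anything.

```shell
#!/bin/sh
# check_tool NAME: report whether NAME resolves on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 -> $(command -v "$1")"
    else
        echo "missing: $1 is not on PATH" >&2
        return 1
    fi
}

# check_home VAR: report whether the environment variable VAR names an existing directory.
check_home() {
    # Indirect lookup: for VAR=JAVA_HOME this evaluates dir=${JAVA_HOME:-}
    eval "dir=\${$1:-}"
    if [ -n "$dir" ] && [ -d "$dir" ]; then
        echo "ok: $1=$dir"
    else
        echo "missing: $1 is unset or not a directory" >&2
        return 1
    fi
}

# Report on everything the notes set up so far; keep going after a failure.
for v in JAVA_HOME HADOOP_HOME; do check_home "$v" || true; done
for t in java hadoop hdfs; do check_tool "$t" || true; done
```

A "missing" line for `hadoop` right after editing `/etc/profile` usually just means the profile has not been re-sourced in the current shell.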
* Install SSH
```
sudo yum install -y openssh-server openssh-clients
```
## 2. Hadoop Deployment
Architecture overview
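(1). Create three configuration directories, each a copy of the stock hadoop configuration directory
```
${HADOOP_HOME}/etc/local
${HADOOP_HOME}/etc/pseudo
${HADOOP_HOME}/etc/full
```
(2). Replace the original hadoop directory with a symbolic link to the pseudo-distributed configuration
```
# Remove or rename ${HADOOP_HOME}/etc/hadoop first, once it has been copied
ln -s ${HADOOP_HOME}/etc/pseudo ${HADOOP_HOME}/etc/hadoop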
```
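Steps (1) and (2) can be sketched as two small shell functions: one that performs the one-time copy and replaces the stock directory with a link, and one that repoints the link later. The function names `init_modes` and `switch_mode` and the `CONF_ROOT` variable are assumptions for illustration. The sketch spells the directories `local`, `pseudo` and `full`, and it uses a relative link target rather than an absolute path.

```shell
#!/bin/sh
# Assumption: configuration lives under $HADOOP_HOME/etc, falling back to /soft/hadoop/etc.
CONF_ROOT=${CONF_ROOT:-${HADOOP_HOME:-/soft/hadoop}/etc}

# init_modes: copy the stock "hadoop" directory to local/, pseudo/ and full/,
# then replace it with a symlink to pseudo/. Runs in a subshell so the caller's
# working directory is untouched.
init_modes() (
    cd "$CONF_ROOT" || exit 1
    [ -L hadoop ] && exit 0          # already converted, nothing to do
    for mode in local pseudo full; do
        cp -r hadoop "$mode" || exit 1
    done
    rm -rf hadoop
    ln -s pseudo hadoop
)

# switch_mode MODE: repoint the hadoop symlink at local, pseudo or full.
switch_mode() {
    [ -d "$CONF_ROOT/$1" ] || { echo "no such mode: $1" >&2; return 1; }
    # -n replaces the existing link instead of creating a new link inside its target.
    ln -sfn "$1" "$CONF_ROOT/hadoop"
}
```

After `init_modes` has run once, `switch_mode full` would activate the fully distributed configuration without touching the other two copies.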
(3). Format HDFS
```
hdfs namenode -format
```
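Formatting reads the active configuration, so the pseudo-distributed directory should describe a single-node HDFS before the format command is run. These notes do not show the contents of that directory. The sketch below writes a minimal pair of files, assuming the values suggested by the Apache single-node setup guide (`fs.defaultFS` set to `hdfs://localhost:9000` and `dfs.replication` set to `1`). The function name `write_pseudo_conf` is hypothetical, and it overwrites the two files in the directory it is given.

```shell
#!/bin/sh
# write_pseudo_conf DIR  (hypothetical helper)
# Writes a minimal core-site.xml and hdfs-site.xml for a single-node setup into DIR.
write_pseudo_conf() {
    conf_dir=$1
    # Assumed value: the NameNode listens on localhost:9000.
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
    # Assumed value: one replica per block, since there is only one DataNode.
    cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}
```

It would be called with the pseudo-distributed configuration directory as its only argument, before formatting.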
(4). Edit the Hadoop environment file and set JAVA_HOME explicitly
```
[${HADOOP_HOME}/etc/hadoop/hadoop-env.sh]
...
export JAVA_HOME=/soft/jdk
...
```
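The same edit can be scripted. The sketch below rewrites an existing `export JAVA_HOME=` line in place, or appends one if the file has none. The function name `set_java_home` is an assumption for illustration; `sed -i.bak` keeps a backup copy next to the edited file.

```shell
#!/bin/sh
# set_java_home ENV_FILE JDK_DIR  (hypothetical helper)
# Makes ENV_FILE export JAVA_HOME=JDK_DIR, replacing any existing export line.
set_java_home() {
    env_file=$1
    jdk=$2
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        # "|" is used as the sed delimiter because JDK_DIR contains slashes.
        sed -i.bak "s|^export JAVA_HOME=.*|export JAVA_HOME=$jdk|" "$env_file"
    else
        printf 'export JAVA_HOME=%s\n' "$jdk" >> "$env_file"
    fi
}
```

With the layout from these notes, a call might look like `set_java_home /soft/hadoop/etc/hadoop/hadoop-env.sh /soft/jdk`.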
(5). Start all Hadoop daemons
```
start-all.sh
```
(6). Once startup completes, list the running Java processes to confirm the daemons are up
```
jps
```
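The notes do not list which processes to expect. For a Hadoop 2.x pseudo-distributed setup started with `start-all.sh`, the usual set is NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The sketch below reads `jps` output on standard input and prints any of those five that are absent. The function name `missing_daemons` is hypothetical, and the daemon list is an assumption based on a default 2.x installation.

```shell
#!/bin/sh
# missing_daemons  (hypothetical helper)
# Reads `jps` output ("<pid> <name>" per line) on stdin and prints each expected
# daemon that does not appear in it.
missing_daemons() {
    listing=$(cat)
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare against the name column only; -x requires an exact match,
        # so NameNode is not satisfied by SecondaryNameNode.
        printf '%s\n' "$listing" | awk '{print $2}' | grep -qx "$d" || echo "$d"
    done
}
```

Typical usage would be `jps | missing_daemons`; empty output means all five daemons were found.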
(7). List the root of the HDFS file system
```
hdfs dfs -ls /
```
(8). Browse the HDFS file system through the web UI
```
netstat -ano | grep 50070 # check that the port is listening
http://localhost:50070/
```
(9). Stop all Hadoop daemons
```
stop-all.sh
```
(10). CentOS firewall operations
```
# CentOS 7 (systemd)
sudo systemctl enable firewalld.service    # enable start on boot
sudo systemctl disable firewalld.service   # disable start on boot
sudo systemctl start firewalld.service     # start the firewall
sudo systemctl stop firewalld.service      # stop the firewall
sudo systemctl status firewalld.service    # show the firewall status
```
```
# Start on boot, legacy chkconfig syntax
sudo chkconfig firewalld on     # enable start on boot
sudo chkconfig firewalld off    # disable start on boot
```
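## Fully Distributed Configuration
* Switch the Hadoop configuration mode to full
* Edit the hostname file
* Clone 4 hosts
* Change the network IP address on each clone
* Set the host name on each clone
* Configure passwordless SSH login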
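
The passwordless SSH step can be sketched as follows: create a key pair with an empty passphrase if none exists, then push the public key to every node with `ssh-copy-id`. The helper names `run` and `setup_passwordless_ssh`, the `DRY_RUN` and `SSH_KEY` variables, and the host names in the usage example are all assumptions for illustration, since the notes do not name the cloned hosts. The `centos` account is the one created earlier.

```shell
#!/bin/sh
# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# setup_passwordless_ssh HOST...  (hypothetical helper)
# Creates an RSA key pair once, then installs the public key on each HOST.
setup_passwordless_ssh() {
    key=${SSH_KEY:-$HOME/.ssh/id_rsa}
    # -N '' sets an empty passphrase so later logins need no prompt.
    [ -f "$key" ] || run ssh-keygen -t rsa -N '' -f "$key"
    for host in "$@"; do
        # Asks for the centos password once per host, then key login takes over.
        run ssh-copy-id -i "$key.pub" "centos@$host"
    done
}
```

With placeholder host names, `DRY_RUN=1 setup_passwordless_ssh s101 s102 s103 s104` prints the commands without running them, which is a cheap way to check the host list first. Each host name must already resolve, for example through `/etc/hosts`.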