This post walks through setting up a small Hadoop cluster and explains a few of the relevant performance parameters.
Hadoop setup
Choosing the machines
Add the following entries to /etc/hosts on every machine:
- 10.16.10.1 hadoop1 # this node is the master
- 10.16.10.2 hadoop2
- 10.16.10.3 hadoop3
SSH environment
Every machine in the cluster needs passwordless SSH access to every other machine (a sketch of the key setup follows). The SSH port can also be customized; if it is, export HADOOP_SSH_OPTS="-p $ssh_port" in hadoop-env.sh.
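A minimal sketch of the key distribution; the RSA key type, the loop over the three hosts, and the non-default port 2222 are illustrative choices, not values from the original setup:

```bash
# Run on every node: generate a key pair and push the public key to all three hosts.
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
for host in hadoop1 hadoop2 hadoop3; do
  ssh-copy-id "$host"
done

# Only needed if sshd listens on a non-default port (2222 is a placeholder);
# add this line to etc/hadoop/hadoop-env.sh:
export HADOOP_SSH_OPTS="-p 2222"
```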
Configure the following files under etc/hadoop in the Hadoop installation directory
hadoop-env.sh
```bash
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64/jre
```
yarn-env.sh & mapred-env.sh
```bash
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64/jre
```
core-site.xml
```xml
<configuration>
```
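A minimal core-site.xml for this layout might look like the sketch below; fs.defaultFS points at the hadoop1 master, while the port (9000) and the hadoop.tmp.dir path are assumptions, not values taken from the original cluster:

```xml
<configuration>
  <!-- Default filesystem: the NameNode runs on hadoop1 (the port is an assumption). -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop1:9000</value>
  </property>
  <!-- Base directory for Hadoop's local state (the path is an assumption). -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
  </property>
</configuration>
```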
hdfs-site.xml
```xml
<configuration>
```
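Likewise, a minimal hdfs-site.xml sketch; with three DataNodes a replication factor of 3 is plausible, and the local storage paths are assumptions:

```xml
<configuration>
  <!-- Three DataNodes in the cluster, so replicate each block three times. -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Where the NameNode keeps its metadata (the path is an assumption). -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/hadoop/dfs/name</value>
  </property>
  <!-- Where each DataNode stores block data (the path is an assumption). -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hadoop/dfs/data</value>
  </property>
</configuration>
```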
mapred-site.xml
```xml
<configuration>
```
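For mapred-site.xml, the one property that is almost always set when running MapReduce on YARN is the framework name; a minimal sketch:

```xml
<configuration>
  <!-- Submit MapReduce jobs to YARN instead of running them with the local runner. -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```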
yarn-site.xml
```xml
<configuration>
```
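A minimal yarn-site.xml sketch; the ResourceManager host matches the role assignment below, and the shuffle auxiliary service is required for MapReduce jobs:

```xml
<configuration>
  <!-- The ResourceManager runs on the master node, hadoop1. -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop1</value>
  </property>
  <!-- Auxiliary service that serves map output during the shuffle phase. -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```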
slaves
```
hadoop1
```
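Since hadoop2 and hadoop3 also run DataNode and NodeManager (see the node role assignment below), the slaves file presumably lists all three hosts:

```
hadoop1
hadoop2
hadoop3
```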
Notes
Environment variable call paths (a usage sketch follows this list):
- HADOOP_HEAPSIZE: hadoop -> hadoop-config.sh -> hadoop-env.sh
- hdfs -> hdfs-config.sh -> hadoop-config.sh -> hadoop-env.sh
- YARN_HEAPSIZE: yarn -> yarn-config.sh -> hadoop-config.sh -> hadoop-env.sh
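Both variables set the daemon JVM heap size in MB and are picked up through the chains above. A sketch of where they would typically be set; the 2048 values are placeholders, not tuning recommendations, and YARN_HEAPSIZE is also commonly placed in yarn-env.sh:

```bash
# etc/hadoop/hadoop-env.sh -- sourced at the end of both chains above.
export HADOOP_HEAPSIZE=2048   # heap in MB for the hadoop/hdfs daemons (placeholder value)
export YARN_HEAPSIZE=2048     # heap in MB for the yarn daemons (placeholder value)
```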
Node role assignment
- hadoop1 (master + slave): ResourceManager (YARN), NodeManager (YARN), NameNode (HDFS master), SecondaryNameNode (HDFS master), DataNode (HDFS)
- hadoop2 (slave): NodeManager (YARN), DataNode (HDFS)
- hadoop3 (slave): NodeManager (YARN), DataNode (HDFS)
Starting and stopping the cluster
With passwordless SSH in place, the whole cluster can be started and stopped from the master with just a few commands.
```bash
# The first time the NameNode service is used, it must be formatted on the NameNode master:
hdfs namenode -format
```
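With the NameNode formatted, the standard scripts in $HADOOP_HOME/sbin start and stop every daemon in the cluster over SSH; a sketch of the usual sequence, run on hadoop1:

```bash
# Start HDFS first (NameNode, SecondaryNameNode, DataNodes), then YARN
# (ResourceManager, NodeManagers).
sbin/start-dfs.sh
sbin/start-yarn.sh

# Check the daemons on each node with jps; hadoop1 should show all five roles,
# hadoop2/hadoop3 only DataNode and NodeManager.
jps

# Shut down in the reverse order.
sbin/stop-yarn.sh
sbin/stop-dfs.sh
```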