Versions:
- CentOS 7.2
- CDH 5.10
- Kudu 1.2
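2.1 Overview
- This document describes how to deploy CDH Enterprise on CentOS 7.2. Installing a Cloudera Enterprise Data Hub consists of five main steps:
- Configure the cluster servers: install the operating system, disable the firewall, synchronize the server clocks, and so on;
- Install the external database;
- Install Cloudera Manager;
- Install the CDH cluster;
- Check cluster integrity: verify that HDFS, MapReduce, Hive, and the other services run correctly.
- This document focuses on installing Cloudera Manager and CDH, and assumes the following:
- Operating system version: CentOS 7.2
- MariaDB version: 10.2.1
- CM version: 5.10.0
- CDH version: 5.10.0
- The cluster is deployed as the ec2-user account
- The CDH and CM installation packages have already been downloaded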
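2.2 Preparation
2.2.1 hostname and hosts configuration
- All nodes in the cluster must be able to reach one another and must use static IP addresses. IP-to-hostname mappings are configured in /etc/hosts, and each node's hostname is configured in /etc/hostname.
- Taking the cm node (172.31.2.159) as an example:
- hostname configuration
- The /etc/hostname file looks like this: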
ip-172-31-2-159
Alternatively, change it with the following command, which takes effect immediately:
[ec2-user@ip-172-31-2-159 ~]$ sudo hostnamectl set-hostname ip-172-31-2-159
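- Note: changing the hostname works differently here than on RedHat 6, where it was set through HOSTNAME in /etc/sysconfig/network
- hosts configuration
- The /etc/hosts file looks like this: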
172.31.2.159 ip-172-31-2-159
172.31.12.108 ip-172-31-12-108
172.31.5.236 ip-172-31-5-236
172.31.7.96 ip-172-31-7-96
- Apply the two steps above on every other node in the cluster, adjusting the values for each node
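The /etc/hosts lines above follow a regular pattern (dots in the IP address replaced by dashes, prefixed with ip-), so they can be generated rather than typed by hand. The following is a minimal sketch; the function names ip_to_hostname and hosts_entries are hypothetical helpers introduced here, not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: derive the EC2-style hostname used in this guide
# from an IP address, e.g. 172.31.2.159 -> ip-172-31-2-159.
ip_to_hostname() {
    printf 'ip-%s\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# Print one /etc/hosts line ("IP hostname") per IP address argument.
hosts_entries() {
    for ip in "$@"; do
        printf '%s %s\n' "$ip" "$(ip_to_hostname "$ip")"
    done
}

# Prints the four entries for this cluster; review them before
# appending to /etc/hosts on each node.
hosts_entries 172.31.2.159 172.31.12.108 172.31.5.236 172.31.7.96
```

The output can be reviewed and then appended on each node, for example with sudo tee -a /etc/hosts.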
2.2.2 Disable SELinux
- Run sudo setenforce 0 on all nodes. Here it is executed through a batch shell script:
[ec2-user@ip-172-31-2-159 ~]$ sh ssh_do_all.sh node.list "sudo setenforce 0"
- On all nodes, edit /etc/selinux/config as follows so that the change persists across reboots:
SELINUX=disabled
SELINUXTYPE=targeted
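Editing /etc/selinux/config by hand on every node is error-prone, so the change can be scripted. The sketch below is one possible approach, not part of the original procedure; disable_selinux_in is a hypothetical function name, and it only rewrites the SELINUX= line, leaving SELINUXTYPE untouched.

```shell
#!/bin/sh
# Hypothetical helper: set SELINUX=disabled in the given config file.
# Defaults to /etc/selinux/config when no argument is passed.
disable_selinux_in() {
    file="${1:-/etc/selinux/config}"
    tmp="$(mktemp)" || return 1
    # "^SELINUX=" does not match "SELINUXTYPE=", so only the mode line changes.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$tmp" &&
        cat "$tmp" > "$file"   # write back in place, keeping the file's permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (as root on each node):
#   disable_selinux_in /etc/selinux/config
```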
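2.2.3 Disable the firewall
- On all nodes, stop firewalld with sudo systemctl stop, disable it so that it does not start at boot, and check its status. The commands are run in batch through the shell script: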
[ec2-user@ip-172-31-2-159 ~]$ sh ssh_do_all.sh node.list "sudo systemctl stop firewalld"
[ec2-user@ip-172-31-2-159 ~]$ sh ssh_do_all.sh node.list "sudo systemctl disable firewalld"
[ec2-user@ip-172-31-2-159 ~]$ sh ssh_do_all.sh node.list "sudo systemctl status firewalld"
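The ssh_do_all.sh script itself is not shown in this guide. A batch runner of that kind might look like the sketch below, assuming node.list holds one hostname per line and that passwordless SSH is set up for ec2-user. The function name run_on_all and the SSH_CMD override variable are assumptions introduced for illustration.

```shell
#!/bin/sh
# Sketch of a batch runner in the spirit of ssh_do_all.sh
# (the real script is not shown; this is an assumed reconstruction).
# Arguments: path to the node list, command to run on each node.
run_on_all() {
    node_list="$1"
    remote_cmd="$2"
    while IFS= read -r node || [ -n "$node" ]; do
        # Skip blank lines and "#" comments in the node list.
        case "$node" in ''|'#'*) continue ;; esac
        echo "== $node =="
        # "ssh -n" keeps ssh from consuming the rest of the node list on stdin.
        # SSH_CMD is a hypothetical hook for substituting another command.
        ${SSH_CMD:-ssh -n} "$node" "$remote_cmd"
    done < "$node_list"
}

# Usage:
#   run_on_all node.list "sudo systemctl status firewalld"
```

Setting SSH_CMD=echo gives a dry run that prints each node and command without connecting.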
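2.2.4 Cluster clock synchronization
chrony is installed by default on CentOS 7.2. Configure it so that the cm node (172.31.2.159) acts as the local chrony server and the other three servers synchronize with it. Configuration fragments:
- 172.31.2.159 is configured to synchronize with itself
[ec2-user@ip-172-31-2-159 ~]$ sudo vim /etc/chrony.conf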
server ip-172-31-2-159 iburst
#keyfile=/etc/chrony.keys
- Other cluster nodes: add the following configuration below the comment lines
[ec2-user@ip-172-31-12-108 ~]$ sudo vim /etc/chrony.conf
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server ip-172-31-2-159 iburst
#keyfile=/etc/chrony.keys
- Restart the chrony service on all machines
[ec2-user@ip-172-31-2-159 ~]$ sh ssh_do_all.sh node.list "sudo systemctl restart chronyd"
- Verify clock synchronization by running chronyc sources on all nodes. The script below runs it in batch:
[ec2-user@ip-172-31-2-159 ~]$ sh ssh_do_all.sh node.list "chronyc sources"
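In chronyc sources output, the source the node is currently synchronized to is marked with an asterisk in the second column (for example a line starting with ^*). A small check along these lines can flag nodes that have not selected a source yet; has_selected_source is a hypothetical helper name, and the sample lines in the usage are illustrative, not captured from this cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the `chronyc sources` output on stdin
# contains a currently selected source, i.e. a line whose second
# character is "*" (such as "^* ip-172-31-2-159 ...").
has_selected_source() {
    grep -q '^.\*'
}

# Usage on a node:
#   chronyc sources | has_selected_source && echo "synchronized" || echo "NOT synchronized"
```

A node that has just restarted chronyd may need a short while before a source is selected, so a failed check right after the restart is not necessarily an error.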
2.2.5 Configure the operating system repo
- Mount the operating system ISO file
[ec2-user@ip-172-31-2-159 ~]$ sudo mkdir /media/DVD1
[ec2-user@ip-172-31-2-159 ~]$ sudo mount -o loop CentOS-7-x86_64-DVD-1611.iso /media/DVD1/
- Configure the operating system repo
[ec2-user@ip-172-31-2-159 ~]$ sudo vim /etc/yum.repos.d/local_os.repo
[local_iso]
name=CentOS-$releasever - Media
baseurl=file:///media/DVD1
gpgcheck=0
enabled=1
[ec2-user@ip-172-31-2-159 ~]$ sudo yum repolist
2.2.6 Install the HTTP service
- Install the httpd service
[ec2-user@ip-172-31-2-159 ~]$ sudo yum -y install httpd
- Start or stop the httpd service (it must be running for the HTTP repos in the following steps to be reachable)
[ec2-user@ip-172-31-2-159 ~]$ sudo systemctl start httpd
[ec2-user@ip-172-31-2-159 ~]$ sudo systemctl stop httpd
- After httpd is installed, rebuild the operating system repo so that it is served over HTTP and the other servers can use it as well
[ec2-user@ip-172-31-2-159 ~]$ sudo mkdir /var/www/html/iso
[ec2-user@ip-172-31-2-159 ~]$ sudo scp -r /media/DVD1/* /var/www/html/iso/
[ec2-user@ip-172-31-2-159 ~]$ sudo vim /etc/yum.repos.d/os.repo
[osrepo]
name=os_repo
baseurl=http://172.31.2.159/iso/
enabled=true
gpgcheck=false
[ec2-user@ip-172-31-2-159 ~]$ sudo yum repolist
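Since the other nodes need the same .repo file, generating it from a small function avoids retyping it in vim on each host. This is a sketch only; repo_stanza is a hypothetical helper, and it writes enabled=1 and gpgcheck=0, the numeric forms of the true/false values used above.

```shell
#!/bin/sh
# Hypothetical helper: print a yum .repo stanza.
# Arguments: repo id, display name, base URL.
repo_stanza() {
    printf '[%s]\nname=%s\nbaseurl=%s\nenabled=1\ngpgcheck=0\n' "$1" "$2" "$3"
}

# Usage on each node:
#   repo_stanza osrepo os_repo http://172.31.2.159/iso/ | sudo tee /etc/yum.repos.d/os.repo
#   sudo yum repolist
```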
2.2.7 Install MariaDB
CentOS 7 ships MariaDB 5.5.52 by default, whereas this deployment uses version 10.2.1 (http://yum.mariadb.org/10.2.1/centos7-amd64/rpms/). Download the following rpm packages from the official site:
MariaDB-10.2.1-centos7-x86_64-client.rpm
MariaDB-10.2.1-centos7-x86_64-common.rpm
MariaDB-10.2.1-centos7-x86_64-compat.rpm
MariaDB-10.2.1-centos7-x86_64-server.rpm
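Download the packages into a single local directory and run createrepo in it to generate the rpm metadata.
Apache httpd is used here: move the mariadb10.2.1 directory into /var/www/html so that the rpm packages can be accessed over HTTP.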
[ec2-user@ip-172-31-2-159 ~]$ sudo mv mariadb10.2.1 /var/www/html/
- Install the MariaDB dependencies
[ec2-user@ip-172-31-2-159 ~]$ sudo yum -y install libaio perl perl-DBI perl-Module-Pluggable perl-Pod-Escapes perl-Pod-Simple perl-libs perl-version
- Create the local repo
[ec2-user@ip-172-31-2-159 ~]$ sudo vim /etc/yum.repos.d/mariadb.repo
[mariadb]
name = MariaDB
baseurl = http://172.31.2.159/mariadb10.2.1
enabled = true
gpgcheck = false
[ec2-user@ip-172-31-2-159 ~]$ sudo yum repolist
- Install MariaDB
[ec2-user@ip-172-31-2-159 ~]$ sudo yum -y install MariaDB-server MariaDB-client
- Start and configure MariaDB
[ec2-user@ip-172-31-2-159 ~]$ sudo systemctl start mariadb
[ec2-user@ip-172-31-2-159 ~]$ sudo /usr/bin/mysql_secure_installation
NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MariaDB
SERVERS IN PRODUCTION USE! PLEASE READ EACH STEP CAREFULLY!
In order to log into MariaDB to secure it, we'll need the current
password for the root user. If you've just installed MariaDB, and
you haven't set the root password yet, the password will be blank,
so you should just press enter here.
Enter current password for root (enter for none):
OK, successfully used password, moving on...
Setting the root password ensures that nobody can log into the MariaDB
root user without the proper authorisation.
Set root password? [Y/n] Y
New password:
Re-enter new password:
Password updated successfully!
Reloading privilege tables..
... Success!
By default, a MariaDB installation has an anonymous user, allowing anyone
to log into MariaDB without having to have a user account created for
them. This is intended only for testing, and to make the installation
go a bit smoother. You should remove them before moving into a
production environment.
Remove anonymous users? [Y/n] Y
... Success!
Normally, root should only be allowed to connect from 'localhost'. This
ensures that someone cannot guess at the root password from the network.
Disallow root login remotely? [Y/n] n
... skipping.
By default, MariaDB comes with a database named 'test' that anyone can
access. This is also intended only for testing, and should be removed
before moving into a production environment.
Remove test database and access to it? [Y/n] Y
- Dropping test database...
... Success!
- Removing privileges on test database...
... Success!
Reloading the privilege tables will ensure that all changes made so far
will take effect immediately.
Reload privilege tables now? [Y/n] Y
... Success!
Cleaning up...
All done! If you've completed all of the above steps, your MariaDB
installation should now be secure.
Thanks for using MariaDB!
- Create the databases and users required by CM and Hive
[ec2-user@ip-172-31-2-159 ~]$ mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.2.1-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>
create database metastore default character set utf8;
CREATE USER 'hive'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
FLUSH PRIVILEGES;
create database cm default character set utf8;
CREATE USER 'cm'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON cm.* TO 'cm'@'%';
FLUSH PRIVILEGES;
create database am default character set utf8;
CREATE USER 'am'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON am.* TO 'am'@'%';
FLUSH PRIVILEGES;
create database rm default character set utf8;
CREATE USER 'rm'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON rm.* TO 'rm'@'%';
FLUSH PRIVILEGES;
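The four database/user blocks above differ only in their names, so the SQL can be generated in a loop. The sketch below assumes the same placeholder password used above; db_setup_sql is a hypothetical helper. It omits FLUSH PRIVILEGES, which is not required after CREATE USER and GRANT statements.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL that creates one database and a user
# with full privileges on it. Arguments: database, user, password.
db_setup_sql() {
    printf "CREATE DATABASE %s DEFAULT CHARACTER SET utf8;\n" "$1"
    printf "CREATE USER '%s'@'%%' IDENTIFIED BY '%s';\n" "$2" "$3"
    printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%%';\n" "$1" "$2"
}

# Usage (replace 'password' with a real secret before running):
#   {
#       db_setup_sql metastore hive password
#       for name in cm am rm; do db_setup_sql "$name" "$name" password; done
#   } | mysql -uroot -p
```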
- Install the JDBC driver
[ec2-user@ip-172-31-2-159 ~]$ sudo mkdir -p /usr/share/java/
[ec2-user@ip-172-31-2-159 ~]$ sudo mv mysql-connector-java-5.1.37.jar /usr/share/java/
[ec2-user@ip-172-31-2-159 ~]$ cd /usr/share/java
[ec2-user@ip-172-31-2-159 java]$ sudo ln -s mysql-connector-java-5.1.37.jar mysql-connector-java.jar
[ec2-user@ip-172-31-2-159 java]$ ll
total 964
-rw-r--r--. 1 root root 985600 Oct 6 2015 mysql-connector-java-5.1.37.jar
lrwxrwxrwx. 1 root root 31 Mar 29 14:37 mysql-connector-java.jar -> mysql-connector-java-5.1.37.jar
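2.3 Cloudera Manager installation
2.3.1 Configure the local repo source
Download the 7 rpm packages required to install Cloudera Manager into a single local directory and run createrepo in it to generate the rpm metadata.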
[ec2-user@ip-172-31-2-159 cm]$ ls
cloudera-manager-agent-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
cloudera-manager-daemons-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
cloudera-manager-server-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
cloudera-manager-server-db-2-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
enterprise-debuginfo-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
jdk-6u31-linux-amd64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
[ec2-user@ip-172-31-2-159 cm]$ sudo createrepo .
Spawning worker 0 with 1 pkgs
Spawning worker 1 with 1 pkgs
Spawning worker 2 with 1 pkgs
Spawning worker 3 with 1 pkgs
Spawning worker 4 with 1 pkgs
Spawning worker 5 with 1 pkgs
Spawning worker 6 with 1 pkgs
Spawning worker 7 with 0 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete
- Configure the web server
- Apache httpd is used here: move the cdh5.10.0 and cm5.10.0 directories into /var/www/html so that the rpm packages can be accessed over HTTP.
[ec2-user@ip-172-31-2-159 ~]$ sudo mv cdh5.10.0/ cm5.10.0/ /var/www/html/
- Create the Cloudera Manager repo source
[ec2-user@ip-172-31-2-159 ~]$ sudo vim /etc/yum.repos.d/cm.repo
[cmrepo]
name = cm_repo
baseurl = http://172.31.2.159/cm5.10.0
enabled = true
gpgcheck = false
[ec2-user@ip-172-31-2-159 yum.repos.d]$ sudo yum repolist
- Install the JDK, which also verifies that the repo works
[ec2-user@ip-172-31-2-159 ~]$ sudo yum -y install oracle-j2sdk1.7-1.7.0+update67-1
2.3.2 Install Cloudera Manager Server
- Install Cloudera Manager Server through yum
[ec2-user@ip-172-31-2-159 ~]$ sudo yum -y install cloudera-manager-server
- Initialize the database
[ec2-user@ip-172-31-2-159 ~]$ sudo /usr/share/cmf/schema/scm_prepare_database.sh mysql cm cm password
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.7.0_67-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
- Start Cloudera Manager Server
[ec2-user@ip-172-31-2-159 ~]$ sudo systemctl start cloudera-scm-server
- Check that port 7180 is listening
[ec2-user@ip-172-31-2-159 ~]$ sudo netstat -lnpt | grep 7180
tcp 0 0 0.0.0.0:7180 0.0.0.0:* LISTEN 6890/java
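2.4 CDH installation
2.4.1 CDH cluster installation wizard
- Log in to CM with admin/admin
- Accept the license agreement and click "Continue"
- Select the 60-day trial and click "Continue"
- Click "Continue"
- Enter the host IP addresses or hostnames, click "Search", and once the hosts are found click "Continue"
- Click "Continue"
- Choose to install using parcels, click "More Options", click "-" to remove all other URLs, enter http://172.31.2.159/cm5.10.0/, and click "Save Changes"
- Select the custom repository and enter the HTTP address of the cm repo
- Click "Continue" to proceed to the JDK installation step
- Click "Continue" to proceed to the next step, keeping the default multi-user mode
- Click "Continue" to proceed to configuring the SSH account and password
- Click "Continue" to proceed to installing the Cloudera Manager components on each node
- Click "Continue" to proceed to installing CDH on each node
- Click "Continue" to proceed to the host inspection, and make sure every check passes
- Click "Finish" to enter the service installation wizard.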
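2.4.2 Cluster setup wizard
- Select the services to install
- Click "Continue" to proceed to cluster role assignment
- Click "Continue" to proceed to the next step and test the database connection
- Once the test succeeds, click "Continue" to proceed to the directory settings. The default directories are used here; change them to suit your environment
- Click "Continue" to start each service
- Installation succeeded
- After a successful installation, the CM home page is displayed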