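Hive version: this guide installs version 2.1.1.
About Hive: Hive lets you query data stored in a distributed file system such as HDFS using SQL.
Installing Hive
- Download the installation package
Download location: http://mirror.nexcess.net/apache/hive/hive-2.1.1/ (if the mirror no longer hosts this release, older versions are kept at https://archive.apache.org/dist/hive/)
Choose the binary package, apache-hive-2.1.1-bin.tar.gz.
Copy the package to /Users/****/apps/, the directory where I install all big-data software on my own machine.
Extract the package: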
tar zxvf apache-hive-2.1.1-bin.tar.gz
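Set HIVE_HOME: open ~/.bash_profile and add an export for HIVE_HOME that points at the extracted directory. Adding $HIVE_HOME/bin to PATH as well lets the hive and beeline commands used later resolve without a full path.
vim ~/.bash_profile
Reload the profile so the change takes effect immediately: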
source ~/.bash_profile
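The original does not show what goes into ~/.bash_profile. A minimal sketch of the two lines, assuming the archive was extracted under /Users/****/apps/ as described above (the `****` is the elided user name and must be replaced with your own):

```shell
# Lines to append to ~/.bash_profile.
# The path is quoted so the elided "****" is not treated as a glob;
# replace it with the real directory on your machine.
export HIVE_HOME="/Users/****/apps/apache-hive-2.1.1-bin"
# Put the Hive launcher scripts (hive, beeline, schematool) on PATH.
export PATH="$HIVE_HOME/bin:$PATH"
```

After `source ~/.bash_profile`, running `echo $HIVE_HOME` should print the directory you configured.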
- Edit the configuration file
We use MySQL as the Hive metastore database, so start by creating hive-site.xml from the template:
cd $HIVE_HOME/conf
cp hive-default.xml.template hive-site.xml
Then replace the entire contents of hive-site.xml with the following configuration:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
  <!-- Run queries as the HiveServer2 process user rather than the connecting user -->
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>
  <!-- Default location for managed (non-external) tables -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/tmp/hive/warehouse</value>
  </property>
  <!-- Ignored by current Hive releases, which log a deprecation warning for it; safe to remove -->
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <!-- MySQL connection used by the metastore -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive123456</value>
  </property>
</configuration>
- Copy the MySQL JDBC connector jar into Hive's lib directory:
/Users/****/apps/apache-hive-2.1.1-bin/lib
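The copy step can be sketched as a small helper that refuses to run when the jar is missing, which is an easy mistake to overlook until schematool fails with a driver-not-found error. The function name `install_jdbc_jar` and the jar file name in the usage comment are hypothetical; use whichever MySQL Connector/J jar you actually downloaded.

```shell
# Copy a JDBC driver jar into a Hive lib directory.
# $1: path to the connector jar, $2: target lib directory.
install_jdbc_jar() {
  jar="$1"
  lib_dir="$2"
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi
  if [ ! -d "$lib_dir" ]; then
    echo "lib directory not found: $lib_dir" >&2
    return 1
  fi
  cp "$jar" "$lib_dir"/
}

# Usage (hypothetical jar name):
#   install_jdbc_jar ~/Downloads/mysql-connector-java-<version>.jar "$HIVE_HOME/lib"
```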
- Initialize the metastore schema
$HIVE_HOME/bin/schematool -dbType mysql -initSchema
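To confirm that the initialization worked, schematool can report the schema version it finds in the metastore database through its `-info` option. The wrapper below is only a sketch; the function name `check_metastore_schema` is made up for illustration, and it assumes HIVE_HOME is set as in the earlier step.

```shell
# Print metastore connection and schema version details,
# or fail with a hint when schematool cannot be found.
check_metastore_schema() {
  tool="${HIVE_HOME:-}/bin/schematool"
  if [ ! -x "$tool" ]; then
    echo "schematool not found at $tool; is HIVE_HOME set?" >&2
    return 1
  fi
  "$tool" -dbType mysql -info
}

# Usage:
#   check_metastore_schema
```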
- Use the Hive CLI
hive
Create a database named pptb:
create database pptb;
Create the table ad_log in the pptb database:
use pptb;
CREATE EXTERNAL TABLE ad_log (
  time         timestamp comment 'access time',
  user_id      string    comment 'user id',
  ad_slot_id   int       comment 'ad slot id',
  event_type   int       comment 'log type 1: ad request 2: ad impression 3: ad click',
  creative_id  int       comment 'ad creative id',
  url          string    comment 'URL of the page the user is viewing',
  referer_url  string    comment 'referrer URL',
  ip           string    comment 'user IP address',
  city_id      int       comment 'city id'
)
PARTITIONED BY (dt string, hour string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LOCATION 'hdfs://localhost:9000/ad_log/';
Inspect the table that was just created:
desc ad_log;
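Because ad_log is an external table partitioned by dt and hour, files placed under its location are not visible to queries until the matching partition is registered in the metastore. A minimal sketch of generating that statement from the shell follows; the function name `add_partition_sql` and the sample date and hour are illustrative, not part of the original setup. With no LOCATION clause, Hive maps the partition to `<table location>/dt=<dt>/hour=<hour>`.

```shell
# Print an ALTER TABLE statement that registers one (dt, hour) partition
# of pptb.ad_log. $1: dt value, $2: hour value.
add_partition_sql() {
  dt="$1"
  hour="$2"
  printf "ALTER TABLE pptb.ad_log ADD IF NOT EXISTS PARTITION (dt='%s', hour='%s');\n" "$dt" "$hour"
}

# Usage (requires hive on PATH; sample values are hypothetical):
#   hive -e "$(add_partition_sql 2017-01-01 00)"
```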
- Use HiveServer2
Start the HiveServer2 service in the background:
$HIVE_HOME/bin/hiveserver2 &
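HiveServer2 can take a while to start listening, so a client that connects right after the command above may be refused. One way to wait is to poll the port, sketched here with bash's `/dev/tcp` pseudo-device (this assumes bash; the function name `wait_for_port` and the retry defaults are arbitrary choices, and port 10000 is the one used in the connection URL in this guide).

```shell
# Poll host:port until a TCP connection succeeds or attempts run out.
# $1: host, $2: port, $3: max attempts (default 30), $4: delay in seconds (default 2).
wait_for_port() {
  host="$1"
  port="$2"
  attempts="${3:-30}"
  delay="${4:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # The subshell opens and immediately discards the connection.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_for_port localhost 10000 && echo "HiveServer2 is accepting connections"
```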
- Connect to HiveServer2 with Beeline
The Hive project has deprecated the Hive CLI. The recommended approach is to start HiveServer2 and run queries through the Beeline CLI:
beeline -u jdbc:hive2://localhost:10000/default -n user_name -p password
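Beeline can also run a statement non-interactively with its `-e` option, which is convenient for a quick smoke test. The sketch below builds the JDBC URL from its parts; the function name `hive2_url` is hypothetical, and the defaults mirror the host, port, and database in the command above.

```shell
# Print a HiveServer2 JDBC URL.
# $1: host (default localhost), $2: port (default 10000), $3: database (default "default").
hive2_url() {
  printf 'jdbc:hive2://%s:%s/%s\n' "${1:-localhost}" "${2:-10000}" "${3:-default}"
}

# Usage (user_name and password are placeholders, as in the command above):
#   beeline -u "$(hive2_url localhost 10000 pptb)" -n user_name -p password -e 'SHOW TABLES;'
```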
The next article will show how to access Hive through a web interface using Hue.