# Installing a Hive Cluster (Rocky Linux 9.2)

## Goals

- The Hive cluster uses Kerberos authentication.
- The cluster consists of 2 HiveServer2 instances and 2 MetaStore instances.
- The Hive version is 3.1.3.

## Prerequisites

- Hadoop has been installed following *Installing a Hadoop Cluster (Rocky Linux 9.2)*.
- MySQL has been installed.
## Preparing the User

> **Warning**
> The commands below are executed on the host ipa-server.
- Authenticate with Kerberos.
- Create the user hive.
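The two steps above might look like the following on ipa-server (a sketch; the admin principal name and the user's first/last name fields are assumptions):

```shell
# Obtain a Kerberos ticket as an IPA administrator (principal name is an assumption)
kinit admin

# Create the hive user; ipa user-add requires --first and --last
ipa user-add hive --first=Hive --last=Service
```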
## Sharing the Hive Configuration Directory

> **Warning**
> The commands below are executed on the host ipa-server.
- Edit the NFS configuration file /etc/exports so that the /etc/hive directory on ipa-server is shared with the other hosts.
- Create the Hive configuration directory and set its owner and group.
- Reload the NFS configuration.
- Create the mount configuration on the IPA server.
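The NFS steps above can be sketched as follows (the client subnet and export options are assumptions; adjust them to your network):

```shell
# Share /etc/hive with the cluster subnet (subnet and options are assumptions)
echo '/etc/hive 192.168.0.0/24(rw,sync,no_root_squash)' >> /etc/exports

# Create the directory and hand it to the hive user
mkdir -p /etc/hive
chown hive:hive /etc/hive

# Re-export all directories without restarting the NFS server
exportfs -ra
```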
## Installing Hive

> **Warning**
> The commands below must be executed on hosts machine1 through machine5.
- Download and extract the distribution, then create a symlink (alias).
- Add environment variables.
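For example, the variables could go in a profile script (both paths are assumptions; adjust them to where you extracted Hive):

```shell
# /etc/profile.d/hive.sh -- Hive environment (paths are assumptions)
export HIVE_HOME=/opt/hive
export HIVE_CONF_DIR=/etc/hive
export PATH="$PATH:$HIVE_HOME/bin"

echo "$HIVE_HOME"   # prints /opt/hive
```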
- Download the MySQL JDBC driver into the $HIVE_HOME/lib directory.
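A sketch of fetching the driver from Maven Central (the Connector/J version is an assumption; match it to your MySQL server):

```shell
# Download MySQL Connector/J into Hive's classpath (version is an assumption)
curl -L -o "$HIVE_HOME/lib/mysql-connector-j-8.0.33.jar" \
  "https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.0.33/mysql-connector-j-8.0.33.jar"
```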
- Create the data directory and set its owner and group.
- Create the configuration directory and set its owner and group.
- Create the log directory and set its owner and group.
- Mount the shared /etc/hive directory from ipa-server.
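Mounting the shared directory might look like this (the ipa-server FQDN and the fstab options are assumptions):

```shell
# Mount ipa-server's exported /etc/hive over the local /etc/hive
mkdir -p /etc/hive
mount -t nfs ipa-server.xuwangwei.test:/etc/hive /etc/hive

# To persist the mount across reboots, an /etc/fstab entry such as:
# ipa-server.xuwangwei.test:/etc/hive  /etc/hive  nfs  defaults,_netdev  0 0
```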
## Preparing Kerberos Principals and Keytab Files

> **Warning**
> The commands below are executed on the host ipa-server.
- Create the principals.

```shell
ipa service-add hive/machine1.xuwangwei.test@XUWANGWEI.TEST --requires-pre-auth=true
ipa service-add hive/machine2.xuwangwei.test@XUWANGWEI.TEST --requires-pre-auth=true
ipa service-add hive/machine3.xuwangwei.test@XUWANGWEI.TEST --requires-pre-auth=true
ipa service-add hive/machine4.xuwangwei.test@XUWANGWEI.TEST --requires-pre-auth=true
ipa service-add hive/machine5.xuwangwei.test@XUWANGWEI.TEST --requires-pre-auth=true
```
- Export the principals' keys into the keytab.

```shell
ipa-getkeytab --principal=HTTP/machine1.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=HTTP/machine2.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=HTTP/machine3.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=HTTP/machine4.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=HTTP/machine5.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=hive/machine1.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=hive/machine2.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=hive/machine3.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=hive/machine4.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=hive/machine5.xuwangwei.test@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
ipa-getkeytab --principal=hive@XUWANGWEI.TEST --keytab=/etc/hive/hive.keytab --enctypes=aes256-sha1,aes128-sha1
```
- Change the keytab's owner, group, and permissions.
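A sketch of locking the keytab down to the hive user (the exact mode is a judgment call, but the keytab must not be world-readable):

```shell
chown hive:hive /etc/hive/hive.keytab
chmod 400 /etc/hive/hive.keytab
```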
## Configuring Hive

> **Warning**
> The commands below are executed on the host machine1.
- In MySQL, create the user hive and the database metastore_db. This information is referenced in the configuration file hive-site.xml.
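The MySQL side might be prepared like this (the password and the `'%'` host wildcard are placeholders; the user and database names are the ones referenced in hive-site.xml):

```shell
mysql -u root -p <<'SQL'
-- Database referenced by javax.jdo.option.ConnectionURL
CREATE DATABASE metastore_db CHARACTER SET utf8mb4;
-- Account referenced by javax.jdo.option.ConnectionUserName
CREATE USER 'hive'@'%' IDENTIFIED BY 'CHANGE_ME';
GRANT ALL PRIVILEGES ON metastore_db.* TO 'hive'@'%';
FLUSH PRIVILEGES;
SQL
```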
- Store the password of the metastore_db database in /etc/hive/hive.jceks.

```shell
[root@machine1 ~]# su - hive -c "hadoop credential create javax.jdo.option.ConnectionPassword -provider jceks://file/etc/hive/hive.jceks"
WARNING: You have accepted the use of the default provider password
by not configuring a password in one of the two following locations:
    * In the environment variable HADOOP_CREDSTORE_PASSWORD
    * In a file referred to by the configuration entry
      hadoop.security.credstore.java-keystore-provider.password-file.
Please review the documentation regarding provider passwords in the keystore passwords section of the Credential Provider API
Continuing with the default provider password.

Enter alias password:
Enter alias password again:
javax.jdo.option.ConnectionPassword has been successfully created.
Provider jceks://file/etc/hive/hive.jceks was updated.
```
- Create hive-site.xml.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<configuration>
  <property>
    <name>system:user.name</name>
    <value>${user.name}</value>
  </property>
  <property>
    <name>system:java.io.tmpdir</name>
    <value>/var/lib/hive</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://machine4.xuwangwei.test:9083,thrift://machine5.xuwangwei.test:9083</value>
  </property>
  <property>
    <name>hive.metastore.kerberos.keytab.file</name>
    <value>/etc/hive/hive.keytab</value>
  </property>
  <property>
    <name>hive.metastore.kerberos.principal</name>
    <value>hive/_HOST@XUWANGWEI.TEST</value>
  </property>
  <property>
    <name>hive.metastore.client.kerberos.principal</name>
    <value>hive/_HOST@XUWANGWEI.TEST</value>
  </property>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.metastore.db.type</name>
    <value>mysql</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://mysql.xuwangwei.test:3306/metastore_db?characterEncoding=UTF-8&amp;characterSetResults=UTF-8&amp;connectionTimeZone=Asia/Shanghai</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>hadoop.security.credential.provider.path</name>
    <value>jceks://file/etc/hive/hive.jceks</value>
  </property>
  <property>
    <name>hive.metastore.port</name>
    <value>9083</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification.record.version</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
    <description>The port the HiveServer2 WebUI will listen on. This can be set to 0 or a negative integer to disable the web UI</description>
  </property>
  <property>
    <name>hive.server2.webui.use.spnego</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.server2.webui.spnego.keytab</name>
    <value>/etc/hive/hive.keytab</value>
  </property>
  <property>
    <name>hive.server2.webui.spnego.principal</name>
    <value>HTTP/_HOST@XUWANGWEI.TEST</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hive.server2.thrift.sasl.qop</name>
    <value>auth-conf</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
    <description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.keytab</name>
    <value>/etc/hive/hive.keytab</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.principal</name>
    <value>hive/_HOST@XUWANGWEI.TEST</value>
  </property>
  <property>
    <name>hive.server2.authentication.client.kerberos.principal</name>
    <value>hive/_HOST@XUWANGWEI.TEST</value>
  </property>
  <property>
    <name>hive.server2.authentication.spnego.keytab</name>
    <value>/etc/hive/hive.keytab</value>
  </property>
  <property>
    <name>hive.server2.authentication.spnego.principal</name>
    <value>HTTP/_HOST@XUWANGWEI.TEST</value>
  </property>
  <property>
    <name>hive.compute.query.using.stats</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.exec.parallel</name>
    <value>true</value>
    <description>Whether to execute jobs in parallel</description>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>mr</value>
    <description>
      Expects one of [mr, tez, spark].
      Chooses execution engine. Options are: mr (Map reduce, default), tez, spark. While MR
      remains the default engine for historical reasons, it is itself a historical engine
      and is deprecated in Hive 2 line. It may be removed without further warning.
    </description>
  </property>
  <property>
    <name>hive.server2.support.dynamic.service.discovery</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.zookeeper.namespace</name>
    <value>hiveserver2</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>machine1.xuwangwei.test:2181,machine2.xuwangwei.test:2181,machine3.xuwangwei.test:2181,machine4.xuwangwei.test:2181,machine5.xuwangwei.test:2181</value>
  </property>
</configuration>
```
- Modify the configuration file $HADOOP_CONF_DIR/core-site.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://xuwangwei</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/lib/hadoop</value>
  </property>
  <property>
    <name>hadoop.zk.address</name>
    <value>machine1.xuwangwei.test:2181,machine2.xuwangwei.test:2181,machine3.xuwangwei.test:2181,machine4.xuwangwei.test:2181,machine5.xuwangwei.test:2181</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>machine1.xuwangwei.test:2181,machine2.xuwangwei.test:2181,machine3.xuwangwei.test:2181,machine4.xuwangwei.test:2181,machine5.xuwangwei.test:2181</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hive.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hive.groups</name>
    <value>*</value>
  </property>
</configuration>
```
- Authenticate with Kerberos using the hdfs@XUWANGWEI.TEST account.
- Create the directories on HDFS.
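The warehouse path configured above (hive.metastore.warehouse.dir) needs to exist on HDFS; a sketch, run with the hdfs ticket obtained in the previous step (the ownership and mode are assumptions):

```shell
# Create the warehouse directory and hand it to the hive user
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -chown -R hive:hive /user/hive
```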
- Initialize the database used by MetaStore.
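Hive ships a schematool that creates the MetaStore tables using the JDBC settings in hive-site.xml:

```shell
su - hive -c "schematool -dbType mysql -initSchema"
```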
- Apply the configuration just changed in core-site.xml.
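The hadoop.proxyuser.hive.* settings can be pushed to the running daemons without a restart:

```shell
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration
```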
## Starting Hive
- Start MetaStore.
- Start HiveServer2.
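A sketch of starting both services as the hive user (the log paths, and which of machine1~machine5 run which service, are assumptions; hive.metastore.uris above points at machine4 and machine5 for MetaStore):

```shell
# On the MetaStore hosts
su - hive -c "nohup hive --service metastore > /var/log/hive/metastore.out 2>&1 &"

# On the HiveServer2 hosts
su - hive -c "nohup hive --service hiveserver2 > /var/log/hive/hiveserver2.out 2>&1 &"
```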
## Connecting to Hive with Beeline
- Add environment variables for the hive user.
- Authenticate with Kerberos using the hive@XUWANGWEI.TEST account.
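With the shared keytab in place, the ticket can be obtained non-interactively:

```shell
su - hive -c "kinit -kt /etc/hive/hive.keytab hive@XUWANGWEI.TEST"
su - hive -c "klist"   # verify the ticket was issued
```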
- Connect to the Hive cluster with Beeline.

```shell
su - hive -c "beeline"
beeline> !connect jdbc:hive2://machine1.xuwangwei.test:2181,machine2.xuwangwei.test:2181,machine3.xuwangwei.test:2181,machine4.xuwangwei.test:2181,machine5.xuwangwei.test:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;saslQop=auth-conf;auth=KERBEROS;principal=hive/_HOST@XUWANGWEI.TEST;
```
- On a successful connection, the log looks like this:

```shell
Connecting to jdbc:hive2://machine1.xuwangwei.test:2181,machine2.xuwangwei.test:2181,machine3.xuwangwei.test:2181,machine4.xuwangwei.test:2181,machine5.xuwangwei.test:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;saslQop=auth-conf;auth=KERBEROS;principal=hive/_HOST@XUWANGWEI.TEST;
24/01/03 15:12:56 [main]: INFO jdbc.HiveConnection: Connected to machine4.xuwangwei.test:10000
Connected to: Apache Hive (version 3.1.3)
Driver: Hive JDBC (version 3.1.3)
Transaction isolation: TRANSACTION_REPEATABLE_READ
```