[root@master software]# mysql -uroot -p Enter password: Welcome to the MySQL monitor. Commands endwith ; or \g. Your MySQL connection id is3 Server version: 5.7.44 MySQL Community Server (GPL)
Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;'or'\h'for help. Type '\c'to clear the current input statement.
mysql> exit; Bye [root@master software]# mysql -uroot -p Enter password: Welcome to the MySQL monitor. Commands endwith ; or \g. Your MySQL connection id is4 Server version: 5.7.44 MySQL Community Server (GPL)
Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;'or'\h'for help. Type '\c'to clear the current input statement.
[root@master software]# hive --version SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/root/software/hive/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/root/software/hadoop/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Hive 3.1.3 Git git://MacBook-Pro.fios-router.home/Users/ngangam/commit/hive -r 4df4d75bf1e16fe0af75aad0b4179c34c07fc975 Compiled by ngangam on Sun Apr 3 16:58:16 EDT 2022 From source with checksum 5da234766db5dfbe3e92926c9bbab2af
[root@master conf]# flume-ng version Flume 1.11.0 Source code repository: https://git.apache.org/repos/asf/flume.git Revision: 1a15927e594fd0d05a59d804b90a9c31ec93f5e1 Compiled by rgoers on Sun Oct 16 14:44:15 MST 2022 From source with checksum bbbca682177262aac3a89defde369a37
(二)任务二:数据库配置维护
1.子任务一:数据库配置
在 Hive 中创建一个名为 comm 的数据库,如果数据库已经存在,则不进行创建。
1 2 3 4 5 6 7
[root@master conf]# hive hive> create database if not exist comm; FAILED: ParseException line 1:23 missing KW_EXISTS at 'exist' near '<EOF>' line 1:29 extraneous input 'comm' expecting EOF near '<EOF>' hive> create database if not exists comm; OK Time taken: 0.297 seconds
hive> use comm; OK Time taken: 0.042 seconds hive> CREATE TABLE IF NOT EXISTS ods_behavior_log ( > line STRING > ) > PARTITIONED BY (dt STRING) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '\t' > STORED AS TEXTFILE; OK Time taken: 0.244 seconds
(2) 使 用 load data 子 句 将 本 地/root/eduhq/data/app_log/behavior目录下的每个数据文件依次加载到外部表ods_behavior_log的对应分区中,按照日志文件对应日期定义静态分区(例如:dt=’2023-01-01’)
hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/excel_log/behavior2023-01-01.xlsx' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-01'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 1.831 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/excel_log/behavior2023-01-02.xlsx' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-02'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.628 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/excel_log/behavior2023-01-03.xlsx' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-03'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.491 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/excel_log/behavior2023-01-04.xlsx' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-04'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.506 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/excel_log/behavior2023-01-05.xlsx' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-05'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.465 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/excel_log/behavior2023-01-06.xlsx' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-06'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.44 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/excel_log/behavior2023-01-07.xlsx' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-07'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.423 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/txt_log/behavior2023-01-01.log' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-01'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.432 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/txt_log/behavior2023-01-02.log' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-02'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.397 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/txt_log/behavior2023-01-03.log' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-03'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.425 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/txt_log/behavior2023-01-04.log' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-04'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.43 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/txt_log/behavior2023-01-05.log' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-05'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.395 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/txt_log/behavior2023-01-06.log' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-06'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.422 seconds hive> LOAD DATA LOCAL INPATH '/root/eduhq/data/app_log/behavior/txt_log/behavior2023-01-07.log' INTO TABLE ods_behavior_log PARTITION (dt='2023-01-07'); Loading data to table comm.ods_behavior_log partition (dt=2023-01-01) OK Time taken: 0.773 seconds
# 统计数据总行数 hive> SELECT COUNT(*) FROM ods_behavior_log; Query ID = root_20241028190258_2de0fddd-343e-4256-b248-f252f3400a9b Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Starting Job = job_1730102747884_0003, Tracking URL = http://master:8089/proxy/application_1730102747884_0003/ Kill Command = /root/software/hadoop/bin/mapred job -kill job_1730102747884_0003 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2024-10-28 19:03:31,071 Stage-1 map = 0%, reduce = 0% 2024-10-28 19:03:42,782 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.55 sec 2024-10-28 19:04:43,115 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.55 sec 2024-10-28 19:05:43,251 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.55 sec 2024-10-28 19:06:44,246 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.55 sec 2024-10-28 19:06:53,529 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.32 sec MapReduce Total cumulative CPU time: 4 seconds 320 msec Ended Job = job_1730102747884_0003 MapReduce Jobs Launched: Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.32 sec HDFS Read: 37824491 HDFS Write: 106 SUCCESS Total MapReduce CPU Time Spent: 4 seconds 320 msec OK 198413 Time taken: 235.981 seconds, Fetched: 1 row(s)