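Table of Contents

  • Chapter 9: Data Warehouse Development, the DWD Layer
    • 9.1 Trade Domain: Add-to-Cart Transaction Fact Table
    • 9.2 Trade Domain: Order Placement Transaction Fact Table
    • 9.3 Trade Domain: Order Cancellation Transaction Fact Table
    • 9.4 Trade Domain: Successful Payment Transaction Fact Table
    • 9.5 Trade Domain: Order Refund Transaction Fact Table
    • 9.6 Trade Domain: Successful Refund Payment Transaction Fact Table
    • 9.7 Trade Domain: Cart Periodic Snapshot Fact Table
    • 9.8 Tool Domain: Coupon Claim Transaction Fact Table
    • 9.9 Tool Domain: Coupon Use (Order) Transaction Fact Table
    • 9.10 Tool Domain: Coupon Use (Payment) Transaction Fact Table
    • 9.11 Interaction Domain: Product Favorite Transaction Fact Table
    • 9.12 Interaction Domain: Review Transaction Fact Table
    • 9.13 Traffic Domain: Page View Transaction Fact Table
    • 9.14 Traffic Domain: App Start Transaction Fact Table
    • 9.15 Traffic Domain: Action Transaction Fact Table
    • 9.16 Traffic Domain: Exposure Transaction Fact Table
    • 9.17 Traffic Domain: Error Transaction Fact Table
    • 9.18 User Domain: User Registration Transaction Fact Table
    • 9.19 User Domain: User Login Transaction Fact Table
    • 9.20 Data Loading Scripts
      • 9.20.1 First-Day Loading Script
      • 9.20.2 Daily Loading Script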


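Chapter 9: Data Warehouse Development, the DWD Layer

DWD layer design points:
(1) The DWD layer is designed according to dimensional modeling theory; it stores the fact tables of the dimensional model.
(2) DWD data is stored as ORC (columnar storage) with Snappy compression.
(3) DWD tables are named dwd_<data domain>_<table name>_<inc|full>, where the suffix marks a single-partition incremental (inc) or full (full) table.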
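
The naming rule in point (3) can be checked mechanically. The following is a minimal sketch, not part of the original project: the function name parse_dwd_name is hypothetical, and the list of domains is an assumption taken from the section titles of this chapter.

```python
import re

# Assumption: the data domains are the five that appear in this chapter's headings.
DOMAINS = ("trade", "tool", "interaction", "traffic", "user")

# dwd_<data domain>_<table name>_<inc|full>
DWD_NAME = re.compile(r"^dwd_(%s)_([a-z0-9_]+)_(inc|full)$" % "|".join(DOMAINS))


def parse_dwd_name(table_name):
    """Split a DWD table name into (domain, name, load_type); return None if it does not match."""
    match = DWD_NAME.match(table_name)
    return match.groups() if match else None


if __name__ == "__main__":
    print(parse_dwd_name("dwd_trade_cart_add_inc"))
    print(parse_dwd_name("ods_cart_info_inc"))
```

A check like this could run before a CREATE TABLE script is submitted, so that misnamed tables are caught early.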

9.1 Trade Domain: Add-to-Cart Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_trade_cart_add_inc;
CREATE EXTERNAL TABLE dwd_trade_cart_add_inc
(
    `id`               STRING COMMENT 'record id',
    `user_id`          STRING COMMENT 'user id',
    `sku_id`           STRING COMMENT 'SKU id',
    `date_id`          STRING COMMENT 'date id',
    `create_time`      STRING COMMENT 'add-to-cart time',
    `source_id`        STRING COMMENT 'source type id',
    `source_type_code` STRING COMMENT 'source type code',
    `source_type_name` STRING COMMENT 'source type name',
    `sku_num`          BIGINT COMMENT 'number of items added to the cart'
) COMMENT 'Trade domain add-to-cart transaction fact table'
    PARTITIONED BY (`dt` STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_trade_cart_add_inc/'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Partition planning

3) Data loading
(1) Data flow

(2) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_trade_cart_add_inc partition (dt)
select
    id, user_id, sku_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    source_id, source_type, dic.dic_name, sku_num,
    -- last column feeds the dynamic partition dt
    date_format(create_time, 'yyyy-MM-dd')
from
(
    select
        data.id, data.user_id, data.sku_id, data.create_time,
        data.source_id, data.source_type, data.sku_num
    from ods_cart_info_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
)ci
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code='24'
)dic
on ci.source_type=dic.dic_code;
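
The first-day load writes to partition (dt) without a fixed value, so each historical row lands in the partition named by the date part of its create_time. The sketch below mimics that routing in plain Python; the function name and the sample rows are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def route_to_partitions(rows):
    """Group rows by the date part of create_time, which plays the role of the dynamic dt value."""
    partitions = defaultdict(list)
    for row in rows:
        dt = row["create_time"][:10]  # 'yyyy-MM-dd HH:mm:ss' -> 'yyyy-MM-dd'
        partitions[dt].append(row)
    return dict(partitions)


if __name__ == "__main__":
    sample = [
        {"id": "1", "create_time": "2020-06-12 09:30:00"},
        {"id": "2", "create_time": "2020-06-14 21:05:10"},
        {"id": "3", "create_time": "2020-06-12 23:59:59"},
    ]
    for dt, group in sorted(route_to_partitions(sample).items()):
        print(dt, [r["id"] for r in group])
```

This is why the bootstrap snapshot of one day can populate many dt partitions at once, while the daily load writes to a single fixed partition.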

(3) Daily load

insert overwrite table dwd_trade_cart_add_inc partition(dt='2020-06-15')
select
    id, user_id, sku_id, date_id, create_time,
    source_id, source_type_code, source_type_name, sku_num
from
(
    select
        data.id,
        data.user_id,
        data.sku_id,
        date_format(from_utc_timestamp(ts*1000,'GMT+8'),'yyyy-MM-dd') date_id,
        date_format(from_utc_timestamp(ts*1000,'GMT+8'),'yyyy-MM-dd HH:mm:ss') create_time,
        data.source_id,
        data.source_type source_type_code,
        -- an insert adds sku_num items; an update adds only the increase
        if(type='insert',data.sku_num,data.sku_num-old['sku_num']) sku_num
    from ods_cart_info_inc
    where dt='2020-06-15'
    and (type='insert'
      or (type='update' and old['sku_num'] is not null and data.sku_num>cast(old['sku_num'] as int)))
)cart
left join
(
    select dic_code, dic_name source_type_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code='24'
)dic
on cart.source_type_code=dic.dic_code;
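
The daily filter counts two kinds of change records as an add-to-cart event: a new cart row, and an update that raises sku_num. The following sketch restates that rule in Python. It assumes each record is a dict with the type, data and old fields used in the query; the function name is hypothetical.

```python
def cart_add_quantity(change):
    """Return how many items one change record adds to the cart, or None if it is not an add."""
    new_num = int(change["data"]["sku_num"])
    if change["type"] == "insert":
        return new_num
    if change["type"] == "update":
        # `old` holds only the columns that changed, with their previous values.
        old_num = (change.get("old") or {}).get("sku_num")
        if old_num is not None and new_num > int(old_num):
            return new_num - int(old_num)
    return None


if __name__ == "__main__":
    print(cart_add_quantity({"type": "insert", "data": {"sku_num": 2}}))
    print(cart_add_quantity({"type": "update", "data": {"sku_num": 5}, "old": {"sku_num": "3"}}))
    print(cart_add_quantity({"type": "update", "data": {"sku_num": 1}, "old": {"sku_num": "3"}}))
```

Updates that lower the quantity, or that touch other columns only, return None, which corresponds to rows the where clause filters out.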

9.2 Trade Domain: Order Placement Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_trade_order_detail_inc;
CREATE EXTERNAL TABLE dwd_trade_order_detail_inc
(
    `id`                    STRING COMMENT 'record id',
    `order_id`              STRING COMMENT 'order id',
    `user_id`               STRING COMMENT 'user id',
    `sku_id`                STRING COMMENT 'SKU id',
    `province_id`           STRING COMMENT 'province id',
    `activity_id`           STRING COMMENT 'activity id',
    `activity_rule_id`      STRING COMMENT 'activity rule id',
    `coupon_id`             STRING COMMENT 'coupon id',
    `date_id`               STRING COMMENT 'order date id',
    `create_time`           STRING COMMENT 'order time',
    `source_id`             STRING COMMENT 'source id',
    `source_type_code`      STRING COMMENT 'source type code',
    `source_type_name`      STRING COMMENT 'source type name',
    `sku_num`               BIGINT COMMENT 'item quantity',
    `split_original_amount` DECIMAL(16, 2) COMMENT 'original amount',
    `split_activity_amount` DECIMAL(16, 2) COMMENT 'allocated activity discount',
    `split_coupon_amount`   DECIMAL(16, 2) COMMENT 'allocated coupon discount',
    `split_total_amount`    DECIMAL(16, 2) COMMENT 'allocated final amount'
) COMMENT 'Trade domain order detail transaction fact table'
    PARTITIONED BY (`dt` STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_trade_order_detail_inc/'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_trade_order_detail_inc partition (dt)
select
    od.id, order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    date_format(create_time, 'yyyy-MM-dd') date_id,
    create_time,
    source_id, source_type, dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount,
    date_format(create_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.order_id, data.sku_id, data.create_time,
        data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ods_order_detail_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) od
left join
(
    select data.id, data.user_id, data.province_id
    from ods_order_info_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ods_order_detail_activity_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ods_order_detail_coupon_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code='24'
)dic
on od.source_type=dic.dic_code;
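
Two details of the query above are easy to miss: the dictionary table is attached with a left join, so an unknown source_type keeps its row and simply gets a NULL name, and split_original_amount is derived as sku_num * order_price. The sketch below shows both in Python. The function name and the dictionary entry ('2401' mapped to 'search') are hypothetical sample values, not taken from the project data.

```python
from decimal import Decimal


def enrich_order_detail(detail, source_dic):
    """Mimic the left join on the dictionary table and the derived amount column."""
    return {
        "id": detail["id"],
        "source_type_code": detail["source_type"],
        # Left join semantics: a code missing from the dictionary yields None (NULL).
        "source_type_name": source_dic.get(detail["source_type"]),
        "split_original_amount": Decimal(detail["order_price"]) * detail["sku_num"],
    }


if __name__ == "__main__":
    dic = {"2401": "search"}  # hypothetical dictionary rows for parent_code = '24'
    print(enrich_order_detail(
        {"id": "1", "source_type": "2401", "order_price": "10.50", "sku_num": 2}, dic))
    print(enrich_order_detail(
        {"id": "2", "source_type": "9999", "order_price": "3.00", "sku_num": 1}, dic))
```

Using Decimal rather than float mirrors the DECIMAL(16, 2) column type and avoids rounding drift in the amounts.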

(2) Daily load

insert overwrite table dwd_trade_order_detail_inc partition (dt='2020-06-15')
select
    od.id, order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    date_id, create_time,
    source_id, source_type, dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount
from
(
    select
        data.id, data.order_id, data.sku_id,
        date_format(data.create_time, 'yyyy-MM-dd') date_id,
        data.create_time,
        data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ods_order_detail_inc
    where dt = '2020-06-15'
    and type = 'insert'
) od
left join
(
    select data.id, data.user_id, data.province_id
    from ods_order_info_inc
    where dt = '2020-06-15'
    and type = 'insert'
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ods_order_detail_activity_inc
    where dt = '2020-06-15'
    and type = 'insert'
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ods_order_detail_coupon_inc
    where dt = '2020-06-15'
    and type = 'insert'
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code='24'
)dic
on od.source_type=dic.dic_code;

9.3 Trade Domain: Order Cancellation Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_trade_cancel_detail_inc;
CREATE EXTERNAL TABLE dwd_trade_cancel_detail_inc
(
    `id`                    STRING COMMENT 'record id',
    `order_id`              STRING COMMENT 'order id',
    `user_id`               STRING COMMENT 'user id',
    `sku_id`                STRING COMMENT 'SKU id',
    `province_id`           STRING COMMENT 'province id',
    `activity_id`           STRING COMMENT 'activity id',
    `activity_rule_id`      STRING COMMENT 'activity rule id',
    `coupon_id`             STRING COMMENT 'coupon id',
    `date_id`               STRING COMMENT 'cancellation date id',
    `cancel_time`           STRING COMMENT 'cancellation time',
    `source_id`             STRING COMMENT 'source id',
    `source_type_code`      STRING COMMENT 'source type code',
    `source_type_name`      STRING COMMENT 'source type name',
    `sku_num`               BIGINT COMMENT 'item quantity',
    `split_original_amount` DECIMAL(16, 2) COMMENT 'original amount',
    `split_activity_amount` DECIMAL(16, 2) COMMENT 'allocated activity discount',
    `split_coupon_amount`   DECIMAL(16, 2) COMMENT 'allocated coupon discount',
    `split_total_amount`    DECIMAL(16, 2) COMMENT 'allocated final amount'
) COMMENT 'Trade domain order cancellation detail transaction fact table'
    PARTITIONED BY (`dt` STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_trade_cancel_detail_inc/'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_trade_cancel_detail_inc partition (dt)
select
    od.id, order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    date_format(cancel_time,'yyyy-MM-dd') date_id,
    cancel_time,
    source_id, source_type, dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount,
    date_format(cancel_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.order_id, data.sku_id,
        data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ods_order_detail_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) od
join
(
    select data.id, data.user_id, data.province_id, data.operate_time cancel_time
    from ods_order_info_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
    and data.order_status='1003'
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ods_order_detail_activity_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ods_order_detail_coupon_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code='24'
)dic
on od.source_type=dic.dic_code;

(2) Daily load

insert overwrite table dwd_trade_cancel_detail_inc partition (dt='2020-06-15')
select
    od.id, order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    date_format(cancel_time,'yyyy-MM-dd') date_id,
    cancel_time,
    source_id, source_type, dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount
from
(
    select
        data.id, data.order_id, data.sku_id,
        data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ods_order_detail_inc
    where (dt='2020-06-15' or dt=date_add('2020-06-15',-1))
    and (type = 'insert' or type= 'bootstrap-insert')
) od
join
(
    select data.id, data.user_id, data.province_id, data.operate_time cancel_time
    from ods_order_info_inc
    where dt = '2020-06-15'
    and type = 'update'
    and data.order_status='1003'
    -- keep only updates in which order_status itself changed
    and array_contains(map_keys(old),'order_status')
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ods_order_detail_activity_inc
    where (dt='2020-06-15' or dt=date_add('2020-06-15',-1))
    and (type = 'insert' or type= 'bootstrap-insert')
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ods_order_detail_coupon_inc
    where (dt='2020-06-15' or dt=date_add('2020-06-15',-1))
    and (type = 'insert' or type= 'bootstrap-insert')
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code='24'
)dic
on od.source_type=dic.dic_code;
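
The daily query recognizes a cancellation as an update whose new order_status is '1003' and whose old map contains the order_status key, meaning the status itself changed in that update. A minimal Python sketch of that predicate follows. The record layout (type, data, old) follows the query; the function name and sample values are hypothetical.

```python
def is_cancel_event(change):
    """True when an order_info change record is a transition into order_status '1003'."""
    old = change.get("old") or {}
    return (
        change["type"] == "update"
        and change["data"].get("order_status") == "1003"
        # Equivalent of array_contains(map_keys(old), 'order_status')
        and "order_status" in old
    )


if __name__ == "__main__":
    status_change = {"type": "update",
                     "data": {"id": "7", "order_status": "1003"},
                     "old": {"order_status": "1001"}}
    other_update = {"type": "update",
                    "data": {"id": "7", "order_status": "1003"},
                    "old": {"operate_time": "2020-06-15 10:00:00"}}
    print(is_cancel_event(status_change), is_cancel_event(other_update))
```

Without the check on old, any later edit to an already cancelled order would be counted as a second cancellation.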

9.4 Trade Domain: Successful Payment Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_trade_pay_detail_suc_inc;
CREATE EXTERNAL TABLE dwd_trade_pay_detail_suc_inc
(
    `id`                    STRING COMMENT 'record id',
    `order_id`              STRING COMMENT 'order id',
    `user_id`               STRING COMMENT 'user id',
    `sku_id`                STRING COMMENT 'SKU id',
    `province_id`           STRING COMMENT 'province id',
    `activity_id`           STRING COMMENT 'activity id',
    `activity_rule_id`      STRING COMMENT 'activity rule id',
    `coupon_id`             STRING COMMENT 'coupon id',
    `payment_type_code`     STRING COMMENT 'payment type code',
    `payment_type_name`     STRING COMMENT 'payment type name',
    `date_id`               STRING COMMENT 'payment date id',
    `callback_time`         STRING COMMENT 'payment success time',
    `source_id`             STRING COMMENT 'source id',
    `source_type_code`      STRING COMMENT 'source type code',
    `source_type_name`      STRING COMMENT 'source type name',
    `sku_num`               BIGINT COMMENT 'item quantity',
    `split_original_amount` DECIMAL(16, 2) COMMENT 'original amount payable',
    `split_activity_amount` DECIMAL(16, 2) COMMENT 'allocated activity discount on payment',
    `split_coupon_amount`   DECIMAL(16, 2) COMMENT 'allocated coupon discount on payment',
    `split_payment_amount`  DECIMAL(16, 2) COMMENT 'payment amount'
) COMMENT 'Trade domain successful payment transaction fact table'
    PARTITIONED BY (`dt` STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_trade_pay_detail_suc_inc/'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_trade_pay_detail_suc_inc partition (dt)
select
    od.id, od.order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    payment_type, pay_dic.dic_name,
    date_format(callback_time,'yyyy-MM-dd') date_id,
    callback_time,
    source_id, source_type, src_dic.dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount,
    date_format(callback_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.order_id, data.sku_id,
        data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ods_order_detail_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) od
join
(
    select data.user_id, data.order_id, data.payment_type, data.callback_time
    from ods_payment_info_inc
    where dt='2020-06-14'
    and type='bootstrap-insert'
    and data.payment_status='1602'
) pi
on od.order_id=pi.order_id
left join
(
    select data.id, data.province_id
    from ods_order_info_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ods_order_detail_activity_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ods_order_detail_coupon_inc
    where dt = '2020-06-14'
    and type = 'bootstrap-insert'
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code='11'
) pay_dic
on pi.payment_type=pay_dic.dic_code
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code='24'
)src_dic
on od.source_type=src_dic.dic_code;

(2) Daily load

insert overwrite table dwd_trade_pay_detail_suc_inc partition (dt='2020-06-15')
select
    od.id, od.order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    payment_type, pay_dic.dic_name,
    date_format(callback_time,'yyyy-MM-dd') date_id,
    callback_time,
    source_id, source_type, src_dic.dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount
from
(
    select
        data.id, data.order_id, data.sku_id,
        data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ods_order_detail_inc
    where (dt = '2020-06-15' or dt = date_add('2020-06-15',-1))
    and (type = 'insert' or type = 'bootstrap-insert')
) od
join
(
    select data.user_id, data.order_id, data.payment_type, data.callback_time
    from ods_payment_info_inc
    where dt='2020-06-15'
    and type='update'
    and array_contains(map_keys(old),'payment_status')
    and data.payment_status='1602'
) pi
on od.order_id=pi.order_id
left join
(
    select data.id, data.province_id
    from ods_order_info_inc
    where (dt = '2020-06-15' or dt = date_add('2020-06-15',-1))
    and (type = 'insert' or type = 'bootstrap-insert')
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ods_order_detail_activity_inc
    where (dt = '2020-06-15' or dt = date_add('2020-06-15',-1))
    and (type = 'insert' or type = 'bootstrap-insert')
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ods_order_detail_coupon_inc
    where (dt = '2020-06-15' or dt = date_add('2020-06-15',-1))
    and (type = 'insert' or type = 'bootstrap-insert')
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code='11'
) pay_dic
on pi.payment_type=pay_dic.dic_code
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code='24'
)src_dic
on od.source_type=src_dic.dic_code;
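
In the daily load, the payment events come from the load date only, while order details are read from the load date and the day before. The text does not state the reason; presumably an order placed on the previous day can still be paid on the load date. The sketch below illustrates this two-day lookup with plain Python structures. All names and sample rows are hypothetical.

```python
from datetime import date, timedelta


def lookup_partitions(load_date):
    """Partitions scanned for order details: the load date and the day before."""
    day = date.fromisoformat(load_date)
    return [load_date, (day - timedelta(days=1)).isoformat()]


def join_payments(payments, details_by_partition, load_date):
    """Inner-join successful payments to order details found within the two-day window."""
    details = [row
               for dt in lookup_partitions(load_date)
               for row in details_by_partition.get(dt, [])]
    return [{**detail, "callback_time": pay["callback_time"]}
            for pay in payments
            for detail in details
            if detail["order_id"] == pay["order_id"]]


if __name__ == "__main__":
    details = {
        "2020-06-14": [{"id": "d1", "order_id": "o1"}],
        "2020-06-15": [{"id": "d2", "order_id": "o2"}],
        "2020-06-13": [{"id": "d0", "order_id": "o0"}],
    }
    pays = [{"order_id": "o1", "callback_time": "2020-06-15 00:05:00"},
            {"order_id": "o0", "callback_time": "2020-06-15 08:00:00"}]
    print(join_payments(pays, details, "2020-06-15"))
```

In this sample, order o1 was placed on 2020-06-14 and is matched, whereas o0 falls outside the window and is dropped by the inner join.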

9.5 Trade Domain: Order Refund Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_trade_order_refund_inc;
CREATE EXTERNAL TABLE dwd_trade_order_refund_inc
(
    `id`                      STRING COMMENT 'record id',
    `user_id`                 STRING COMMENT 'user id',
    `order_id`                STRING COMMENT 'order id',
    `sku_id`                  STRING COMMENT 'SKU id',
    `province_id`             STRING COMMENT 'province id',
    `date_id`                 STRING COMMENT 'date id',
    `create_time`             STRING COMMENT 'refund request time',
    `refund_type_code`        STRING COMMENT 'refund type code',
    `refund_type_name`        STRING COMMENT 'refund type name',
    `refund_reason_type_code` STRING COMMENT 'refund reason type code',
    `refund_reason_type_name` STRING COMMENT 'refund reason type name',
    `refund_reason_txt`       STRING COMMENT 'refund reason description',
    `refund_num`              BIGINT COMMENT 'number of items refunded',
    `refund_amount`           DECIMAL(16, 2) COMMENT 'refund amount'
) COMMENT 'Trade domain order refund transaction fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_trade_order_refund_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_trade_order_refund_inc partition(dt)
select
    ri.id, user_id, order_id, sku_id, province_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    refund_type, type_dic.dic_name,
    refund_reason_type, reason_dic.dic_name,
    refund_reason_txt, refund_num, refund_amount,
    date_format(create_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.user_id, data.order_id, data.sku_id,
        data.refund_type, data.refund_num, data.refund_amount,
        data.refund_reason_type, data.refund_reason_txt, data.create_time
    from ods_order_refund_info_inc
    where dt='2020-06-14'
    and type='bootstrap-insert'
)ri
left join
(
    select data.id, data.province_id
    from ods_order_info_inc
    where dt='2020-06-14'
    and type='bootstrap-insert'
)oi
on ri.order_id=oi.id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code = '15'
)type_dic
on ri.refund_type=type_dic.dic_code
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code = '13'
)reason_dic
on ri.refund_reason_type=reason_dic.dic_code;

(2) Daily load

insert overwrite table dwd_trade_order_refund_inc partition(dt='2020-06-15')
select
    ri.id, user_id, order_id, sku_id, province_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    refund_type, type_dic.dic_name,
    refund_reason_type, reason_dic.dic_name,
    refund_reason_txt, refund_num, refund_amount
from
(
    select
        data.id, data.user_id, data.order_id, data.sku_id,
        data.refund_type, data.refund_num, data.refund_amount,
        data.refund_reason_type, data.refund_reason_txt, data.create_time
    from ods_order_refund_info_inc
    where dt='2020-06-15'
    and type='insert'
)ri
left join
(
    select data.id, data.province_id
    from ods_order_info_inc
    where dt='2020-06-15'
    and type='update'
    and data.order_status='1005'
    and array_contains(map_keys(old),'order_status')
)oi
on ri.order_id=oi.id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code = '15'
)type_dic
on ri.refund_type=type_dic.dic_code
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code = '13'
)reason_dic
on ri.refund_reason_type=reason_dic.dic_code;

9.6 Trade Domain: Successful Refund Payment Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_trade_refund_pay_suc_inc;
CREATE EXTERNAL TABLE dwd_trade_refund_pay_suc_inc
(
    `id`                STRING COMMENT 'record id',
    `user_id`           STRING COMMENT 'user id',
    `order_id`          STRING COMMENT 'order id',
    `sku_id`            STRING COMMENT 'SKU id',
    `province_id`       STRING COMMENT 'province id',
    `payment_type_code` STRING COMMENT 'payment type code',
    `payment_type_name` STRING COMMENT 'payment type name',
    `date_id`           STRING COMMENT 'date id',
    `callback_time`     STRING COMMENT 'refund success time',
    `refund_num`        DECIMAL(16, 2) COMMENT 'number of items refunded',
    `refund_amount`     DECIMAL(16, 2) COMMENT 'refund amount'
) COMMENT 'Trade domain successful refund payment transaction fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_trade_refund_pay_suc_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_trade_refund_pay_suc_inc partition(dt)
select
    rp.id, user_id, rp.order_id, rp.sku_id, province_id,
    payment_type, dic_name,
    date_format(callback_time,'yyyy-MM-dd') date_id,
    callback_time,
    refund_num, total_amount,
    date_format(callback_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.order_id, data.sku_id,
        data.payment_type, data.callback_time, data.total_amount
    from ods_refund_payment_inc
    where dt='2020-06-14'
    and type = 'bootstrap-insert'
    and data.refund_status='1602'
)rp
left join
(
    select data.id, data.user_id, data.province_id
    from ods_order_info_inc
    where dt='2020-06-14'
    and type='bootstrap-insert'
)oi
on rp.order_id=oi.id
left join
(
    select data.order_id, data.sku_id, data.refund_num
    from ods_order_refund_info_inc
    where dt='2020-06-14'
    and type='bootstrap-insert'
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code='11'
)dic
on rp.payment_type=dic.dic_code;

(2) Daily load

insert overwrite table dwd_trade_refund_pay_suc_inc partition(dt='2020-06-15')
select
    rp.id, user_id, rp.order_id, rp.sku_id, province_id,
    payment_type, dic_name,
    date_format(callback_time,'yyyy-MM-dd') date_id,
    callback_time,
    refund_num, total_amount
from
(
    select
        data.id, data.order_id, data.sku_id,
        data.payment_type, data.callback_time, data.total_amount
    from ods_refund_payment_inc
    where dt='2020-06-15'
    and type = 'update'
    and array_contains(map_keys(old),'refund_status')
    and data.refund_status='1602'
)rp
left join
(
    select data.id, data.user_id, data.province_id
    from ods_order_info_inc
    where dt='2020-06-15'
    and type='update'
    and data.order_status='1006'
    and array_contains(map_keys(old),'order_status')
)oi
on rp.order_id=oi.id
left join
(
    select data.order_id, data.sku_id, data.refund_num
    from ods_order_refund_info_inc
    where dt='2020-06-15'
    and type='update'
    and data.refund_status='0705'
    and array_contains(map_keys(old),'refund_status')
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code='11'
)dic
on rp.payment_type=dic.dic_code;

9.7 Trade Domain: Cart Periodic Snapshot Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_trade_cart_full;
CREATE EXTERNAL TABLE dwd_trade_cart_full
(
    `id`       STRING COMMENT 'record id',
    `user_id`  STRING COMMENT 'user id',
    `sku_id`   STRING COMMENT 'SKU id',
    `sku_name` STRING COMMENT 'SKU name',
    `sku_num`  BIGINT COMMENT 'number of items in the cart'
) COMMENT 'Trade domain cart periodic snapshot fact table'
    PARTITIONED BY (`dt` STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_trade_cart_full/'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

insert overwrite table dwd_trade_cart_full partition(dt='2020-06-14')
select id, user_id, sku_id, sku_name, sku_num
from ods_cart_info_full
where dt='2020-06-14'
and is_ordered='0';

9.8 Tool Domain: Coupon Claim Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_tool_coupon_get_inc;
CREATE EXTERNAL TABLE dwd_tool_coupon_get_inc
(
    `id`        STRING COMMENT 'record id',
    `coupon_id` STRING COMMENT 'coupon id',
    `user_id`   STRING COMMENT 'user id',
    `date_id`   STRING COMMENT 'date id',
    `get_time`  STRING COMMENT 'claim time'
) COMMENT 'Coupon claim transaction fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_tool_coupon_get_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_tool_coupon_get_inc partition(dt)
select
    data.id, data.coupon_id, data.user_id,
    date_format(data.get_time,'yyyy-MM-dd') date_id,
    data.get_time,
    date_format(data.get_time,'yyyy-MM-dd')
from ods_coupon_use_inc
where dt='2020-06-14'
and type='bootstrap-insert';

(2) Daily load

insert overwrite table dwd_tool_coupon_get_inc partition (dt='2020-06-15')
select
    data.id, data.coupon_id, data.user_id,
    date_format(data.get_time,'yyyy-MM-dd') date_id,
    data.get_time
from ods_coupon_use_inc
where dt='2020-06-15'
and type='insert';

9.9 Tool Domain: Coupon Use (Order) Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_tool_coupon_order_inc;
CREATE EXTERNAL TABLE dwd_tool_coupon_order_inc
(
    `id`         STRING COMMENT 'record id',
    `coupon_id`  STRING COMMENT 'coupon id',
    `user_id`    STRING COMMENT 'user id',
    `order_id`   STRING COMMENT 'order id',
    `date_id`    STRING COMMENT 'date id',
    `order_time` STRING COMMENT 'time the coupon was used to place an order'
) COMMENT 'Coupon use (order) transaction fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_tool_coupon_order_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_tool_coupon_order_inc partition(dt)
select
    data.id, data.coupon_id, data.user_id, data.order_id,
    date_format(data.using_time,'yyyy-MM-dd') date_id,
    data.using_time,
    date_format(data.using_time,'yyyy-MM-dd')
from ods_coupon_use_inc
where dt='2020-06-14'
and type='bootstrap-insert'
and data.using_time is not null;

(2) Daily load

insert overwrite table dwd_tool_coupon_order_inc partition(dt='2020-06-15')
select
    data.id, data.coupon_id, data.user_id, data.order_id,
    date_format(data.using_time,'yyyy-MM-dd') date_id,
    data.using_time
from ods_coupon_use_inc
where dt='2020-06-15'
and type='update'
and array_contains(map_keys(old),'using_time');

9.10 Tool Domain: Coupon Use (Payment) Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_tool_coupon_pay_inc;
CREATE EXTERNAL TABLE dwd_tool_coupon_pay_inc
(
    `id`           STRING COMMENT 'record id',
    `coupon_id`    STRING COMMENT 'coupon id',
    `user_id`      STRING COMMENT 'user id',
    `order_id`     STRING COMMENT 'order id',
    `date_id`      STRING COMMENT 'date id',
    `payment_time` STRING COMMENT 'time the coupon was used in a payment'
) COMMENT 'Coupon use (payment) transaction fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_tool_coupon_pay_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_tool_coupon_pay_inc partition(dt)
select
    data.id, data.coupon_id, data.user_id, data.order_id,
    date_format(data.used_time,'yyyy-MM-dd') date_id,
    data.used_time,
    date_format(data.used_time,'yyyy-MM-dd')
from ods_coupon_use_inc
where dt='2020-06-14'
and type='bootstrap-insert'
and data.used_time is not null;

(2) Daily load

insert overwrite table dwd_tool_coupon_pay_inc partition(dt='2020-06-15')
select
    data.id, data.coupon_id, data.user_id, data.order_id,
    date_format(data.used_time,'yyyy-MM-dd') date_id,
    data.used_time
from ods_coupon_use_inc
where dt='2020-06-15'
and type='update'
and array_contains(map_keys(old),'used_time');

9.11 Interaction Domain: Product Favorite Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_interaction_favor_add_inc;
CREATE EXTERNAL TABLE dwd_interaction_favor_add_inc
(
    `id`          STRING COMMENT 'record id',
    `user_id`     STRING COMMENT 'user id',
    `sku_id`      STRING COMMENT 'SKU id',
    `date_id`     STRING COMMENT 'date id',
    `create_time` STRING COMMENT 'time the product was added to favorites'
) COMMENT 'Product favorite fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_interaction_favor_add_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_interaction_favor_add_inc partition(dt)
select
    data.id, data.user_id, data.sku_id,
    date_format(data.create_time,'yyyy-MM-dd') date_id,
    data.create_time,
    date_format(data.create_time,'yyyy-MM-dd')
from ods_favor_info_inc
where dt='2020-06-14'
and type = 'bootstrap-insert';

(2) Daily load

insert overwrite table dwd_interaction_favor_add_inc partition(dt='2020-06-15')
select
    data.id, data.user_id, data.sku_id,
    date_format(data.create_time,'yyyy-MM-dd') date_id,
    data.create_time
from ods_favor_info_inc
where dt='2020-06-15'
and type = 'insert';

9.12 Interaction Domain: Review Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_interaction_comment_inc;
CREATE EXTERNAL TABLE dwd_interaction_comment_inc
(
    `id`            STRING COMMENT 'record id',
    `user_id`       STRING COMMENT 'user id',
    `sku_id`        STRING COMMENT 'SKU id',
    `order_id`      STRING COMMENT 'order id',
    `date_id`       STRING COMMENT 'date id',
    `create_time`   STRING COMMENT 'review time',
    `appraise_code` STRING COMMENT 'rating code',
    `appraise_name` STRING COMMENT 'rating name'
) COMMENT 'Review transaction fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_interaction_comment_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_interaction_comment_inc partition(dt)
select
    id, user_id, sku_id, order_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    appraise, dic_name,
    date_format(create_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.user_id, data.sku_id, data.order_id,
        data.create_time, data.appraise
    from ods_comment_info_inc
    where dt='2020-06-14'
    and type='bootstrap-insert'
)ci
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-14'
    and parent_code='12'
)dic
on ci.appraise=dic.dic_code;

(2) Daily load

insert overwrite table dwd_interaction_comment_inc partition(dt='2020-06-15')
select
    id, user_id, sku_id, order_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    appraise, dic_name
from
(
    select
        data.id, data.user_id, data.sku_id, data.order_id,
        data.create_time, data.appraise
    from ods_comment_info_inc
    where dt='2020-06-15'
    and type='insert'
)ci
left join
(
    select dic_code, dic_name
    from ods_base_dic_full
    where dt='2020-06-15'
    and parent_code='12'
)dic
on ci.appraise=dic.dic_code;

9.13 Traffic Domain: Page View Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_traffic_page_view_inc;
CREATE EXTERNAL TABLE dwd_traffic_page_view_inc
(
    `province_id`    STRING COMMENT 'province id',
    `brand`          STRING COMMENT 'phone brand',
    `channel`        STRING COMMENT 'channel',
    `is_new`         STRING COMMENT 'whether this is the first launch',
    `model`          STRING COMMENT 'phone model',
    `mid_id`         STRING COMMENT 'device id',
    `operate_system` STRING COMMENT 'operating system',
    `user_id`        STRING COMMENT 'member id',
    `version_code`   STRING COMMENT 'app version',
    `page_item`      STRING COMMENT 'target id',
    `page_item_type` STRING COMMENT 'target type',
    `last_page_id`   STRING COMMENT 'previous page id',
    `page_id`        STRING COMMENT 'page id',
    `source_type`    STRING COMMENT 'source type',
    `date_id`        STRING COMMENT 'date id',
    `view_time`      STRING COMMENT 'page entry time',
    `session_id`     STRING COMMENT 'id of the session the view belongs to',
    `during_time`    BIGINT COMMENT 'duration in milliseconds'
) COMMENT 'Page view log table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_traffic_page_view_inc'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

set hive.cbo.enable=false;
insert overwrite table dwd_traffic_page_view_inc partition (dt='2020-06-14')
select
    province_id, brand, channel, is_new, model, mid_id,
    operate_system, user_id, version_code,
    page_item, page_item_type, last_page_id, page_id, source_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') view_time,
    -- session id = device id + timestamp of the most recent session start
    concat(mid_id,'-',last_value(session_start_point,true) over (partition by mid_id order by ts)) session_id,
    during_time
from
(
    select
        common.ar area_code,
        common.ba brand,
        common.ch channel,
        common.is_new is_new,
        common.md model,
        common.mid mid_id,
        common.os operate_system,
        common.uid user_id,
        common.vc version_code,
        page.during_time,
        page.item page_item,
        page.item_type page_item_type,
        page.last_page_id,
        page.page_id,
        page.source_type,
        ts,
        -- a page with no previous page marks the start of a session
        if(page.last_page_id is null,ts,null) session_start_point
    from ods_log_inc
    where dt='2020-06-14'
    and page is not null
)log
left join
(
    select id province_id, area_code
    from ods_base_province_full
    where dt='2020-06-14'
)bp
on log.area_code=bp.area_code;
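
The session_id expression relies on two steps: a page view with no last_page_id is treated as a session start, and last_value(..., true) carries the most recent non-null start timestamp forward over each device's views ordered by ts. The sketch below reproduces that carry-forward in Python. It assumes timestamps are distinct per device, and the function name and sample events are hypothetical.

```python
def assign_sessions(events):
    """Attach session_id = '<mid_id>-<ts of latest session start>' to each page view."""
    result = []
    current_start = {}  # mid_id -> ts of the most recent session start seen so far
    for event in sorted(events, key=lambda e: (e["mid_id"], e["ts"])):
        if event["last_page_id"] is None:
            current_start[event["mid_id"]] = event["ts"]
        start = current_start.get(event["mid_id"])
        # concat() with a NULL argument yields NULL in Hive, mirrored here with None
        session_id = None if start is None else "%s-%s" % (event["mid_id"], start)
        result.append({**event, "session_id": session_id})
    return result


if __name__ == "__main__":
    views = [
        {"mid_id": "m1", "ts": 100, "last_page_id": None},
        {"mid_id": "m1", "ts": 200, "last_page_id": "home"},
        {"mid_id": "m1", "ts": 300, "last_page_id": None},
        {"mid_id": "m1", "ts": 400, "last_page_id": "home"},
    ]
    for row in assign_sessions(views):
        print(row["ts"], row["session_id"])
```

Because the session id embeds the device id, two devices that start a session at the same millisecond still get different ids.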

9.14 Traffic Domain: App Start Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_traffic_start_inc;
CREATE EXTERNAL TABLE dwd_traffic_start_inc
(
    `province_id`     STRING COMMENT 'province id',
    `brand`           STRING COMMENT 'phone brand',
    `channel`         STRING COMMENT 'channel',
    `is_new`          STRING COMMENT 'whether this is the first launch',
    `model`           STRING COMMENT 'phone model',
    `mid_id`          STRING COMMENT 'device id',
    `operate_system`  STRING COMMENT 'operating system',
    `user_id`         STRING COMMENT 'member id',
    `version_code`    STRING COMMENT 'app version',
    `entry`           STRING COMMENT 'entry point: icon = app icon, notice = notification',
    `open_ad_id`      STRING COMMENT 'splash ad id',
    `date_id`         STRING COMMENT 'date id',
    `start_time`      STRING COMMENT 'start time',
    `loading_time_ms` BIGINT COMMENT 'start-up loading time',
    `open_ad_ms`      BIGINT COMMENT 'total ad play time',
    `open_ad_skip_ms` BIGINT COMMENT 'point in time at which the user skipped the ad'
) COMMENT 'App start log table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_traffic_start_inc'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

set hive.cbo.enable=false;
insert overwrite table dwd_traffic_start_inc partition(dt='2020-06-14')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    entry, open_ad_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') start_time,
    loading_time, open_ad_ms, open_ad_skip_ms
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        `start`.entry, `start`.loading_time, `start`.open_ad_id, `start`.open_ad_ms, `start`.open_ad_skip_ms,
        ts
    from ods_log_inc
    where dt='2020-06-14'
    and `start` is not null
)log
left join
(
    select id province_id, area_code
    from ods_base_province_full
    where dt='2020-06-14'
)bp
on log.area_code=bp.area_code;
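
The traffic-domain loads all derive date_id and the event time with from_utc_timestamp(ts,'GMT+8'). A quick way to sanity-check a value is to do the same shift on the command line. This is a minimal sketch, assuming ts is an epoch timestamp in milliseconds and that GNU date is available; the function name to_gmt8 is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: convert an epoch-millisecond ts to a GMT+8 wall-clock string.
to_gmt8() {
    # 'CST-8' is a POSIX TZ string meaning "8 hours east of UTC".
    TZ='CST-8' date -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S'
}

# 1592150400000 ms is 2020-06-14 16:00:00 in UTC.
to_gmt8 1592150400000
```

The sample value falls on 2020-06-14 in UTC but on 2020-06-15 in GMT+8, so the date_id of a row can differ from the UTC calendar date of its ts.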

9.15 Traffic Domain Action Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_traffic_action_inc;
CREATE EXTERNAL TABLE dwd_traffic_action_inc
(
    `province_id`      STRING COMMENT 'Province id',
    `brand`            STRING COMMENT 'Phone brand',
    `channel`          STRING COMMENT 'Channel',
    `is_new`           STRING COMMENT 'Whether this is the first launch',
    `model`            STRING COMMENT 'Phone model',
    `mid_id`           STRING COMMENT 'Device id',
    `operate_system`   STRING COMMENT 'Operating system',
    `user_id`          STRING COMMENT 'Member id',
    `version_code`     STRING COMMENT 'App version code',
    `during_time`      BIGINT COMMENT 'Duration in milliseconds',
    `page_item`        STRING COMMENT 'Page target id',
    `page_item_type`   STRING COMMENT 'Page target type',
    `last_page_id`     STRING COMMENT 'Previous page id',
    `page_id`          STRING COMMENT 'Page id',
    `source_type`      STRING COMMENT 'Source type',
    `action_id`        STRING COMMENT 'Action id',
    `action_item`      STRING COMMENT 'Action target id',
    `action_item_type` STRING COMMENT 'Action target type',
    `date_id`          STRING COMMENT 'Date id',
    `action_time`      STRING COMMENT 'Time the action occurred'
) COMMENT 'Action log table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_traffic_action_inc'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

set hive.cbo.enable=false;
insert overwrite table dwd_traffic_action_inc partition(dt='2020-06-14')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    during_time, page_item, page_item_type, last_page_id, page_id, source_type,
    action_id, action_item, action_item_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') action_time
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        action.action_id, action.item action_item, action.item_type action_item_type,
        action.ts
    from ods_log_inc lateral view explode(actions) tmp as action
    where dt='2020-06-14'
    and actions is not null
)log
left join
(
    select id province_id, area_code
    from ods_base_province_full
    where dt='2020-06-14'
)bp
on log.area_code=bp.area_code;

9.16 Traffic Domain Display Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_traffic_display_inc;
CREATE EXTERNAL TABLE dwd_traffic_display_inc
(
    `province_id`       STRING COMMENT 'Province id',
    `brand`             STRING COMMENT 'Phone brand',
    `channel`           STRING COMMENT 'Channel',
    `is_new`            STRING COMMENT 'Whether this is the first launch',
    `model`             STRING COMMENT 'Phone model',
    `mid_id`            STRING COMMENT 'Device id',
    `operate_system`    STRING COMMENT 'Operating system',
    `user_id`           STRING COMMENT 'Member id',
    `version_code`      STRING COMMENT 'App version code',
    `during_time`       BIGINT COMMENT 'Duration in milliseconds',
    `page_item`         STRING COMMENT 'Page target id',
    `page_item_type`    STRING COMMENT 'Page target type',
    `last_page_id`      STRING COMMENT 'Previous page id',
    `page_id`           STRING COMMENT 'Page id',
    `source_type`       STRING COMMENT 'Source type',
    `date_id`           STRING COMMENT 'Date id',
    `display_time`      STRING COMMENT 'Display time',
    `display_type`      STRING COMMENT 'Display type',
    `display_item`      STRING COMMENT 'Displayed item id',
    `display_item_type` STRING COMMENT 'Displayed item type',
    `display_order`     BIGINT COMMENT 'Display order',
    `display_pos_id`    BIGINT COMMENT 'Display position'
) COMMENT 'Display log table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_traffic_display_inc'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

set hive.cbo.enable=false;
insert overwrite table dwd_traffic_display_inc partition(dt='2020-06-14')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    during_time, page_item, page_item_type, last_page_id, page_id, source_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') display_time,
    display_type, display_item, display_item_type, display_order, display_pos_id
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        display.display_type, display.item display_item, display.item_type display_item_type,
        display.`order` display_order, display.pos_id display_pos_id,
        ts
    from ods_log_inc lateral view explode(displays) tmp as display
    where dt='2020-06-14'
    and displays is not null
)log
left join
(
    select id province_id, area_code
    from ods_base_province_full
    where dt='2020-06-14'
)bp
on log.area_code=bp.area_code;

9.17 Traffic Domain Error Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_traffic_error_inc;
CREATE EXTERNAL TABLE dwd_traffic_error_inc
(
    `province_id`     STRING COMMENT 'Province id',
    `brand`           STRING COMMENT 'Phone brand',
    `channel`         STRING COMMENT 'Channel',
    `is_new`          STRING COMMENT 'Whether this is the first launch',
    `model`           STRING COMMENT 'Phone model',
    `mid_id`          STRING COMMENT 'Device id',
    `operate_system`  STRING COMMENT 'Operating system',
    `user_id`         STRING COMMENT 'Member id',
    `version_code`    STRING COMMENT 'App version code',
    `page_item`       STRING COMMENT 'Page target id',
    `page_item_type`  STRING COMMENT 'Page target type',
    `last_page_id`    STRING COMMENT 'Previous page id',
    `page_id`         STRING COMMENT 'Page id',
    `source_type`     STRING COMMENT 'Source type',
    `entry`           STRING COMMENT 'Entry point: icon (app icon) or notice (notification)',
    `loading_time`    STRING COMMENT 'Startup loading time',
    `open_ad_id`      STRING COMMENT 'Ad page id',
    `open_ad_ms`      STRING COMMENT 'Total ad play time',
    `open_ad_skip_ms` STRING COMMENT 'Time at which the user skipped the ad',
    `actions`         ARRAY<STRUCT<action_id:STRING,item:STRING,item_type:STRING,ts:BIGINT>> COMMENT 'Action information',
    `displays`        ARRAY<STRUCT<display_type:STRING,item:STRING,item_type:STRING,`order`:STRING,pos_id:STRING>> COMMENT 'Display information',
    `date_id`         STRING COMMENT 'Date id',
    `error_time`      STRING COMMENT 'Error time',
    `error_code`      STRING COMMENT 'Error code',
    `error_msg`       STRING COMMENT 'Error message'
) COMMENT 'Error log table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_traffic_error_inc'
    TBLPROPERTIES ('orc.compress' = 'snappy');

2) Data loading

set hive.cbo.enable=false;
set hive.execution.engine=mr;
insert overwrite table dwd_traffic_error_inc partition(dt='2020-06-14')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    page_item, page_item_type, last_page_id, page_id, source_type,
    entry, loading_time, open_ad_id, open_ad_ms, open_ad_skip_ms,
    actions, displays,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') error_time,
    error_code, error_msg
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        `start`.entry, `start`.loading_time, `start`.open_ad_id, `start`.open_ad_ms, `start`.open_ad_skip_ms,
        actions, displays,
        err.error_code, err.msg error_msg,
        ts
    from ods_log_inc
    where dt='2020-06-14'
    and err is not null
)log
left join
(
    select id province_id, area_code
    from ods_base_province_full
    where dt='2020-06-14'
)bp
on log.area_code=bp.area_code;

9.18 User Domain User Registration Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_user_register_inc;
CREATE EXTERNAL TABLE dwd_user_register_inc
(
    `user_id`        STRING COMMENT 'User id',
    `date_id`        STRING COMMENT 'Date id',
    `create_time`    STRING COMMENT 'Registration time',
    `channel`        STRING COMMENT 'App download channel',
    `province_id`    STRING COMMENT 'Province id',
    `version_code`   STRING COMMENT 'App version',
    `mid_id`         STRING COMMENT 'Device id',
    `brand`          STRING COMMENT 'Device brand',
    `model`          STRING COMMENT 'Device model',
    `operate_system` STRING COMMENT 'Device operating system'
) COMMENT 'User domain user registration transaction fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_user_register_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading
(1) First-day load

set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table dwd_user_register_inc partition(dt)
select
    ui.user_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    channel, province_id, version_code, mid_id, brand, model, operate_system,
    date_format(create_time,'yyyy-MM-dd')
from
(
    select data.id user_id, data.create_time
    from ods_user_info_inc
    where dt='2020-06-14'
    and type='bootstrap-insert'
)ui
left join
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code
    from ods_log_inc
    where dt='2020-06-14'
    and page.page_id='register'
    and common.uid is not null
)log
on ui.user_id=log.user_id
left join
(
    select id province_id, area_code
    from ods_base_province_full
    where dt='2020-06-14'
)bp
on log.area_code=bp.area_code;

(2) Daily load

insert overwrite table dwd_user_register_inc partition(dt='2020-06-15')
select
    ui.user_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    channel, province_id, version_code, mid_id, brand, model, operate_system
from
(
    select data.id user_id, data.create_time
    from ods_user_info_inc
    where dt='2020-06-15'
    and type='insert'
)ui
left join
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code
    from ods_log_inc
    where dt='2020-06-15'
    and page.page_id='register'
    and common.uid is not null
)log
on ui.user_id=log.user_id
left join
(
    select id province_id, area_code
    from ods_base_province_full
    where dt='2020-06-15'
)bp
on log.area_code=bp.area_code;

9.19 User Domain User Login Transaction Fact Table

1) Table creation statement

DROP TABLE IF EXISTS dwd_user_login_inc;
CREATE EXTERNAL TABLE dwd_user_login_inc
(
    `user_id`        STRING COMMENT 'User id',
    `date_id`        STRING COMMENT 'Date id',
    `login_time`     STRING COMMENT 'Login time',
    `channel`        STRING COMMENT 'App download channel',
    `province_id`    STRING COMMENT 'Province id',
    `version_code`   STRING COMMENT 'App version',
    `mid_id`         STRING COMMENT 'Device id',
    `brand`          STRING COMMENT 'Device brand',
    `model`          STRING COMMENT 'Device model',
    `operate_system` STRING COMMENT 'Device operating system'
) COMMENT 'User domain user login transaction fact table'
    PARTITIONED BY (`dt` STRING)
    STORED AS ORC
    LOCATION '/warehouse/gmall/dwd/dwd_user_login_inc/'
    TBLPROPERTIES ("orc.compress" = "snappy");

2) Data loading

insert overwrite table dwd_user_login_inc partition(dt='2020-06-14')
select
    user_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') login_time,
    channel, province_id, version_code, mid_id, brand, model, operate_system
from
(
    select user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts
    from
    (
        select
            user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts,
            row_number() over (partition by session_id order by ts) rn
        from
        (
            select
                user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts,
                concat(mid_id,'-',last_value(session_start_point,true) over(partition by mid_id order by ts)) session_id
            from
            (
                select
                    common.uid user_id, common.ch channel, common.ar area_code, common.vc version_code,
                    common.mid mid_id, common.ba brand, common.md model, common.os operate_system,
                    ts,
                    if(page.last_page_id is null,ts,null) session_start_point
                from ods_log_inc
                where dt='2020-06-14'
                and page is not null
            )t1
        )t2
        where user_id is not null
    )t3
    where rn=1
)t4
left join
(
    select id province_id, area_code
    from ods_base_province_full
    where dt='2020-06-14'
)bp
on t4.area_code=bp.area_code;

9.20 Data Loading Scripts

9.20.1 First-Day Loading Script

(1) Create ods_to_dwd_init.sh in the /home/atguigu/bin directory on hadoop102

[atguigu@hadoop102 bin]$ vim ods_to_dwd_init.sh

(2) Write the following content

#!/bin/bash

APP=gmall

if [ -n "$2" ] ;then
    do_date=$2
else
    echo "Please pass a date argument"
    exit 1
fi

dwd_interaction_comment_inc="
insert overwrite table ${APP}.dwd_interaction_comment_inc partition(dt)
select
    id, user_id, sku_id, order_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time, appraise, dic_name,
    date_format(create_time,'yyyy-MM-dd')
from
(
    select data.id, data.user_id, data.sku_id, data.order_id, data.create_time, data.appraise
    from ${APP}.ods_comment_info_inc
    where dt='$do_date'
    and type='bootstrap-insert'
)ci
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='12'
)dic
on ci.appraise=dic.dic_code;
"
dwd_interaction_favor_add_inc="
insert overwrite table ${APP}.dwd_interaction_favor_add_inc partition(dt)
select
    data.id, data.user_id, data.sku_id,
    date_format(data.create_time,'yyyy-MM-dd') date_id,
    data.create_time,
    date_format(data.create_time,'yyyy-MM-dd')
from ${APP}.ods_favor_info_inc
where dt='$do_date'
and type = 'bootstrap-insert';
"
dwd_tool_coupon_get_inc="
insert overwrite table ${APP}.dwd_tool_coupon_get_inc partition(dt)
select
    data.id, data.coupon_id, data.user_id,
    date_format(data.get_time,'yyyy-MM-dd') date_id,
    data.get_time,
    date_format(data.get_time,'yyyy-MM-dd')
from ${APP}.ods_coupon_use_inc
where dt='$do_date'
and type='bootstrap-insert';
"
dwd_tool_coupon_order_inc="
insert overwrite table ${APP}.dwd_tool_coupon_order_inc partition(dt)
select
    data.id, data.coupon_id, data.user_id, data.order_id,
    date_format(data.using_time,'yyyy-MM-dd') date_id,
    data.using_time,
    date_format(data.using_time,'yyyy-MM-dd')
from ${APP}.ods_coupon_use_inc
where dt='$do_date'
and type='bootstrap-insert'
and data.using_time is not null;
"
dwd_tool_coupon_pay_inc="
insert overwrite table ${APP}.dwd_tool_coupon_pay_inc partition(dt)
select
    data.id, data.coupon_id, data.user_id, data.order_id,
    date_format(data.used_time,'yyyy-MM-dd') date_id,
    data.used_time,
    date_format(data.used_time,'yyyy-MM-dd')
from ${APP}.ods_coupon_use_inc
where dt='$do_date'
and type='bootstrap-insert'
and data.used_time is not null;
"
dwd_trade_cancel_detail_inc="
insert overwrite table ${APP}.dwd_trade_cancel_detail_inc partition (dt)
select
    od.id, order_id, user_id, sku_id, province_id, activity_id, activity_rule_id, coupon_id,
    date_format(cancel_time,'yyyy-MM-dd') date_id,
    cancel_time,
    source_id, source_type, dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount,
    date_format(cancel_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.order_id, data.sku_id, data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ${APP}.ods_order_detail_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) od
join
(
    select data.id, data.user_id, data.province_id, data.operate_time cancel_time
    from ${APP}.ods_order_info_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
    and data.order_status='1003'
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ${APP}.ods_order_detail_activity_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ${APP}.ods_order_detail_coupon_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='24'
)dic
on od.source_type=dic.dic_code;
"
dwd_trade_cart_add_inc="
insert overwrite table ${APP}.dwd_trade_cart_add_inc partition (dt)
select
    id, user_id, sku_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time, source_id, source_type, dic.dic_name, sku_num,
    date_format(create_time, 'yyyy-MM-dd')
from
(
    select
        data.id, data.user_id, data.sku_id, data.create_time,
        data.source_id, data.source_type, data.sku_num
    from ${APP}.ods_cart_info_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
)ci
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='24'
)dic
on ci.source_type=dic.dic_code;
"
dwd_trade_cart_full="
insert overwrite table ${APP}.dwd_trade_cart_full partition(dt='$do_date')
select id, user_id, sku_id, sku_name, sku_num
from ${APP}.ods_cart_info_full
where dt='$do_date'
and is_ordered='0';
"
dwd_trade_order_detail_inc="
insert overwrite table ${APP}.dwd_trade_order_detail_inc partition (dt)
select
    od.id, order_id, user_id, sku_id, province_id, activity_id, activity_rule_id, coupon_id,
    date_format(create_time, 'yyyy-MM-dd') date_id,
    create_time,
    source_id, source_type, dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount,
    date_format(create_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.order_id, data.sku_id, data.create_time,
        data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ${APP}.ods_order_detail_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) od
left join
(
    select data.id, data.user_id, data.province_id
    from ${APP}.ods_order_info_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ${APP}.ods_order_detail_activity_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ${APP}.ods_order_detail_coupon_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='24'
)dic
on od.source_type=dic.dic_code;
"
dwd_trade_order_refund_inc="
insert overwrite table ${APP}.dwd_trade_order_refund_inc partition(dt)
select
    ri.id, user_id, order_id, sku_id, province_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    refund_type, type_dic.dic_name,
    refund_reason_type, reason_dic.dic_name,
    refund_reason_txt, refund_num, refund_amount,
    date_format(create_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.user_id, data.order_id, data.sku_id,
        data.refund_type, data.refund_num, data.refund_amount,
        data.refund_reason_type, data.refund_reason_txt, data.create_time
    from ${APP}.ods_order_refund_info_inc
    where dt='$do_date'
    and type='bootstrap-insert'
)ri
left join
(
    select data.id, data.province_id
    from ${APP}.ods_order_info_inc
    where dt='$do_date'
    and type='bootstrap-insert'
)oi
on ri.order_id=oi.id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code = '15'
)type_dic
on ri.refund_type=type_dic.dic_code
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code = '13'
)reason_dic
on ri.refund_reason_type=reason_dic.dic_code;
"
dwd_trade_pay_detail_suc_inc="
insert overwrite table ${APP}.dwd_trade_pay_detail_suc_inc partition (dt)
select
    od.id, od.order_id, user_id, sku_id, province_id, activity_id, activity_rule_id, coupon_id,
    payment_type, pay_dic.dic_name,
    date_format(callback_time,'yyyy-MM-dd') date_id,
    callback_time,
    source_id, source_type, src_dic.dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount,
    date_format(callback_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.order_id, data.sku_id, data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ${APP}.ods_order_detail_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) od
join
(
    select data.user_id, data.order_id, data.payment_type, data.callback_time
    from ${APP}.ods_payment_info_inc
    where dt='$do_date'
    and type='bootstrap-insert'
    and data.payment_status='1602'
) pi
on od.order_id=pi.order_id
left join
(
    select data.id, data.province_id
    from ${APP}.ods_order_info_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ${APP}.ods_order_detail_activity_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ${APP}.ods_order_detail_coupon_inc
    where dt = '$do_date'
    and type = 'bootstrap-insert'
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='11'
) pay_dic
on pi.payment_type=pay_dic.dic_code
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='24'
)src_dic
on od.source_type=src_dic.dic_code;
"
dwd_trade_refund_pay_suc_inc="
insert overwrite table ${APP}.dwd_trade_refund_pay_suc_inc partition(dt)
select
    rp.id, user_id, rp.order_id, rp.sku_id, province_id, payment_type, dic_name,
    date_format(callback_time,'yyyy-MM-dd') date_id,
    callback_time, refund_num, total_amount,
    date_format(callback_time,'yyyy-MM-dd')
from
(
    select
        data.id, data.order_id, data.sku_id, data.payment_type,
        data.callback_time, data.total_amount
    from ${APP}.ods_refund_payment_inc
    where dt='$do_date'
    and type = 'bootstrap-insert'
    and data.refund_status='1602'
)rp
left join
(
    select data.id, data.user_id, data.province_id
    from ${APP}.ods_order_info_inc
    where dt='$do_date'
    and type='bootstrap-insert'
)oi
on rp.order_id=oi.id
left join
(
    select data.order_id, data.sku_id, data.refund_num
    from ${APP}.ods_order_refund_info_inc
    where dt='$do_date'
    and type='bootstrap-insert'
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='11'
)dic
on rp.payment_type=dic.dic_code;
"
dwd_traffic_action_inc="
set hive.cbo.enable=false;
insert overwrite table ${APP}.dwd_traffic_action_inc partition(dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    during_time, page_item, page_item_type, last_page_id, page_id, source_type,
    action_id, action_item, action_item_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') action_time
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        action.action_id, action.item action_item, action.item_type action_item_type,
        action.ts
    from ${APP}.ods_log_inc lateral view explode(actions) tmp as action
    where dt='$do_date'
    and actions is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
dwd_traffic_display_inc="
set hive.cbo.enable=false;
insert overwrite table ${APP}.dwd_traffic_display_inc partition(dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    during_time, page_item, page_item_type, last_page_id, page_id, source_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') display_time,
    display_type, display_item, display_item_type, display_order, display_pos_id
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        display.display_type, display.item display_item, display.item_type display_item_type,
        display.\`order\` display_order, display.pos_id display_pos_id,
        ts
    from ${APP}.ods_log_inc lateral view explode(displays) tmp as display
    where dt='$do_date'
    and displays is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
dwd_traffic_error_inc="
set hive.cbo.enable=false;
set hive.execution.engine=mr;
insert overwrite table ${APP}.dwd_traffic_error_inc partition(dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    page_item, page_item_type, last_page_id, page_id, source_type,
    entry, loading_time, open_ad_id, open_ad_ms, open_ad_skip_ms,
    actions, displays,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') error_time,
    error_code, error_msg
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        \`start\`.entry, \`start\`.loading_time, \`start\`.open_ad_id, \`start\`.open_ad_ms, \`start\`.open_ad_skip_ms,
        actions, displays,
        err.error_code, err.msg error_msg,
        ts
    from ${APP}.ods_log_inc
    where dt='$do_date'
    and err is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
set hive.execution.engine=spark;
"
dwd_traffic_page_view_inc="
set hive.cbo.enable=false;
insert overwrite table ${APP}.dwd_traffic_page_view_inc partition (dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    page_item, page_item_type, last_page_id, page_id, source_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') view_time,
    concat(mid_id,'-',last_value(session_start_point,true) over (partition by mid_id order by ts)) session_id,
    during_time
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        ts,
        if(page.last_page_id is null,ts,null) session_start_point
    from ${APP}.ods_log_inc
    where dt='$do_date'
    and page is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
dwd_traffic_start_inc="
set hive.cbo.enable=false;
insert overwrite table ${APP}.dwd_traffic_start_inc partition(dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    entry, open_ad_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') start_time,
    loading_time, open_ad_ms, open_ad_skip_ms
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code,
        \`start\`.entry, \`start\`.loading_time, \`start\`.open_ad_id, \`start\`.open_ad_ms, \`start\`.open_ad_skip_ms,
        ts
    from ${APP}.ods_log_inc
    where dt='$do_date'
    and \`start\` is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
dwd_user_login_inc="
insert overwrite table ${APP}.dwd_user_login_inc partition(dt='$do_date')
select
    user_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') login_time,
    channel, province_id, version_code, mid_id, brand, model, operate_system
from
(
    select user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts
    from
    (
        select
            user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts,
            row_number() over (partition by session_id order by ts) rn
        from
        (
            select
                user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts,
                concat(mid_id,'-',last_value(session_start_point,true) over(partition by mid_id order by ts)) session_id
            from
            (
                select
                    common.uid user_id, common.ch channel, common.ar area_code, common.vc version_code,
                    common.mid mid_id, common.ba brand, common.md model, common.os operate_system,
                    ts,
                    if(page.last_page_id is null,ts,null) session_start_point
                from ${APP}.ods_log_inc
                where dt='$do_date'
                and page is not null
            )t1
        )t2
        where user_id is not null
    )t3
    where rn=1
)t4
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on t4.area_code=bp.area_code;
"
dwd_user_register_inc="
insert overwrite table ${APP}.dwd_user_register_inc partition(dt)
select
    ui.user_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    channel, province_id, version_code, mid_id, brand, model, operate_system,
    date_format(create_time,'yyyy-MM-dd')
from
(
    select data.id user_id, data.create_time
    from ${APP}.ods_user_info_inc
    where dt='$do_date'
    and type='bootstrap-insert'
)ui
left join
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.md model,
        common.mid mid_id, common.os operate_system, common.uid user_id, common.vc version_code
    from ${APP}.ods_log_inc
    where dt='$do_date'
    and page.page_id='register'
    and common.uid is not null
)log
on ui.user_id=log.user_id
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"

case $1 in
    "dwd_interaction_comment_inc" )
        hive -e "$dwd_interaction_comment_inc"
    ;;
    "dwd_interaction_favor_add_inc" )
        hive -e "$dwd_interaction_favor_add_inc"
    ;;
    "dwd_tool_coupon_get_inc" )
        hive -e "$dwd_tool_coupon_get_inc"
    ;;
    "dwd_tool_coupon_order_inc" )
        hive -e "$dwd_tool_coupon_order_inc"
    ;;
    "dwd_tool_coupon_pay_inc" )
        hive -e "$dwd_tool_coupon_pay_inc"
    ;;
    "dwd_trade_cancel_detail_inc" )
        hive -e "$dwd_trade_cancel_detail_inc"
    ;;
    "dwd_trade_cart_add_inc" )
        hive -e "$dwd_trade_cart_add_inc"
    ;;
    "dwd_trade_cart_full" )
        hive -e "$dwd_trade_cart_full"
    ;;
    "dwd_trade_order_detail_inc" )
        hive -e "$dwd_trade_order_detail_inc"
    ;;
    "dwd_trade_order_refund_inc" )
        hive -e "$dwd_trade_order_refund_inc"
    ;;
    "dwd_trade_pay_detail_suc_inc" )
        hive -e "$dwd_trade_pay_detail_suc_inc"
    ;;
    "dwd_trade_refund_pay_suc_inc" )
        hive -e "$dwd_trade_refund_pay_suc_inc"
    ;;
    "dwd_traffic_action_inc" )
        hive -e "$dwd_traffic_action_inc"
    ;;
    "dwd_traffic_display_inc" )
        hive -e "$dwd_traffic_display_inc"
    ;;
    "dwd_traffic_error_inc" )
        hive -e "$dwd_traffic_error_inc"
    ;;
    "dwd_traffic_page_view_inc" )
        hive -e "$dwd_traffic_page_view_inc"
    ;;
    "dwd_traffic_start_inc" )
        hive -e "$dwd_traffic_start_inc"
    ;;
    "dwd_user_login_inc" )
        hive -e "$dwd_user_login_inc"
    ;;
    "dwd_user_register_inc" )
        hive -e "$dwd_user_register_inc"
    ;;
    "all" )
        hive -e "$dwd_interaction_comment_inc$dwd_interaction_favor_add_inc$dwd_tool_coupon_get_inc$dwd_tool_coupon_order_inc$dwd_tool_coupon_pay_inc$dwd_trade_cancel_detail_inc$dwd_trade_cart_add_inc$dwd_trade_cart_full$dwd_trade_order_detail_inc$dwd_trade_order_refund_inc$dwd_trade_pay_detail_suc_inc$dwd_trade_refund_pay_suc_inc$dwd_traffic_action_inc$dwd_traffic_display_inc$dwd_traffic_error_inc$dwd_traffic_page_view_inc$dwd_traffic_start_inc$dwd_user_login_inc$dwd_user_register_inc"
    ;;
esac

(3) Make the script executable

[atguigu@hadoop102 bin]$ chmod +x ods_to_dwd_init.sh

(4) Script usage

[atguigu@hadoop102 bin]$ ods_to_dwd_init.sh all 2020-06-14
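
The script takes the date as its second argument and interpolates it directly into every dt='$do_date' filter, so a mistyped date would simply select an empty partition. A small guard like the following could be placed before the SQL strings are built. This is only a sketch under stated assumptions: valid_date is a hypothetical helper that is not part of the original script, and it relies on GNU date.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for a real calendar date written as YYYY-MM-DD.
valid_date() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;   # shape check
        *) return 1 ;;
    esac
    # GNU date rejects impossible dates such as 2020-02-30.
    [ "$(date -d "$1" +%F 2>/dev/null)" = "$1" ]
}

if valid_date "2020-06-14"; then
    echo "date accepted"
fi
```

In the loading script the check would be applied to "$2" before do_date is assigned.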

9.20.2 Daily Loading Script

(1) Create ods_to_dwd.sh in the /home/atguigu/bin directory on hadoop102

[atguigu@hadoop102 bin]$ vim ods_to_dwd.sh

(2) Write the following content

#!/bin/bash

APP=gmall

# Use the date passed as the second argument; if none is given, default to yesterday
if [ -n "$2" ] ;then
    do_date=$2
else
    do_date=`date -d "-1 day" +%F`
fi

dwd_interaction_comment_inc="
insert overwrite table ${APP}.dwd_interaction_comment_inc partition(dt='$do_date')
select
    id, user_id, sku_id, order_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time, appraise, dic_name
from
(
    select data.id, data.user_id, data.sku_id, data.order_id, data.create_time, data.appraise
    from ${APP}.ods_comment_info_inc
    where dt='$do_date'
    and type='insert'
)ci
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='12'
)dic
on ci.appraise=dic.dic_code;
"
dwd_interaction_favor_add_inc="
insert overwrite table ${APP}.dwd_interaction_favor_add_inc partition(dt='$do_date')
select
    data.id, data.user_id, data.sku_id,
    date_format(data.create_time,'yyyy-MM-dd') date_id,
    data.create_time
from ${APP}.ods_favor_info_inc
where dt='$do_date'
and type = 'insert';
"
dwd_tool_coupon_get_inc="
insert overwrite table ${APP}.dwd_tool_coupon_get_inc partition (dt='$do_date')
select
    data.id, data.coupon_id, data.user_id,
    date_format(data.get_time,'yyyy-MM-dd') date_id,
    data.get_time
from ${APP}.ods_coupon_use_inc
where dt='$do_date'
and type='insert';
"
dwd_tool_coupon_order_inc="
insert overwrite table ${APP}.dwd_tool_coupon_order_inc partition(dt='$do_date')
select
    data.id, data.coupon_id, data.user_id, data.order_id,
    date_format(data.using_time,'yyyy-MM-dd') date_id,
    data.using_time
from ${APP}.ods_coupon_use_inc
where dt='$do_date'
and type='update'
and array_contains(map_keys(old),'using_time');
"
dwd_tool_coupon_pay_inc="
insert overwrite table ${APP}.dwd_tool_coupon_pay_inc partition(dt='$do_date')
select
    data.id, data.coupon_id, data.user_id, data.order_id,
    date_format(data.used_time,'yyyy-MM-dd') date_id,
    data.used_time
from ${APP}.ods_coupon_use_inc
where dt='$do_date'
and type='update'
and array_contains(map_keys(old),'used_time');
"
dwd_trade_cancel_detail_inc="
insert overwrite table ${APP}.dwd_trade_cancel_detail_inc partition (dt='$do_date')
select
    od.id, order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    date_format(canel_time,'yyyy-MM-dd') date_id,
    canel_time,
    source_id, source_type, dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount
from
(
    select
        data.id, data.order_id, data.sku_id, data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ${APP}.ods_order_detail_inc
    where (dt='$do_date' or dt=date_add('$do_date',-1))
    and (type = 'insert' or type= 'bootstrap-insert')
) od
join
(
    select
        data.id, data.user_id, data.province_id,
        data.operate_time canel_time
    from ${APP}.ods_order_info_inc
    where dt = '$do_date'
    and type = 'update'
    and data.order_status='1003'
    and array_contains(map_keys(old),'order_status')
) oi
on order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ${APP}.ods_order_detail_activity_inc
    where (dt='$do_date' or dt=date_add('$do_date',-1))
    and (type = 'insert' or type= 'bootstrap-insert')
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ${APP}.ods_order_detail_coupon_inc
    where (dt='$do_date' or dt=date_add('$do_date',-1))
    and (type = 'insert' or type= 'bootstrap-insert')
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='24'
)dic
on od.source_type=dic.dic_code;
"
dwd_trade_cart_add_inc="
insert overwrite table ${APP}.dwd_trade_cart_add_inc partition(dt='$do_date')
select
    id, user_id, sku_id, date_id, create_time,
    source_id, source_type_code, source_type_name, sku_num
from
(
    select
        data.id, data.user_id, data.sku_id,
        date_format(from_utc_timestamp(ts*1000,'GMT+8'),'yyyy-MM-dd') date_id,
        date_format(from_utc_timestamp(ts*1000,'GMT+8'),'yyyy-MM-dd HH:mm:ss') create_time,
        data.source_id,
        data.source_type source_type_code,
        if(type='insert',data.sku_num,data.sku_num-old['sku_num']) sku_num
    from ${APP}.ods_cart_info_inc
    where dt='$do_date'
    and (type='insert'
    or(type='update' and old['sku_num'] is not null and data.sku_num>cast(old['sku_num'] as int)))
)cart
left join
(
    select dic_code, dic_name source_type_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='24'
)dic
on cart.source_type_code=dic.dic_code;
"
dwd_trade_cart_full="
insert overwrite table ${APP}.dwd_trade_cart_full partition(dt='$do_date')
select
    id, user_id, sku_id, sku_name, sku_num
from ${APP}.ods_cart_info_full
where dt='$do_date'
and is_ordered='0';
"
dwd_trade_order_detail_inc="
insert overwrite table ${APP}.dwd_trade_order_detail_inc partition (dt='$do_date')
select
    od.id, order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    date_id, create_time,
    source_id, source_type, dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount
from
(
    select
        data.id, data.order_id, data.sku_id,
        date_format(data.create_time, 'yyyy-MM-dd') date_id,
        data.create_time,
        data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ${APP}.ods_order_detail_inc
    where dt = '$do_date'
    and type = 'insert'
) od
left join
(
    select data.id, data.user_id, data.province_id
    from ${APP}.ods_order_info_inc
    where dt = '$do_date'
    and type = 'insert'
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ${APP}.ods_order_detail_activity_inc
    where dt = '$do_date'
    and type = 'insert'
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ${APP}.ods_order_detail_coupon_inc
    where dt = '$do_date'
    and type = 'insert'
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='24'
)dic
on od.source_type=dic.dic_code;
"
dwd_trade_order_refund_inc="
insert overwrite table ${APP}.dwd_trade_order_refund_inc partition(dt='$do_date')
select
    ri.id, user_id, order_id, sku_id, province_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    refund_type, type_dic.dic_name,
    refund_reason_type, reason_dic.dic_name,
    refund_reason_txt, refund_num, refund_amount
from
(
    select
        data.id, data.user_id, data.order_id, data.sku_id,
        data.refund_type, data.refund_num, data.refund_amount,
        data.refund_reason_type, data.refund_reason_txt, data.create_time
    from ${APP}.ods_order_refund_info_inc
    where dt='$do_date'
    and type='insert'
)ri
left join
(
    select data.id, data.province_id
    from ${APP}.ods_order_info_inc
    where dt='$do_date'
    and type='update'
    and data.order_status='1005'
    and array_contains(map_keys(old),'order_status')
)oi
on ri.order_id=oi.id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code = '15'
)type_dic
on ri.refund_type=type_dic.dic_code
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code = '13'
)reason_dic
on ri.refund_reason_type=reason_dic.dic_code;
"
dwd_trade_pay_detail_suc_inc="
insert overwrite table ${APP}.dwd_trade_pay_detail_suc_inc partition (dt='$do_date')
select
    od.id, od.order_id, user_id, sku_id, province_id,
    activity_id, activity_rule_id, coupon_id,
    payment_type, pay_dic.dic_name,
    date_format(callback_time,'yyyy-MM-dd') date_id,
    callback_time,
    source_id, source_type, src_dic.dic_name, sku_num,
    split_original_amount, split_activity_amount, split_coupon_amount, split_total_amount
from
(
    select
        data.id, data.order_id, data.sku_id, data.source_id, data.source_type, data.sku_num,
        data.sku_num * data.order_price split_original_amount,
        data.split_total_amount, data.split_activity_amount, data.split_coupon_amount
    from ${APP}.ods_order_detail_inc
    where (dt = '$do_date' or dt = date_add('$do_date',-1))
    and (type = 'insert' or type = 'bootstrap-insert')
) od
join
(
    select data.user_id, data.order_id, data.payment_type, data.callback_time
    from ${APP}.ods_payment_info_inc
    where dt='$do_date'
    and type='update'
    and array_contains(map_keys(old),'payment_status')
    and data.payment_status='1602'
) pi
on od.order_id=pi.order_id
left join
(
    select data.id, data.province_id
    from ${APP}.ods_order_info_inc
    where (dt = '$do_date' or dt = date_add('$do_date',-1))
    and (type = 'insert' or type = 'bootstrap-insert')
) oi
on od.order_id = oi.id
left join
(
    select data.order_detail_id, data.activity_id, data.activity_rule_id
    from ${APP}.ods_order_detail_activity_inc
    where (dt = '$do_date' or dt = date_add('$do_date',-1))
    and (type = 'insert' or type = 'bootstrap-insert')
) act
on od.id = act.order_detail_id
left join
(
    select data.order_detail_id, data.coupon_id
    from ${APP}.ods_order_detail_coupon_inc
    where (dt = '$do_date' or dt = date_add('$do_date',-1))
    and (type = 'insert' or type = 'bootstrap-insert')
) cou
on od.id = cou.order_detail_id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='11'
) pay_dic
on pi.payment_type=pay_dic.dic_code
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='24'
)src_dic
on od.source_type=src_dic.dic_code;
"
dwd_trade_refund_pay_suc_inc="
insert overwrite table ${APP}.dwd_trade_refund_pay_suc_inc partition(dt='$do_date')
select
    rp.id, user_id, rp.order_id, rp.sku_id, province_id,
    payment_type, dic_name,
    date_format(callback_time,'yyyy-MM-dd') date_id,
    callback_time,
    refund_num, total_amount
from
(
    select
        data.id, data.order_id, data.sku_id, data.payment_type, data.callback_time, data.total_amount
    from ${APP}.ods_refund_payment_inc
    where dt='$do_date'
    and type = 'update'
    and array_contains(map_keys(old),'refund_status')
    and data.refund_status='1602'
)rp
left join
(
    select data.id, data.user_id, data.province_id
    from ${APP}.ods_order_info_inc
    where dt='$do_date'
    and type='update'
    and data.order_status='1006'
    and array_contains(map_keys(old),'order_status')
)oi
on rp.order_id=oi.id
left join
(
    select data.order_id, data.sku_id, data.refund_num
    from ${APP}.ods_order_refund_info_inc
    where dt='$do_date'
    and type='update'
    and data.refund_status='0705'
    and array_contains(map_keys(old),'refund_status')
)ri
on rp.order_id=ri.order_id
and rp.sku_id=ri.sku_id
left join
(
    select dic_code, dic_name
    from ${APP}.ods_base_dic_full
    where dt='$do_date'
    and parent_code='11'
)dic
on rp.payment_type=dic.dic_code;
"
dwd_traffic_action_inc="
set hive.cbo.enable=false;
insert overwrite table ${APP}.dwd_traffic_action_inc partition(dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    during_time, page_item, page_item_type, last_page_id, page_id, source_type,
    action_id, action_item, action_item_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') action_time
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new,
        common.md model, common.mid mid_id, common.os operate_system,
        common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        action.action_id, action.item action_item, action.item_type action_item_type,
        action.ts
    from ${APP}.ods_log_inc lateral view explode(actions) tmp as action
    where dt='$do_date'
    and actions is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
dwd_traffic_display_inc="
set hive.cbo.enable=false;
insert overwrite table ${APP}.dwd_traffic_display_inc partition(dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    during_time, page_item, page_item_type, last_page_id, page_id, source_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') display_time,
    display_type, display_item, display_item_type, display_order, display_pos_id
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new,
        common.md model, common.mid mid_id, common.os operate_system,
        common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        display.display_type,
        display.item display_item,
        display.item_type display_item_type,
        display.\`order\` display_order,
        display.pos_id display_pos_id,
        ts
    from ${APP}.ods_log_inc lateral view explode(displays) tmp as display
    where dt='$do_date'
    and displays is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
dwd_traffic_error_inc="
set hive.cbo.enable=false;
set hive.execution.engine=mr;
insert overwrite table ${APP}.dwd_traffic_error_inc partition(dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    page_item, page_item_type, last_page_id, page_id, source_type,
    entry, loading_time, open_ad_id, open_ad_ms, open_ad_skip_ms,
    actions, displays,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') error_time,
    error_code, error_msg
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new,
        common.md model, common.mid mid_id, common.os operate_system,
        common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        \`start\`.entry, \`start\`.loading_time, \`start\`.open_ad_id,
        \`start\`.open_ad_ms, \`start\`.open_ad_skip_ms,
        actions, displays,
        err.error_code, err.msg error_msg,
        ts
    from ${APP}.ods_log_inc
    where dt='$do_date'
    and err is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
set hive.execution.engine=spark;
"
dwd_traffic_page_view_inc="
set hive.cbo.enable=false;
insert overwrite table ${APP}.dwd_traffic_page_view_inc partition (dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    page_item, page_item_type, last_page_id, page_id, source_type,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') view_time,
    concat(mid_id,'-',last_value(session_start_point,true) over (partition by mid_id order by ts)) session_id,
    during_time
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new is_new,
        common.md model, common.mid mid_id, common.os operate_system,
        common.uid user_id, common.vc version_code,
        page.during_time, page.item page_item, page.item_type page_item_type,
        page.last_page_id, page.page_id, page.source_type,
        ts,
        if(page.last_page_id is null,ts,null) session_start_point
    from ${APP}.ods_log_inc
    where dt='$do_date'
    and page is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
dwd_traffic_start_inc="
set hive.cbo.enable=false;
insert overwrite table ${APP}.dwd_traffic_start_inc partition(dt='$do_date')
select
    province_id, brand, channel, is_new, model, mid_id, operate_system, user_id, version_code,
    entry, open_ad_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') action_time,
    loading_time, open_ad_ms, open_ad_skip_ms
from
(
    select
        common.ar area_code, common.ba brand, common.ch channel, common.is_new,
        common.md model, common.mid mid_id, common.os operate_system,
        common.uid user_id, common.vc version_code,
        \`start\`.entry, \`start\`.loading_time, \`start\`.open_ad_id,
        \`start\`.open_ad_ms, \`start\`.open_ad_skip_ms,
        ts
    from ${APP}.ods_log_inc
    where dt='$do_date'
    and \`start\` is not null
)log
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
dwd_user_login_inc="
insert overwrite table ${APP}.dwd_user_login_inc partition(dt='$do_date')
select
    user_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd') date_id,
    date_format(from_utc_timestamp(ts,'GMT+8'),'yyyy-MM-dd HH:mm:ss') login_time,
    channel, province_id, version_code, mid_id, brand, model, operate_system
from
(
    select
        user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts
    from
    (
        select
            user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts,
            row_number() over (partition by session_id order by ts) rn
        from
        (
            select
                user_id, channel, area_code, version_code, mid_id, brand, model, operate_system, ts,
                concat(mid_id,'-',last_value(session_start_point,true) over(partition by mid_id order by ts)) session_id
            from
            (
                select
                    common.uid user_id, common.ch channel, common.ar area_code,
                    common.vc version_code, common.mid mid_id, common.ba brand,
                    common.md model, common.os operate_system,
                    ts,
                    if(page.last_page_id is null,ts,null) session_start_point
                from ${APP}.ods_log_inc
                where dt='$do_date'
                and page is not null
            )t1
        )t2
        where user_id is not null
    )t3
    where rn=1
)t4
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on t4.area_code=bp.area_code;
"
dwd_user_register_inc="
insert overwrite table ${APP}.dwd_user_register_inc partition(dt='$do_date')
select
    ui.user_id,
    date_format(create_time,'yyyy-MM-dd') date_id,
    create_time,
    channel, province_id, version_code, mid_id, brand, model, operate_system
from
(
    select data.id user_id, data.create_time
    from ${APP}.ods_user_info_inc
    where dt='$do_date'
    and type='insert'
)ui
left join
(
    select
        common.ar area_code, common.ba brand, common.ch channel,
        common.md model, common.mid mid_id, common.os operate_system,
        common.uid user_id, common.vc version_code
    from ${APP}.ods_log_inc
    where dt='$do_date'
    and page.page_id='register'
    and common.uid is not null
)log
on ui.user_id=log.user_id
left join
(
    select id province_id, area_code
    from ${APP}.ods_base_province_full
    where dt='$do_date'
)bp
on log.area_code=bp.area_code;
"
case $1 in
    "dwd_interaction_comment_inc" )
        hive -e "$dwd_interaction_comment_inc"
    ;;
    "dwd_interaction_favor_add_inc" )
        hive -e "$dwd_interaction_favor_add_inc"
    ;;
    "dwd_tool_coupon_get_inc" )
        hive -e "$dwd_tool_coupon_get_inc"
    ;;
    "dwd_tool_coupon_order_inc" )
        hive -e "$dwd_tool_coupon_order_inc"
    ;;
    "dwd_tool_coupon_pay_inc" )
        hive -e "$dwd_tool_coupon_pay_inc"
    ;;
    "dwd_trade_cancel_detail_inc" )
        hive -e "$dwd_trade_cancel_detail_inc"
    ;;
    "dwd_trade_cart_add_inc" )
        hive -e "$dwd_trade_cart_add_inc"
    ;;
    "dwd_trade_cart_full" )
        hive -e "$dwd_trade_cart_full"
    ;;
    "dwd_trade_order_detail_inc" )
        hive -e "$dwd_trade_order_detail_inc"
    ;;
    "dwd_trade_order_refund_inc" )
        hive -e "$dwd_trade_order_refund_inc"
    ;;
    "dwd_trade_pay_detail_suc_inc" )
        hive -e "$dwd_trade_pay_detail_suc_inc"
    ;;
    "dwd_trade_refund_pay_suc_inc" )
        hive -e "$dwd_trade_refund_pay_suc_inc"
    ;;
    "dwd_traffic_action_inc" )
        hive -e "$dwd_traffic_action_inc"
    ;;
    "dwd_traffic_display_inc" )
        hive -e "$dwd_traffic_display_inc"
    ;;
    "dwd_traffic_error_inc" )
        hive -e "$dwd_traffic_error_inc"
    ;;
    "dwd_traffic_page_view_inc" )
        hive -e "$dwd_traffic_page_view_inc"
    ;;
    "dwd_traffic_start_inc" )
        hive -e "$dwd_traffic_start_inc"
    ;;
    "dwd_user_login_inc" )
        hive -e "$dwd_user_login_inc"
    ;;
    "dwd_user_register_inc" )
        hive -e "$dwd_user_register_inc"
    ;;
    "all" )
        hive -e "$dwd_interaction_comment_inc$dwd_interaction_favor_add_inc$dwd_tool_coupon_get_inc$dwd_tool_coupon_order_inc$dwd_tool_coupon_pay_inc$dwd_trade_cancel_detail_inc$dwd_trade_cart_add_inc$dwd_trade_cart_full$dwd_trade_order_detail_inc$dwd_trade_order_refund_inc$dwd_trade_pay_detail_suc_inc$dwd_trade_refund_pay_suc_inc$dwd_traffic_action_inc$dwd_traffic_display_inc$dwd_traffic_error_inc$dwd_traffic_page_view_inc$dwd_traffic_start_inc$dwd_user_login_inc$dwd_user_register_inc"
    ;;
esac

(3) Grant execute permission to the script

[atguigu@hadoop102 bin]$ chmod +x ods_to_dwd.sh

(4) Script usage

[atguigu@hadoop102 bin]$ ods_to_dwd.sh all 2020-06-14
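
The second argument is optional in the daily script: when it is omitted, the load date falls back to the previous day. A minimal sketch of that rule as a standalone function is shown below. The function name `resolve_do_date` is hypothetical and does not appear in the original script, and the `-d "-1 day"` option assumes GNU `date`, as the script itself does.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script) that mirrors the
# date-defaulting rule used at the top of ods_to_dwd.sh.
# $1: an optional date in yyyy-MM-dd form.
resolve_do_date() {
    if [ -n "$1" ]; then
        # A date was passed in: use it unchanged.
        echo "$1"
    else
        # No date given: fall back to the day before today (GNU date syntax).
        date -d "-1 day" +%F
    fi
}

# Example: an explicit date is returned as-is.
resolve_do_date 2020-06-14
```

Under this rule, running `ods_to_dwd.sh all` with no date would load the partition for the previous day, which suits a job scheduled shortly after midnight.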
