Found duplicate column(s) when inserting into hdfs://nameservice1/origin_data/events_7/data: `dt`;
1. Symptom
A Spark SQL job failed with the error above. The problematic SQL:
select g.dt, frequent , wk , hr , user_id , k.`$name` as user_name , os , manufacturer , page_name , page_url , regexp_replace(button_name,'\\n|\\r|\\t','') as button_name , button_type , first_visit_time
, last_visit_time , pv , session_cnt , page_cnt , session_dur , total_dur , load_dur , max_load_dur , min_load_dur , search_content , search_cnt
, max_search_dur , min_search_dur , total_search_dur , max_search_cnt , page_visit_dur , buy_time , error_reason , type , uv , father , son , index,g.dt
from (
select dt , frequent , wk , hr , user_id , os , manufacturer , page_name , page_url , button_name , button_type , first_visit_time
, last_visit_time , pv , session_cnt , page_cnt , session_dur , total_dur , load_dur , max_load_dur , min_load_dur , search_content , search_cnt
, max_search_dur , min_search_dur , total_search_dur , max_search_cnt , page_visit_dur , buy_time , error_reason , type , uv , father , son , index
from day_total
union all select * from hour_total
union all select * from day_page
union all select * from day_button
union all select * from hour_error
union all select * from launch
union all select * from decision
union all select * from visit_back
union all select * from province
union all select * from os
union all select * from manufacturer
union all select * from roadmap1
union all select * from roadmap2
) g
left join users k on g.user_id = k.id
Full error message and stack trace:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Found duplicate column(s) when inserting into hdfs://nameservice1/origin_data/events_7/data: `dt`;
    at org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:85)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:65)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
    at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:276)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:270)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:228)
    at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:656)
    at com.tcl.kudu.crumb_applet$.main(crumb_applet.scala:476)
    at com.tcl.kudu.crumb_applet.main(crumb_applet.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:926)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:935)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2. Solution
The outer SELECT projects the same column twice: `g.dt` appears as the first column and again at the end of the select list (`..., index, g.dt`). Before writing the result, Spark validates the output schema (the `SchemaUtils.checkColumnNameDuplication` frame at the top of the stack trace) and rejects any duplicate column name. Removing one of the two `g.dt` references fixed the job.
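To make the failure mode concrete, here is a minimal Python sketch of the kind of duplicate-name validation Spark runs before a write. This is an illustration, not Spark's actual implementation; the function name and behavior (case-insensitive comparison by default, mirroring `spark.sql.caseSensitive=false`) are assumptions for the example.

```python
def check_column_name_duplication(column_names, case_sensitive=False):
    """Raise ValueError if any column name appears more than once.

    By default names are compared case-insensitively, mirroring Spark's
    default resolver (spark.sql.caseSensitive=false).
    """
    normalized = column_names if case_sensitive else [c.lower() for c in column_names]
    seen, duplicates = set(), []
    for name in normalized:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    if duplicates:
        raise ValueError(
            "Found duplicate column(s) when inserting: "
            + ", ".join("`{}`".format(d) for d in duplicates)
        )

# The outer SELECT above projects `dt` twice, so a check like this fails:
try:
    check_column_name_duplication(["dt", "frequent", "wk", "hr", "dt"])
except ValueError as e:
    print(e)  # Found duplicate column(s) when inserting: `dt`
```

The takeaway: when a write to a file-based sink fails with this `AnalysisException`, inspect the final projection for repeated names (including names that differ only in case) and drop or alias the duplicates.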