1. Symptom

A Spark SQL job failed with the alert shown below.
The problematic SQL:

select g.dt, frequent , wk , hr , user_id , k.`$name` as user_name , os , manufacturer , page_name , page_url , regexp_replace(button_name,'\\n|\\r|\\t','') as button_name , button_type , first_visit_time
, last_visit_time , pv , session_cnt , page_cnt , session_dur , total_dur , load_dur , max_load_dur , min_load_dur , search_content , search_cnt
, max_search_dur , min_search_dur , total_search_dur , max_search_cnt , page_visit_dur , buy_time , error_reason , type , uv , father , son , index,g.dt
from (
select dt , frequent , wk , hr , user_id , os , manufacturer , page_name , page_url , button_name , button_type , first_visit_time
, last_visit_time , pv , session_cnt , page_cnt , session_dur , total_dur , load_dur , max_load_dur , min_load_dur , search_content , search_cnt
, max_search_dur , min_search_dur , total_search_dur , max_search_cnt , page_visit_dur , buy_time , error_reason , type , uv , father , son , index
from day_total
union all select * from hour_total
union all select * from day_page
union all select * from day_button
union all select * from hour_error
union all select * from launch
union all select * from decision
union all select * from visit_back
union all select * from province
union all select * from os
union all select * from manufacturer
union all select * from roadmap1
union all select * from roadmap2
) g
left join users k on g.user_id = k.id

Alert details:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Found duplicate column(s) when inserting into hdfs://nameservice1/origin_data/events_7/data: `dt`;
	at org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:85)
	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:65)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:276)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:270)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:228)
	at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:656)
	at com.tcl.kudu.crumb_applet$.main(crumb_applet.scala:476)
	at com.tcl.kudu.crumb_applet.main(crumb_applet.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:926)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:935)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

2. Solution

The outer SELECT of the query lists the `dt` column twice: `g.dt` appears at the very start of the column list and again as `,g.dt` at the end. Deleting one of the two `g.dt` references fixed the error and the job ran normally.
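The stack trace shows the failure comes from `SchemaUtils.checkColumnNameDuplication`, which Spark runs before writing a DataFrame out (by default the comparison is case-insensitive, following `spark.sql.caseSensitive`). As a rough illustration only (not Spark's actual code), the check behaves roughly like this sketch, using a shortened, hypothetical column list standing in for the query's output schema:

```python
def check_column_name_duplication(columns, case_sensitive=False):
    """Rough sketch of the duplicate-name check Spark performs before a write.

    Raises ValueError naming the duplicated column(s), analogous to the
    AnalysisException seen in the alert above.
    """
    names = columns if case_sensitive else [c.lower() for c in columns]
    seen, dups = set(), set()
    for name in names:
        if name in seen:
            dups.add(name)
        seen.add(name)
    if dups:
        raise ValueError(
            "Found duplicate column(s) when inserting: "
            + ", ".join(sorted("`%s`" % d for d in dups))
        )

# Hypothetical, shortened version of the failing query's output columns:
# `dt` is selected once at the start and once more at the end.
cols = ["dt", "frequent", "wk", "hr", "user_id", "user_name", "dt"]
try:
    check_column_name_duplication(cols)
except ValueError as e:
    print(e)  # Found duplicate column(s) when inserting: `dt`
```

Once one of the duplicate entries is dropped from the column list, the check passes and the write to HDFS succeeds.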
