10 Common Mistakes Java Developers Make when Writing SQL

Java developers mix object-oriented thinking with imperative thinking, depending on their levels of:

- Skill (anyone can code imperatively)
- Dogma (some use the “Pattern-Pattern”, i.e. the pattern of applying patterns everywhere and giving them names)
- Mood (true OO is more clumsy to write than imperative code, at first)

But when Java developers write SQL, everything changes. SQL is a declarative language that has nothing to do with either object-oriented or imperative thinking. It is very easy to express a query in SQL. It is not so easy to express it optimally or correctly. Not only do developers need to re-think their programming paradigm, they also need to think in terms of set theory.

Here are common mistakes that a Java developer makes when writing SQL through JDBC or jOOQ (in no particular order). A follow-up article covers 10 more common mistakes.

1. Forgetting about NULL

Misunderstanding NULL is probably the biggest mistake a Java developer can make when writing SQL. This is partly (but not exclusively) because NULL is also called UNKNOWN. If it were only called UNKNOWN, it would be easier to understand. Another reason is that JDBC maps SQL NULL to Java null when fetching data or when binding variables. This may lead to thinking that NULL = NULL (SQL) would behave the same way as null == null (Java). It does not: null == null is true in Java, whereas NULL = NULL in SQL evaluates to UNKNOWN, so a WHERE clause rejects the row.

One of the crazier examples of misunderstanding NULL is when NULL predicates are used with row value expressions.

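Another, more subtle problem appears when misunderstanding the meaning of NULL in NOT IN anti-joins. If the list or subquery contains a single NULL, the NOT IN predicate can never be TRUE, and the query returns no rows.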

The Cure:

Train yourself. There is no alternative to thinking explicitly about NULL every time you write SQL:

- Is this predicate correct with respect to NULL?
- Does NULL affect the result of this function?
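
One way to internalise the NOT IN problem is to model SQL's three-valued logic in plain Java. The sketch below is only an illustration, not how any database or JDBC driver works. It assumes integer values and uses a boxed Boolean where Java null stands for UNKNOWN. All class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

class ThreeValuedLogic {

    // a = b in SQL: any comparison involving NULL is UNKNOWN (modelled as Java null)
    static Boolean sqlEquals(Integer a, Integer b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // NOT UNKNOWN is still UNKNOWN
    static Boolean sqlNot(Boolean x) {
        return x == null ? null : !x;
    }

    // TRUE wins over UNKNOWN, UNKNOWN wins over FALSE
    static Boolean sqlOr(Boolean x, Boolean y) {
        if (Boolean.TRUE.equals(x) || Boolean.TRUE.equals(y)) {
            return true;
        }
        if (x == null || y == null) {
            return null;
        }
        return false;
    }

    // value IN (list) behaves like a chain of ORed equality comparisons
    static Boolean sqlIn(Integer value, List<Integer> list) {
        Boolean result = false;
        for (Integer item : list) {
            result = sqlOr(result, sqlEquals(value, item));
        }
        return result;
    }

    static Boolean sqlNotIn(Integer value, List<Integer> list) {
        return sqlNot(sqlIn(value, list));
    }

    public static void main(String[] args) {
        System.out.println(sqlEquals(null, null));                // null, i.e. UNKNOWN
        System.out.println(sqlNotIn(1, Arrays.asList(2, 3)));     // true
        System.out.println(sqlNotIn(1, Arrays.asList(2, null)));  // null, i.e. UNKNOWN
    }
}
```

A WHERE clause only keeps rows for which the predicate is TRUE, so the UNKNOWN in the last line means the row is silently dropped.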

2. Processing data in Java memory

Few Java developers know SQL very well. The occasional JOIN, the odd UNION, fine. But window functions? Grouping sets? A lot of Java developers load SQL data into memory, transform the data into some appropriate collection type, and execute nasty maths on that collection with verbose loop structures (at least, they did before Java 8’s Collection improvements).

But some SQL databases support advanced (and SQL standard!) OLAP features that tend to perform a lot better and are much easier to write. A (non-standard) example is Oracle’s awesome MODEL clause. Just let the database do the processing and fetch only the results into Java memory. After all, some very smart people have optimised these expensive products. So in fact, by moving OLAP to the database, you gain two things:

- Simplicity. It’s probably easier to write correctly in SQL than in Java.
- Performance. The database will probably be faster than your algorithm. And more importantly, you don’t have to transmit millions of records over the wire.

The Cure:

Every time you implement a data-centric algorithm in Java, ask yourself: Is there a way to let the database perform that work for me?
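
As an illustration, consider a running total. The sketch below contrasts a typical in-memory loop with a window function that lets the database compute the same column. The table payment and its columns id and amount are assumptions made up for this example, and the query needs a database that supports window functions.

```java
import java.util.ArrayList;
import java.util.List;

class RunningTotal {

    // Let the database do the work (hypothetical table "payment").
    static final String RUNNING_TOTAL_SQL =
          "SELECT id, amount, "
        + "SUM(amount) OVER (ORDER BY id "
        + "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total "
        + "FROM payment "
        + "ORDER BY id";

    // The in-memory equivalent, which first requires fetching every row.
    static List<Long> runningTotals(List<Long> amounts) {
        List<Long> totals = new ArrayList<>();
        long sum = 0;
        for (long amount : amounts) {
            sum += amount;
            totals.add(sum);
        }
        return totals;
    }
}
```

The Java loop is short here, but it only starts after all rows have crossed the wire. The SQL version returns the finished column and can be combined with filtering and pagination in the same statement.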

3. Using UNION instead of UNION ALL

It’s a shame that UNION ALL needs an extra keyword compared to UNION. It would be much better if the SQL standard had been defined to support:

- UNION (allowing duplicates)
- UNION DISTINCT (removing duplicates)

Not only is the removal of duplicates rarely needed (or sometimes even wrong), it is also quite slow for large result sets with many columns, as the rows of the two subselects typically need to be sorted (or hashed) so that each tuple can be compared with its neighbours.

Note that even if the SQL standard specifies INTERSECT ALL and EXCEPT ALL, hardly any database implements these less useful set operations.

The Cure:

Every time you write a UNION, ask yourself whether you actually meant to write UNION ALL.
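
The difference in semantics can be sketched with plain Java collections. This is only a model of what the two operators return, not of how a database executes them, and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

class UnionSemantics {

    // UNION ALL: simply append the second result to the first.
    static <T> List<T> unionAll(List<T> a, List<T> b) {
        List<T> result = new ArrayList<>(a);
        result.addAll(b);
        return result;
    }

    // UNION: the same rows, plus an extra de-duplication pass over everything.
    static <T> List<T> union(List<T> a, List<T> b) {
        return new ArrayList<>(new LinkedHashSet<>(unionAll(a, b)));
    }
}
```

Note that UNION also removes duplicates that were already present within a single subselect, which is one of the ways in which it can be wrong rather than merely slow. Unlike this sketch, SQL guarantees no particular row order for either operator without an ORDER BY clause.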

4. Using JDBC Pagination to paginate large results

Most databases support some way of paginating ordered results through LIMIT .. OFFSET, TOP .. START AT, or OFFSET .. FETCH clauses. In the absence of support for these clauses, there is still the possibility of ROWNUM (Oracle) or ROW_NUMBER() OVER() filtering (DB2, SQL Server 2008 and earlier), which is much faster than pagination in memory. This is especially true for large offsets!

The Cure:

Just use those clauses, or a tool (such as jOOQ) that can simulate those clauses for you.
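
A minimal JDBC sketch of pushing pagination into the database could look like this. It assumes a database that accepts the SQL standard OFFSET .. FETCH clause with bind variables, and a hypothetical table book with columns id and title. Other dialects need LIMIT .. OFFSET or one of the alternatives mentioned above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class BookPager {

    // Hypothetical table and columns. The ORDER BY makes the pages deterministic.
    static final String PAGE_SQL =
          "SELECT id, title FROM book "
        + "ORDER BY id "
        + "OFFSET ? ROWS FETCH NEXT ? ROWS ONLY";

    // Pages are 1-based: page 1 starts at offset 0.
    static long offsetOf(int page, int pageSize) {
        return (long) (page - 1) * pageSize;
    }

    // Only the requested page is transferred to Java memory.
    static List<String> fetchPage(Connection connection, int page, int pageSize)
            throws SQLException {
        try (PreparedStatement stmt = connection.prepareStatement(PAGE_SQL)) {
            stmt.setLong(1, offsetOf(page, pageSize));
            stmt.setInt(2, pageSize);
            try (ResultSet rs = stmt.executeQuery()) {
                List<String> titles = new ArrayList<>();
                while (rs.next()) {
                    titles.add(rs.getString("title"));
                }
                return titles;
            }
        }
    }
}
```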

5. Joining data in Java memory

Since the early days of SQL, some developers have had an uneasy feeling when expressing JOINs in their SQL. There is an inherent fear of JOIN being slow. This can be true if a cost-based optimiser chooses to perform a nested loop, possibly loading complete tables into database memory, before creating a joined table source. But that happens rarely. With appropriate predicates, constraints and indexes, MERGE JOIN and HASH JOIN operations are extremely fast. It’s all about the correct metadata (I cannot cite Tom Kyte often enough for this). Nonetheless, there are probably still quite a few Java developers who will load two tables from separate queries into maps and join them in Java memory in one way or another.

The Cure:

If you’re selecting from various tables in various steps, think again and check whether you can express your query in a single statement.

6. Using DISTINCT or UNION to remove duplicates from an accidental cartesian product

With heavy joining, one can lose track of all the relations that are playing a role in a SQL statement. Specifically, if multi-column foreign key relationships are involved, it is possible to forget to add the relevant predicates in JOIN .. ON clauses. This might result in duplicate records, but maybe only in exceptional cases. Some developers may then choose to use DISTINCT to remove those duplicates again. This is wrong in three ways:

- It (may) solve the symptoms but not the problem. It may as well not solve the symptoms in edge-cases.
- It is slow for large result sets with many columns. DISTINCT typically performs a sort, much like ORDER BY, to remove duplicates.
- It is slow for large cartesian products, which will still load lots of data into memory.

The Cure:

As a rule of thumb, when you get unwanted duplicates, always review your JOIN predicates. There’s probably a subtle cartesian product in there somewhere.

7. Not using the MERGE statement

This isn’t really a mistake, but probably some lack of knowledge or some fear towards the powerful MERGE statement. Some databases know other forms of UPSERT statements, e.g. MySQL’s ON DUPLICATE KEY UPDATE clause. But MERGE is really powerful, especially in databases that heavily extend the SQL standard, such as SQL Server.

The Cure:

If you’re UPSERTING by chaining INSERT and UPDATE or by chaining SELECT .. FOR UPDATE and then INSERT or UPDATE, think again. Apart from risking race conditions, you might be able to express a simpler MERGE statement.
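
As a rough sketch, an UPSERT through MERGE and JDBC might look as follows. The statement is written in Oracle style (note the FROM dual), other databases use a slightly different USING clause, and the table account with its columns id and balance is a made-up example.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class AccountUpsert {

    // Oracle-flavoured MERGE on a hypothetical "account" table.
    static final String MERGE_SQL =
          "MERGE INTO account t "
        + "USING (SELECT ? AS id, ? AS balance FROM dual) s "
        + "ON (t.id = s.id) "
        + "WHEN MATCHED THEN UPDATE SET t.balance = s.balance "
        + "WHEN NOT MATCHED THEN INSERT (id, balance) VALUES (s.id, s.balance)";

    // One statement replaces the SELECT-then-INSERT-or-UPDATE round trips.
    static int upsert(Connection connection, long id, BigDecimal balance)
            throws SQLException {
        try (PreparedStatement stmt = connection.prepareStatement(MERGE_SQL)) {
            stmt.setLong(1, id);
            stmt.setBigDecimal(2, balance);
            return stmt.executeUpdate();
        }
    }
}
```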

8. Using aggregate functions instead of window functions

Before the introduction of window functions, the only means to aggregate data in SQL was by using a GROUP BY clause along with aggregate functions in the projection. This works well in many cases, and if aggregated data needs to be enriched with regular data, the grouped query can be pushed down into a joined subquery.

But SQL:2003 defined window functions, which are implemented by many popular database vendors. Window functions can aggregate data on result sets that are not grouped. In fact, each window function supports its own, independent PARTITION BY clause, which is an awesome tool for reporting.

Using window functions will:

- Lead to more readable SQL (fewer dedicated GROUP BY clauses in subqueries)
- Improve performance, as an RDBMS is likely to optimise window functions more easily

The Cure:

When you write a GROUP BY clause in a subquery, consider whether the same result could be produced with a window function instead.
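
To make this concrete, here are two ways of listing each employee next to the average salary of their department, written as SQL strings that could be passed to JDBC. The table employee and its columns are assumptions for the sake of the example.

```java
class WindowVsGroupBy {

    // Classic approach: aggregate in a subquery, then join it back to the detail rows.
    static final String WITH_GROUP_BY =
          "SELECT e.name, e.salary, d.avg_salary "
        + "FROM employee e "
        + "JOIN (SELECT department_id, AVG(salary) AS avg_salary "
        + "      FROM employee "
        + "      GROUP BY department_id) d "
        + "ON d.department_id = e.department_id";

    // Window function: the aggregate is computed per partition, with no subquery or join.
    static final String WITH_WINDOW_FUNCTION =
          "SELECT name, salary, "
        + "AVG(salary) OVER (PARTITION BY department_id) AS avg_salary "
        + "FROM employee";

    public static void main(String[] args) {
        System.out.println(WITH_GROUP_BY);
        System.out.println(WITH_WINDOW_FUNCTION);
    }
}
```

The two queries return the same rows only if department_id is never NULL. The join silently drops employees without a department, because NULL = NULL is not TRUE, whereas the window function keeps them in a partition of their own.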

9. Using in-memory sorting for sort indirections

The SQL ORDER BY clause supports many types of expressions, including CASE expressions, which can be very useful for sort indirections. You should probably never sort data in Java memory because you think that

- SQL sorting is too slow
- SQL sorting cannot do it

The Cure:

If you sort any SQL data in memory, think again if you cannot push sorting into your database. This goes along well with pushing pagination into the database.
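
A sort indirection is a custom ordering such as "URGENT first, then NORMAL, then everything else". A small helper like the one below can generate the corresponding CASE expression with bind variables. The helper and its names are hypothetical, and the column name must come from trusted code, never from user input.

```java
class SortIndirection {

    // Builds "CASE <column> WHEN ? THEN 1 WHEN ? THEN 2 ... ELSE n+1 END".
    // The values to sort by are bound later, in their desired order.
    static String orderByCase(String column, int valueCount) {
        StringBuilder sql = new StringBuilder("CASE ").append(column);
        for (int i = 1; i <= valueCount; i++) {
            sql.append(" WHEN ? THEN ").append(i);
        }
        return sql.append(" ELSE ").append(valueCount + 1).append(" END").toString();
    }

    public static void main(String[] args) {
        // Hypothetical table "ticket" with a "status" column.
        String sql = "SELECT id, status FROM ticket ORDER BY "
                   + orderByCase("status", 2) + ", id";
        System.out.println(sql);
    }
}
```

Binding "URGENT" and "NORMAL" to the two placeholders then sorts urgent tickets first, normal ones second and all others last, without a Comparator in sight.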

10. Inserting lots of records one by one

JDBC knows batching, and you should use it. Do not INSERT thousands of records one by one, creating a new PreparedStatement every time. If all of your records go to the same table, create a batch INSERT statement with a single SQL statement and multiple bind value sets. Depending on your database and database configuration, you may need to commit after a certain number of inserted records, in order to keep the UNDO log slim.

The Cure:

Always batch-insert large sets of data.
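
A minimal sketch of JDBC batching, using PreparedStatement.addBatch() and executeBatch(), could look like this. The table author and its columns are invented for the example, and a good batch size depends on your database and driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsert {

    // Hypothetical table: one SQL statement, many bind value sets.
    static final String INSERT_SQL =
        "INSERT INTO author (first_name, last_name) VALUES (?, ?)";

    // Sends the rows in chunks of batchSize and returns the number of batches executed.
    static int insertAll(Connection connection, List<String[]> names, int batchSize)
            throws SQLException {
        int batches = 0;
        // The statement is prepared once and reused for every row.
        try (PreparedStatement stmt = connection.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (String[] name : names) {
                stmt.setString(1, name[0]);
                stmt.setString(2, name[1]);
                stmt.addBatch();
                if (++pending == batchSize) {
                    stmt.executeBatch();
                    pending = 0;
                    batches++;
                }
            }
            // Flush the last, possibly smaller batch.
            if (pending > 0) {
                stmt.executeBatch();
                batches++;
            }
        }
        return batches;
    }
}
```

Transaction handling is deliberately left to the caller here. If you need intermediate commits to keep the UNDO log slim, disable auto-commit and commit after each executeBatch() call.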

Some interesting books

Some very interesting books on similar topics are

- SQL Antipatterns by Bill Karwin
- SQL Performance Explained by Markus Winand

from:http://blog.jooq.org/2013/07/30/10-common-mistakes-java-developers-make-when-writing-sql/
