
Hashing is an important topic for programmers and computer science students to be familiar with. This article is specifically targeted to students, and programmers with a few months to a year of coding experience.


What Hashing Is

Hashing: generating a value or values from a string using a mathematical function


Hashes are mostly used for three things:


  1. Storing stuff without actually knowing what it is
  2. As a convenient way to remember where you put something
  3. To make sure the thing you received is the thing you wanted

That’s super confusing but bear with me.


How It Works

Put another way, hashing is a non-reversible operation that turns a thing into a completely different thing, and it always produces the same result when you give it the same input.


It’s a bit like hard boiled eggs. You can’t un-boil an egg, but you know what you’ll get out if you put a raw egg in some boiling water for 6-8 minutes. In much the same way, you can’t un-hash something.

Photo by Jason Leung on Unsplash

Here’s a sample of a very simple “hashing function” for integers: it divides a number by 10, then takes the remainder:


function modulo10(egg) {
    // Divide by 10 and keep only the remainder.
    return egg % 10;
}

If egg=55 it will give me 5, but I have no way of turning 5 back into 55. For modulo10(), the numbers 9, 23950829, 309 and 29 will all turn up 9. We have an infinite number of values that could have gone through that hashing function and returned the same thing.


When two things have the same hash, it’s called a collision. In a cryptographic hashing function, it should be very improbable for two values to have the same hash.
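To make collisions concrete, here is a minimal Python sketch of the same remainder-based function. The name `modulo10` mirrors the example above; the helper `find_collisions` is made up purely for illustration.

```python
from collections import defaultdict


def modulo10(egg: int) -> int:
    """Toy hash: the remainder after dividing by 10."""
    return egg % 10


def find_collisions(values):
    """Group inputs by their hash; any group with 2+ members is a collision."""
    groups = defaultdict(list)
    for value in values:
        groups[modulo10(value)].append(value)
    return {h: vs for h, vs in groups.items() if len(vs) > 1}


print(modulo10(55))                                  # 5
print(find_collisions([9, 23950829, 309, 29, 55]))   # the four numbers ending in 9 collide
```

A toy like this collides constantly because it only has ten possible outputs. Cryptographic hashes have such a huge output range that stumbling on a collision by accident should be wildly unlikely.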


There are two types of hashing functions which are used for different things. Fast ones and slow ones.


The fast ones are used for when you don’t care if anyone knows that 5 came from a 25. They’re used in a few data structures where you need to look stuff up really fast. An example is a hash table which is pretty neat (I wrote a whole article about them). Fast hashes are also used for verifying data integrity.
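Here is a rough sketch of how a fast hash helps a hash table find things quickly. `TinyHashTable` is a hypothetical toy, not how any real language implements its dictionaries, and it assumes integer keys and a fixed number of buckets.

```python
class TinyHashTable:
    """A toy hash table using separate chaining (a list per bucket)."""

    def __init__(self, bucket_count: int = 10):
        self.buckets = [[] for _ in range(bucket_count)]

    def _index(self, key: int) -> int:
        # The fast hash decides which bucket a key lives in.
        return key % len(self.buckets)

    def put(self, key: int, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key: int):
        # Only one bucket is searched, not the whole table.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = TinyHashTable()
table.put(55, "raw egg")
table.put(25, "boiled egg")   # collides with 55 (both hash to 5) but is still stored
print(table.get(25))          # boiled egg
```

The hash acts as a note about where you put something: instead of scanning everything, the table jumps straight to one bucket.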


Data Integrity

Let's say I torrent a piece of software, the ISO for a Linux distro for example. I might be unsure if what I got is what I meant to download. I could have missed a piece in transit, it could be an older version, or someone may have tampered with it. Lucky for me, I can go to an authority and find a checksum which I can compare my file against. A checksum is the value the developers got when they hashed the file they released. Since I can hash the file I got in the exact same way and compare the two values, I can verify that I have the correct file.
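A checksum comparison might look roughly like this in Python. This is a sketch under assumptions: it uses SHA-256 (the article does not say which algorithm a given project publishes), and it creates a small temporary file to stand in for the downloaded ISO.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large ISOs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a downloaded file; a real ISO would already be on disk.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is a Linux ISO")
    download_path = tmp.name

# Stand-in for the checksum published by the developers.
published_checksum = hashlib.sha256(b"pretend this is a Linux ISO").hexdigest()

matches = sha256_of_file(download_path) == published_checksum
print("file is intact" if matches else "file differs from the release")
os.remove(download_path)
```

If even one byte of the file had changed, the two hex strings would not match.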


You can also use slow hashes for data integrity, but it's not a huge deal if you use a fast one like MD5.


Passwords

Slow hashes are for when you need to keep whatever you hashed a secret. Because they’re slow and take a lot of computing power, they’re harder to ‘crack’ or figure out what their original value was. Slow hashes are perfect for passwords. This is why we talk about ‘cracking passwords’.


On some sites, when you enter a password, the site matches what you typed in with what it has on the server. However, it doesn’t actually know what your password is. When you sign up, the site generates a bit of random data (a salt), tacks it on to the password you chose, and puts it through a hashing function. It then stores the result of that hash and the salt it used.


When you want to use your password to log in again, it grabs the salt (which is usually kept in the same place as the password hash), does the same process again, and then compares the two results.
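The sign-up and log-in steps can be sketched like this. It is a minimal illustration, not any particular site's code: the function names, the record layout, and the iteration count are all assumptions, and it uses PBKDF2 from Python's standard library as the slow hash.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your hardware


def sign_up(password: str) -> dict:
    """Return what the site stores: the salt and the hash, never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return {"salt": salt, "hash": digest}


def log_in(record: dict, attempt: str) -> bool:
    """Repeat the same process with the stored salt and compare the results."""
    digest = hashlib.pbkdf2_hmac("sha256", attempt.encode(), record["salt"], ITERATIONS)
    return hmac.compare_digest(digest, record["hash"])


record = sign_up("correct horse battery staple")
print(log_in(record, "correct horse battery staple"))  # True
print(log_in(record, "pa$$word"))                      # False
```

Notice that the stored record contains no password at all, which is the whole point.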


How to Crack Passwords

Remember, since it’s impossible to know for 100% sure what the original value of a hash was, we have to use our best guess. Most of the time, this involves using a list of common passwords and trying each of them against each hash. To do that you have to compute each one, so the slower the hash, the more expensive it will be for a hacker to guess passwords.
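A guessing attack of this kind can be sketched in a few lines. The leaked hashes and the word list here are made up for the example, and it targets unsalted MD5 because that is the easiest case to show.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode()).hexdigest()


def crack(leaked_hashes, wordlist):
    """Hash every guess and see which leaked hashes it matches."""
    cracked = {}
    leaked = set(leaked_hashes)
    for guess in wordlist:
        guess_hash = md5_hex(guess)      # one hash computation per guess
        if guess_hash in leaked:
            cracked[guess_hash] = guess
    return cracked


# Hypothetical leaked database of unsalted MD5 hashes.
leaked_hashes = [md5_hex("pa$$word"), md5_hex("Tr0ub4dor&3-unlikely-guess")]
common_passwords = ["123456", "password", "pa$$word", "qwerty"]

found = crack(leaked_hashes, common_passwords)
print(list(found.values()))   # ['pa$$word']
```

The attacker pays for one hash computation per guess, so a hash that takes a thousand times longer makes the whole list a thousand times more expensive to try.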


A salt is important too.


Let's say User1 and User2 both used pa$$word as their passwords. The MD5 hash for pa$$word is A61A78E492EE60C63ED8F2BB3A6A0072. Hackers already know what the hashes for all the top passwords are. In fact, you can even look up MD5 hashes on sites like crackstation.net. Additionally, if a password is less common, they can guess it once and then compromise the accounts of everyone else who used that password.


If I add a salt, then the hashes will be different. For example, using usernames as a salt (just an example, not a good idea in practice):


user1.pa$$word = 8CF41DEBA430F88EBC5DDA0936B3435B
user2.pa$$word = 5161758DEEF000FA5C190573574FAFB9 # <-- completely different hash

See? Completely different hashes. If we had used something other than MD5, those user accounts would be as safe as they can be (which is not very because ‘pa$$word’ is a terrible password).
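You can check the effect of a salt yourself with a short script. The `username.password` concatenation is my assumption about how the example strings were built, so the sketch compares hashes to each other rather than to the exact values quoted in the article.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode()).hexdigest()


password = "pa$$word"

# Without a salt, identical passwords produce identical hashes.
unsalted_user1 = md5_hex(password)
unsalted_user2 = md5_hex(password)

# With the username as a (weak, illustrative) salt, the hashes diverge.
salted_user1 = md5_hex("user1." + password)
salted_user2 = md5_hex("user2." + password)

print(unsalted_user1 == unsalted_user2)   # True
print(salted_user1 == salted_user2)       # False
```

In practice the salt would be random data generated per user, not the username.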


Goodbye MD5

I used a pretty bad example of a slow hash. MD5 was designed as a general-purpose cryptographic hash, and for a long time it was considered good enough to use on passwords, up until around 2005. Now it is considered broken and unsafe to use, mostly because it's too fast. Computers have gotten more powerful, so we need slower, stronger hashes. Some better alternatives nowadays are bcrypt and PBKDF2.


When Technology Moves Too Fast

Unfortunately, MD5 is still widely used. If you look at HaveIBeenPwned.com and search for ‘MD5’, lots of results come up from sites that were hacked long after 2005. Why haven’t companies moved away from this highly insecure method?


Part of the problem is that overhauling software, much like cracking secure passwords, can be time consuming and expensive. The other problem is the nature of hashing itself.


If you don’t actually know what anyone’s password is, you can’t just change the hashing method. Since you can’t turn a hash back into a password, you definitely can’t turn a hash into a different hash that works for the same password.


The best method to deal with this is to send out an email and force everyone to change their passwords. Users really hate this, so many companies have opted to re-hash passwords the next time the user logs in, but still support the old method until every password has been replaced. That’s why you’ll see MD5 on some sites which also used another method.
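The re-hash-on-login approach could look something like the sketch below. Everything here is hypothetical: the `scheme` field, the record layout, and the choice of PBKDF2 as the replacement are assumptions made to illustrate the idea.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor


def legacy_md5(password: str, salt: bytes) -> bytes:
    return hashlib.md5(salt + password.encode()).digest()


def modern_pbkdf2(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


SCHEMES = {"md5": legacy_md5, "pbkdf2": modern_pbkdf2}


def log_in(record: dict, attempt: str) -> bool:
    """Verify with whatever scheme the record uses, then upgrade old records."""
    expected = SCHEMES[record["scheme"]](attempt, record["salt"])
    if not hmac.compare_digest(expected, record["hash"]):
        return False
    if record["scheme"] != "pbkdf2":
        # The plaintext is only available right now, so re-hash it here.
        record["salt"] = os.urandom(16)
        record["hash"] = modern_pbkdf2(attempt, record["salt"])
        record["scheme"] = "pbkdf2"
    return True


old_salt = os.urandom(16)
record = {"scheme": "md5", "salt": old_salt, "hash": legacy_md5("hunter2", old_salt)}

print(log_in(record, "hunter2"), record["scheme"])   # True pbkdf2
```

Accounts that never log in again keep their old MD5 hash, which is why both methods end up living side by side.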


What Hashing Isn’t

Encoding and encryption are two things that may be confused with hashing. They all have one thing in common: they turn data into other data that looks different to a human.


Encrypting

Encryption is different from hashing because it allows you to turn encrypted data back into what it was originally: to decrypt it. To do this you need a special key.
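Here is a toy illustration of that round trip. It is deliberately not a real cipher you should use: it just XORs the message with a random key of the same length, which is enough to show that the same key gets the original data back. The function name `xor_bytes` is made up for this sketch.

```python
import hashlib
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each byte with a key byte. Applying it twice undoes it."""
    if len(key) < len(data):
        raise ValueError("this toy needs a key at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet me at noon"
key = secrets.token_bytes(len(message))   # the "special key"

ciphertext = xor_bytes(message, key)
decrypted = xor_bytes(ciphertext, key)    # the same key turns it back
print(decrypted)                          # b'meet me at noon'

# A hash has no key and no matching "un-hash" function to call.
print(hashlib.sha256(message).hexdigest())
```

For real encryption, reach for a vetted library rather than anything hand-rolled like this.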


Sometimes you might hear bloggers or tech writers say “passwords are encrypted”. This is not technically the case. Passwords should always be ‘hashed’, with one exception: when they are in transit between your keyboard and the program that hashes them.


Encoding

Students and novice programmers often confuse encoding with hashing or encrypting. This is not good because encoding, like encryption, allows you to turn encoded data back into its original form, except you don’t need a key to do it at all. Anyone can decode encoded data provided they know which encoding was used. Encoding data does not protect it from prying eyes.
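A quick way to convince yourself is to encode something with Base64 and decode it again. This sketch uses only Python's standard `base64` module; the sample string is made up.

```python
import base64

secret = "my password is pa$$word"

encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")
print(encoded)   # looks scrambled, but it is not protected

# No key required: anyone who recognizes Base64 can reverse it.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)   # my password is pa$$word
```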


An example is JWTs: JSON Web Tokens.


A JWT is not legible to a human unless you can convert Base64 in your head (I doubt anyone could do that for a string that long).



JWTs are pretty cool! However, students and newbies often look at them and think the data is secret because they can’t read it. In reality, JWTs are Base64Url encoded, not hashed or encrypted. This means anyone can read the first and second parts of them (in fact there’s a handy tool for it, try it out). The signature at the end is proof that it really came from where it claims to have come from. You can encrypt JWTs if you want, but they are readable by default.
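To see how readable a JWT really is, this sketch builds a sample HS256-style token and then reads its payload back without ever touching the key. The claims, the key, and the helper names are all made up, and a real project would normally use a JWT library rather than hand-rolling this.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    padding = "=" * (-len(text) % 4)   # restore the padding JWTs strip off
    return base64.urlsafe_b64decode(text + padding)


# Build a sample token; the claims and the server key are hypothetical.
server_key = b"hypothetical-server-secret"
header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({"sub": "user1", "admin": False}).encode())
signature = b64url_encode(
    hmac.new(server_key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
)
token = f"{header}.{payload}.{signature}"

# Anyone holding the token can read the first two parts without the key.
header_part, payload_part, _ = token.split(".")
claims = json.loads(b64url_decode(payload_part))
print(claims)   # {'sub': 'user1', 'admin': False}
```

The key is only needed to create or check the signature, not to read what the token says.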


Does this mean JWTs are insecure? No! This is by design. Just don’t put anything you don’t want the end user or a hacker to see in one.


Summary

Hashing is pretty cool. You can use it to:


  1. make hash tables that can store data in a way that makes it fast to retrieve
  2. store passwords in a way that keeps them super secret
  3. verify the integrity of data in case it was corrupted in transit or tampered with
  4. do a whole bunch of other stuff I didn’t cover.


Hashing is not the same as encoding or encrypting and it’s important to understand the difference between these.


Translated from: https://medium.com/@jasminedevv/hashing-whats-it-for-fb0340c3330c



