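1. Dictionary of common Chinese surnames

This dictionary comes from PanGu Segment (盘古分词), an open-source Chinese word segmenter, which uses it to recognize personal names.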

http://pangusegment.codeplex.com/SourceControl/latest#PanGuSegment/PanGu/Dict/ChsName.cs

// Surnames with obvious ambiguity

"王","张","黄","周","徐","胡","高","林","马","于",

"程","傅","曾","叶","余","夏","钟","田","任","方",

"石","熊","白","毛","江","史","候","龙","万","段",

"雷","钱","汤","易","常","武","赖","文","查",

// Surnames without obvious ambiguity

"赵","肖","孙","李","吴","郑","冯","陈",

"褚","卫","蒋","沈","韩","杨","朱","秦",

"尤","许","何","吕","施","桓","孔","曹",

"严","华","金","魏","陶","姜","戚","谢",

"邹","喻","柏","窦","苏","潘","葛","奚",

"范","彭","鲁","韦","昌","俞","袁","酆",

"鲍","唐","费","廉","岑","薛","贺","倪",

"滕","殷","罗","毕","郝","邬","卞","康",

"卜","顾","孟","穆","萧","尹","姚","邵",

"湛","汪","祁","禹","狄","贝","臧","伏",

"戴","宋","茅","庞","纪","舒","屈","祝",

"董","梁","杜","阮","闵","贾","娄","颜",

"郭","邱","骆","蔡","樊","凌","霍","虞",

"柯","昝","卢","柯","缪","宗","丁","贲",

"邓","郁","杭","洪","崔","龚","嵇","邢",

"滑","裴","陆","荣","荀","惠","甄","芮",

"羿","储","靳","汲","邴","糜","隗","侯",

"宓","蓬","郗","仲","栾","钭","历","戎",

"刘","詹","幸","韶","郜","黎","蓟","溥",

"蒲","邰","鄂","咸","卓","蔺","屠","乔",

"郁","胥","苍","莘","翟","谭","贡","劳",

"冉","郦","雍","璩","桑","桂","濮","扈",

"冀","浦","庄","晏","瞿","阎","慕","茹",

"习","宦","艾","容","慎","戈","廖","庾",

"衡","耿","弘","匡","阙","殳","沃","蔚",

"夔","隆","巩","聂","晁","敖","融","訾",

"辛","阚","毋","乜","鞠","丰","蒯","荆",

"竺","盍","单","欧",

// Compound (two-character) surnames

"司马","上官","欧阳","夏侯","诸葛","闻人",

"东方","赫连","皇甫","尉迟","公羊","澹台",

"公冶","宗政","濮阳","淳于","单于","太叔",

"申屠","公孙","仲孙","轩辕","令狐","徐离",

"宇文","长孙","慕容","司徒","司空","万俟"

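The surname lists above can be used for a simple prefix lookup. The following is a minimal sketch, not PanGu's actual implementation: the function name `match_surname` and the two small sets are illustrative assumptions, and the sets hold only a few entries copied from the lists above. It tries compound surnames first so that "欧阳" is not cut short to "欧".

```python
# Excerpts from the lists above; a real run would load the full dictionary.
SINGLE_SURNAMES = {"王", "张", "赵", "李", "欧", "单", "夏", "万"}
COMPOUND_SURNAMES = {"欧阳", "司马", "单于", "夏侯", "万俟", "诸葛"}


def match_surname(text):
    """Return the surname at the start of `text`, or None if there is none.

    Hypothetical helper: compound surnames are checked before single ones
    (longest match first).
    """
    if text[:2] in COMPOUND_SURNAMES:
        return text[:2]
    if text[:1] in SINGLE_SURNAMES:
        return text[:1]
    return None


print(match_surname("欧阳修"))  # 欧阳
print(match_surname("欧文"))    # 欧
print(match_surname("你好"))    # None
```

Checking two characters before one is a design choice for this sketch: several compound surnames (单于, 夏侯, 万俟) begin with a character that is also a single-character surname.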
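2. Dictionary of first characters in two-character given names

// This dictionary comes from the ChsDoubleName1.txt dictionary of the open-source PanGu Segment (盘古分词), which uses it to recognize personal names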

建, 小, 晓, 文, 志, 国, 玉, 丽, 永, 海, 春, 金, 明, 新, 德, 秀, 红, 亚, 乐, 三,

伟, 雪, 俊, 桂, 爱, 美, 世, 正, 庆, 学, 家, 立, 淑, 振, 云, 华, 光, 惠, 兴, 天, 长, 艳, 慧, 利, 宏, 佳, 瑞, 凤, 荣, 秋,

继, 嘉, 卫, 燕, 思, 维, 少, 福, 忠, 宝, 子, 成, 月, 洪, 东, 一, 泽, 林, 大, 素, 旭, 宇, 智, 锦, 冬, 玲, 雅, 伯, 翠, 传,

启, 剑, 安, 树, 良, 中, 梦, 广, 昌, 元, 万, 清, 静, 友, 宗, 兆, 丹, 克, 彩, 绍, 喜, 远, 朝, 敏, 培, 胜, 祖, 先, 菊, 士,

向, 有, 连, 军, 健, 巧, 耀, 莉, 英, 方, 和, 仁, 孝, 梅, 汉, 兰, 松, 水, 江, 益, 开, 景, 运, 贵, 祥, 青, 芳, 碧, 婷, 龙,

鹏, 自, 顺, 双, 书, 生, 义, 跃, 银, 佩, 雨, 保, 贤, 仲, 鸿, 浩, 加, 定, 炳, 飞, 锡, 柏, 发, 超, 道, 怀, 进, 其, 富, 平,

全, 阳, 吉, 茂, 彦, 诗, 洁, 润, 承, 治, 焕, 如, 君, 增, 善, 希, 根, 应, 勇, 宜, 守, 会, 凯, 育, 湘, 凌, 本, 敬, 博, 延,

3. Dictionary of last characters in two-character given names

// This dictionary comes from the ChsDoubleName2.txt dictionary of the open-source PanGu Segment (盘古分词), which uses it to recognize personal names

薇, 华, 平, 明, 英, 军, 林, 萍, 芳, 玲, 红, 生, 霞, 梅, 文, 荣, 珍, 兰, 娟, 峰, 琴, 云, 辉, 东, 龙, 敏, 伟, 强, 丽, 春, 杰,

燕, 民, 君, 波, 国, 芬, 清, 祥, 斌, 婷, 飞, 良, 忠, 新, 凤, 锋, 成, 勇, 刚, 玉, 元, 宇, 海, 兵, 安, 庆, 涛, 鹏, 亮, 青, 阳,

艳, 松, 江, 莲, 娜, 兴, 光, 德, 武, 香, 俊, 秀, 慧, 雄, 才, 宏, 群, 琼, 胜, 超, 彬, 莉, 中, 山, 富, 花, 宁, 利, 贵, 福, 发,

义, 蓉, 喜, 娥, 昌, 仁, 志, 全, 宝, 权, 美, 琳, 建, 金, 贤, 星, 丹, 根, 和, 珠, 康, 菊, 琪, 坤, 泉, 秋, 静, 佳, 顺, 源, 珊,

达, 欣, 如, 莹, 章, 浩, 勤, 芹, 容, 友, 芝, 豪, 洁, 鑫, 惠, 洪, 旺, 虎, 远, 妮, 森, 妹, 南, 雯, 奇, 健, 卿, 虹, 娇, 媛, 怡,

铭, 川, 进, 博, 智, 来, 琦, 学, 聪, 洋, 乐, 年, 翔, 然, 栋, 凯, 颖, 鸣, 丰, 瑞, 奎, 立, 堂, 威, 雪, 鸿, 晶, 桂, 凡, 娣, 先,

洲, 毅, 雅, 月, 旭, 田, 晖, 方, 恒, 亚, 泽, 风, 银, 高, 贞, 九

4. Dictionary of characters commonly used as single-character given names

// This dictionary comes from the ChsSingleName.txt dictionary of the open-source PanGu Segment (盘古分词), which uses it to recognize personal names

敏, 伟, 勇, 军, 斌, 静, 丽, 涛, 芳, 杰, 萍, 强, 俊, 明, 燕, 磊, 玲, 华, 平, 鹏, 健, 波, 红, 丹, 辉, 超, 艳, 莉, 刚, 娟, 峰,

婷, 亮, 洁, 颖, 琳, 英, 慧, 飞, 霞, 浩, 凯, 宇, 毅, 林, 佳, 云, 莹, 娜, 晶, 洋, 文, 鑫, 欣, 琴, 宁, 琼, 兵, 青, 琦, 翔, 彬,

锋, 阳, 璐, 旭, 蕾, 剑, 虹, 蓉, 建, 倩, 梅, 宏, 威, 博, 君, 力, 龙, 晨, 薇, 雪, 琪, 欢, 荣, 江, 炜, 成, 庆, 冰, 东, 帆, 雷,

楠, 锐, 进, 海, 凡, 巍, 维, 迪, 媛, 玮, 杨, 群, 瑛, 悦, 春, 瑶, 婧, 兰, 茜, 松, 爽, 立, 瑜, 睿, 晖, 聪, 帅, 瑾, 骏, 雯, 晓,

昊, 勤, 新, 瑞, 岩, 星, 忠, 志, 怡, 坤, 康, 航, 利, 畅, 坚, 雄, 智, 萌, 哲, 岚, 洪, 捷, 珊, 恒, 靖, 清, 扬, 昕, 乐, 武, 玉,

诚, 菲, 锦, 凤, 珍, 晔, 妍, 璇, 胜, 菁, 科, 芬, 露, 越, 彤, 曦, 义, 良, 鸣, 芸, 方, 月, 铭, 光, 震, 冬, 源, 政, 虎, 莎, 彪,

蓓, 钢, 凌, 奇, 卫, 彦, 烨, 可, 黎, 川, 淼, 惠, 祥, 然, 三
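
Taken together, the dictionaries describe the shape of a candidate name: a surname followed by either one common single-name character, or a first character and a last character from the two-character lists. The sketch below shows one way to combine them. It is an assumption for illustration only, since the source does not show how PanGu weighs these dictionaries; `split_name` and the four small sets are hypothetical and contain only a few entries copied from the lists above.

```python
# Excerpts only; a real run would load the full dictionaries.
SURNAMES = {"王", "张", "李", "欧阳"}
DOUBLE_FIRST = {"建", "小", "晓", "国"}   # first char of a two-character given name
DOUBLE_LAST = {"华", "平", "明", "军"}    # last char of a two-character given name
SINGLE_GIVEN = {"敏", "伟", "勇", "军"}   # single-character given names


def split_name(candidate):
    """Split `candidate` into (surname, given_name), or return None.

    Hypothetical helper: accepts the string only if every character is
    found in the matching dictionary.
    """
    for surname_len in (2, 1):  # try compound surnames first
        surname = candidate[:surname_len]
        given = candidate[surname_len:]
        if surname not in SURNAMES:
            continue
        if len(given) == 1 and given in SINGLE_GIVEN:
            return surname, given
        if (len(given) == 2
                and given[0] in DOUBLE_FIRST
                and given[1] in DOUBLE_LAST):
            return surname, given
    return None


print(split_name("王建华"))    # ('王', '建华')
print(split_name("李伟"))      # ('李', '伟')
print(split_name("欧阳小明"))  # ('欧阳', '小明')
print(split_name("王桌子"))    # None
```

A strict membership test like this misses names whose characters are outside the lists, so a practical recognizer would more likely treat the dictionaries as evidence to be scored than as a hard filter.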
