
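  • Basics
  • Basic analysis
    • Generating a binary PED file set
    • Summary statistics: missing rates
    • Summary statistics: allele frequencies
    • Basic association analysis
    • Genotypic and other association models
    • Stratification analysis
    • Association analysis, accounting for clusters
    • Hardy-Weinberg equilibrium analysis
  • Known differences in PLINK 2.0
  • Result files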

Basics

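This walkthrough uses PLINK 1.9 with the hapmap1 sample data.
The runtime environment is Windows.
The content below follows the tutorial written for version 1.07, with each step re-tested on version 1.9 (tutorial link).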

Basic analysis

All commands are run from inside the plink directory. Data files live in the data directory and output goes to the out directory.
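The commands in this walkthrough write into numbered subdirectories of out (out/1.bed, out/2.miss, and so on). As far as I can tell PLINK does not create missing output directories itself, so a small sketch like the one below can prepare them up front. It is an assumption about setup, not part of the original run, and the name make_output_dirs is hypothetical.

```python
from pathlib import Path

# Subdirectory names taken from the --out arguments used in this walkthrough.
OUT_SUBDIRS = ["1.bed", "2.miss", "3.freq", "4.assoc", "5.model", "6.cluster", "7.hardy"]


def make_output_dirs(base="out"):
    """Create every output subdirectory under `base` and return the paths."""
    created = []
    for name in OUT_SUBDIRS:
        path = Path(base) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created


if __name__ == "__main__":
    for path in make_output_dirs("out"):
        print(path)
```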

Generating a binary PED file set

plink.exe --noweb --file data/hapmap1 --make-bed --out out/1.bed/hapmap
PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to out/1.bed/hapmap.log.
Options in effect:
  --file data/hapmap1
  --make-bed
  --noweb
  --out out/1.bed/hapmap

Note: --noweb has no effect since no web check is implemented yet.
10125 MB RAM detected; reserving 5062 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (83534 variants, 89 people).
--file: out/1.bed/hapmap-temporary.bed + out/1.bed/hapmap-temporary.bim +
out/1.bed/hapmap-temporary.fam written.
83534 variants loaded from .bim file.
89 people (89 males, 0 females) loaded from .fam.
89 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 89 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99441.
83534 variants and 89 people pass filters and QC.
Among remaining phenotypes, 44 are cases and 45 are controls.
--make-bed to out/1.bed/hapmap.bed + out/1.bed/hapmap.bim +
out/1.bed/hapmap.fam ... done.

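--noweb skips the online version check, so PLINK does not test whether it is the latest release. In 1.9 the flag has no effect, as the note in the log above states.
To keep only individuals whose genotyping completeness reaches a given level, add --mind {maximum missing rate as a decimal}. For example, --mind 0.05 keeps individuals with at least 95% of their genotypes called.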

plink.exe --noweb --file data/hapmap1 --make-bed --mind 0.05 --out out/1.bed/hapmap95

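After both commands finish, the out/1.bed directory holds a .bed, .bim and .fam file for each output prefix (hapmap and hapmap95), alongside the .log files.

The file contents are not shown here; the processed files will be uploaded later.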
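To make the --mind threshold concrete, the sketch below computes each individual's missing-genotype rate and keeps those that do not exceed the threshold. The data layout (a dict of genotype strings with "0 0" marking a missing call) and the function names are assumptions for illustration, not PLINK internals.

```python
def missing_rate(genotypes, missing="0 0"):
    """Fraction of genotype calls that are missing for one individual."""
    if not genotypes:
        return 0.0
    return sum(1 for g in genotypes if g == missing) / len(genotypes)


def filter_by_mind(samples, mind=0.05):
    """Keep individuals whose missing rate does not exceed `mind`."""
    return {iid: g for iid, g in samples.items() if missing_rate(g) <= mind}


# Hypothetical toy data: 3 individuals, 20 variants each.
samples = {
    "IND1": ["A A"] * 20,                # no missing calls
    "IND2": ["A G"] * 19 + ["0 0"],      # 1 of 20 missing
    "IND3": ["G G"] * 17 + ["0 0"] * 3,  # 3 of 20 missing
}
kept = filter_by_mind(samples, mind=0.05)
print(sorted(kept))
```

In this toy set only IND3 is above the 5% threshold, so it is the one that gets dropped.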

Summary statistics: missing rates

plink.exe --noweb --bfile out/1.bed/hapmap --missing --out out/2.miss/hapmap_miss_sta
PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to out/2.miss/hapmap_miss_sta.log.
Options in effect:
  --bfile out/1.bed/hapmap
  --missing
  --noweb
  --out out/2.miss/hapmap_miss_sta

Note: --noweb has no effect since no web check is implemented yet.
10125 MB RAM detected; reserving 5062 MB for main workspace.
83534 variants loaded from .bim file.
89 people (89 males, 0 females) loaded from .fam.
89 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 89 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99441.
--missing: Sample missing data report written to
out/2.miss/hapmap_miss_sta.imiss, and variant-based missing data report written
to out/2.miss/hapmap_miss_sta.lmiss.

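The resulting files are the .imiss and .lmiss reports named in the log above.
Note that --bfile refers to all three files sharing the prefix hapmap (hapmap.bed, hapmap.bim, hapmap.fam); they can also be given individually with --bed, --bim and --fam. In addition, --chr restricts the missing-rate analysis to a specific chromosome.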
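The .imiss report is a whitespace-delimited table with a header row (FID, IID, MISS_PHENO, N_MISS, N_GENO, F_MISS). A minimal sketch for reading it and listing individuals above a missing-rate cutoff might look like this. The sample rows are made up for illustration, and read_plink_table is a hypothetical helper name.

```python
def read_plink_table(text):
    """Parse a whitespace-delimited PLINK report into a list of dicts."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


# Hypothetical excerpt in .imiss layout (not taken from the real output).
imiss_text = """
 FID   IID MISS_PHENO   N_MISS   N_GENO   F_MISS
  F1    I1          N       10     1000     0.01
  F2    I2          N       80     1000     0.08
"""

rows = read_plink_table(imiss_text)
high_missing = [r["IID"] for r in rows if float(r["F_MISS"]) > 0.05]
print(high_missing)
```

For a real run, the text would come from reading out/2.miss/hapmap_miss_sta.imiss instead of the inline string.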

Summary statistics: allele frequencies

plink.exe --noweb --bfile out/1.bed/hapmap --freq --out out/3.freq/hapmap_freq_stat
PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to out/3.freq/hapmap_freq_stat.log.
Options in effect:
  --bfile out/1.bed/hapmap
  --freq
  --noweb
  --out out/3.freq/hapmap_freq_stat

Note: --noweb has no effect since no web check is implemented yet.
10125 MB RAM detected; reserving 5062 MB for main workspace.
83534 variants loaded from .bim file.
89 people (89 males, 0 females) loaded from .fam.
89 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 89 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99441.
--freq: Allele frequencies (founders only) written to
out/3.freq/hapmap_freq_stat.frq .


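To report frequencies separately for each group of samples, pass --within {path to cluster file}. The file assigns every sample to a cluster, and PLINK then reports the frequencies per cluster, as in the example below.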

plink.exe --noweb --bfile out/1.bed/hapmap --freq --within data/pop.phe --out out/3.freq/hapmap_freq_stat_pop

Likewise, --snp {SNP ID} restricts the allele-frequency analysis to one specific SNP.

plink.exe --noweb --bfile out/1.bed/hapmap --freq --within data/pop.phe --snp rs1891905 --out out/3.freq/hapmap_freq_stat_pop
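The idea behind a cluster-stratified frequency report can be sketched as counting one allele among the non-missing calls of each cluster. The genotype strings, cluster labels and function names below are hypothetical and only illustrate the arithmetic; PLINK's own report has more columns.

```python
def allele_frequency(genotypes, allele, missing="0"):
    """Frequency of `allele` among non-missing allele calls, or None if none."""
    observed = [a for g in genotypes for a in g.split() if a != missing]
    if not observed:
        return None
    return observed.count(allele) / len(observed)


def stratified_frequency(genotype_of, cluster_of, allele):
    """Compute the allele frequency separately within each cluster."""
    groups = {}
    for iid, genotype in genotype_of.items():
        groups.setdefault(cluster_of[iid], []).append(genotype)
    return {c: allele_frequency(gs, allele) for c, gs in groups.items()}


# Hypothetical genotypes at one SNP and a two-cluster assignment.
genotype_of = {"I1": "A A", "I2": "A G", "I3": "G G", "I4": "A G", "I5": "0 0"}
cluster_of = {"I1": "1", "I2": "1", "I3": "2", "I4": "2", "I5": "2"}

result = stratified_frequency(genotype_of, cluster_of, "G")
print(result)
```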

Basic association analysis

plink.exe --noweb --bfile out/1.bed/hapmap --assoc --out out/4.assoc/assco
PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to out/4.assoc/assco.log.
Options in effect:
  --assoc
  --bfile out/1.bed/hapmap
  --noweb
  --out out/4.assoc/assco

Note: --noweb has no effect since no web check is implemented yet.
10125 MB RAM detected; reserving 5062 MB for main workspace.
83534 variants loaded from .bim file.
89 people (89 males, 0 females) loaded from .fam.
89 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 89 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99441.
83534 variants and 89 people pass filters and QC.
Among remaining phenotypes, 44 are cases and 45 are controls.
Writing C/C --assoc report to out/4.assoc/assco.assoc ...
done.


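Adding --adjust also writes a table of multiple-testing-adjusted p-values, sorted by significance, to an .adjusted file next to the main report.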
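For intuition, a case/control allelic test can be sketched as a 1-df chi-square on a 2x2 table of allele counts, followed by a Bonferroni adjustment across tests. This is a simplified illustration under the assumption that no row or column total is zero; it is not a reproduction of PLINK's code, and the counts are made up.

```python
import math


def allelic_chisq(case_a1, case_a2, ctrl_a1, ctrl_a2):
    """1-df chi-square statistic for a 2x2 table of allele counts."""
    table = [[case_a1, case_a2], [ctrl_a1, ctrl_a2]]
    n = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    chisq = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n  # assumes no zero margins
            chisq += (table[i][j] - expected) ** 2 / expected
    return chisq


def chisq1_pvalue(chisq):
    """Upper-tail p-value of a chi-square with 1 df: erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chisq / 2.0))


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Hypothetical allele counts for two SNPs: (case A1, case A2, ctrl A1, ctrl A2).
snps = [(30, 58, 15, 75), (40, 48, 42, 48)]
pvals = [chisq1_pvalue(allelic_chisq(*counts)) for counts in snps]
print(pvals)
print(bonferroni(pvals))
```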

Genotypic and other association models

plink.exe --noweb --bfile out/1.bed/hapmap --model --out out/5.model/model
PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to out/5.model/model.log.
Options in effect:
  --bfile out/1.bed/hapmap
  --model
  --noweb
  --out out/5.model/model

Note: --noweb has no effect since no web check is implemented yet.
10125 MB RAM detected; reserving 5062 MB for main workspace.
83534 variants loaded from .bim file.
89 people (89 males, 0 females) loaded from .fam.
89 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 89 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99441.
83534 variants and 89 people pass filters and QC.
Among remaining phenotypes, 44 are cases and 45 are controls.
Writing --model report to out/5.model/model.model ... done.


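Adding --cell 0 --snp rs2222162 restricts the analysis to that SNP and lowers the minimum count required in each contingency-table cell to 0 (the default is 5), so the genotypic tests are not skipped for sparse tables.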
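The effect of the cell threshold can be sketched with a 2-df genotypic chi-square on a 2x3 table of genotype counts: if any cell falls below the minimum, the test is skipped. The counts and the function name are hypothetical, and this is an illustration of the idea rather than PLINK's exact procedure.

```python
import math


def genotypic_test(case_counts, ctrl_counts, min_cell=5):
    """2-df chi-square on a 2x3 genotype table.

    Each argument is (n_A1A1, n_A1A2, n_A2A2). Returns (chisq, p), or None
    when a cell is below `min_cell` or a genotype column is empty.
    """
    table = [list(case_counts), list(ctrl_counts)]
    if min(min(r) for r in table) < min_cell:
        return None  # test skipped: table too sparse for the chosen threshold
    col = [table[0][j] + table[1][j] for j in range(3)]
    if 0 in col:
        return None  # 2-df test is not defined with an empty genotype column
    row = [sum(r) for r in table]
    n = sum(row)
    chisq = 0.0
    for i in range(2):
        for j in range(3):
            expected = row[i] * col[j] / n
            chisq += (table[i][j] - expected) ** 2 / expected
    # For 2 degrees of freedom the upper-tail probability is exp(-x / 2).
    return chisq, math.exp(-chisq / 2.0)


# Hypothetical genotype counts with one sparse cell.
cases, controls = (3, 19, 22), (17, 22, 6)
print(genotypic_test(cases, controls))              # skipped at the default of 5
print(genotypic_test(cases, controls, min_cell=0))  # computed with the threshold at 0
```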

Stratification analysis

plink.exe --noweb --bfile out/1.bed/hapmap --cluster --out out/6.cluster/cluster
PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to out/6.cluster/cluster.log.
Options in effect:
  --bfile out/1.bed/hapmap
  --cluster
  --noweb
  --out out/6.cluster/cluster

Note: --noweb has no effect since no web check is implemented yet.
10125 MB RAM detected; reserving 5062 MB for main workspace.
83534 variants loaded from .bim file.
89 people (89 males, 0 females) loaded from .fam.
89 phenotype values loaded from .fam.
Using up to 4 threads (change this with --threads).
Before main variant filters, 89 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99441.
83534 variants and 89 people pass filters and QC.
Among remaining phenotypes, 44 are cases and 45 are controls.
Distance matrix calculation complete.
Clustering... done.
Cluster solution written to out/6.cluster/cluster.cluster1 ,
out/6.cluster/cluster.cluster2 , and out/6.cluster/cluster.cluster3 .


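Adding --mc 2 --ppc 0.05 constrains the clustering: --mc sets the maximum cluster size (here 2 individuals), and --ppc sets the p-value threshold of the pairwise population concordance test, below which two individuals are not merged into the same cluster.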
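A .cluster2 file lists one individual per line as FID, IID and cluster label. A quick sanity check on a constrained run is to count cluster sizes and confirm none exceeds the --mc limit. The rows below are invented for illustration, and cluster_sizes is a hypothetical helper.

```python
from collections import Counter


def cluster_sizes(text):
    """Count how many individuals fall in each cluster of a .cluster2 listing."""
    sizes = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:  # FID, IID, cluster label
            sizes[fields[2]] += 1
    return sizes


# Hypothetical .cluster2 content (FID IID cluster).
cluster2_text = """
F1 I1 0
F2 I2 0
F3 I3 1
F4 I4 1
F5 I5 2
"""

sizes = cluster_sizes(cluster2_text)
print(dict(sizes))
print("largest cluster:", max(sizes.values()))
```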

Association analysis, accounting for clusters

plink.exe --noweb --bfile out/1.bed/hapmap --mh --within out/6.cluster/cluster_mc_ppc.cluster2 --adjust --out out/6.cluster/aac1
PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to out/6.cluster/aac1.log.
Options in effect:
  --adjust
  --bfile out/1.bed/hapmap
  --mh
  --noweb
  --out out/6.cluster/aac1
  --within out/6.cluster/cluster_mc_ppc.cluster2

Note: --noweb has no effect since no web check is implemented yet.
10125 MB RAM detected; reserving 5062 MB for main workspace.
83534 variants loaded from .bim file.
89 people (89 males, 0 females) loaded from .fam.
89 phenotype values loaded from .fam.
--within: 45 clusters loaded, covering a total of 89 people.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 89 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99441.
83534 variants and 89 people pass filters and QC.
Among remaining phenotypes, 44 are cases and 45 are controls.
--mh/--bd: 21 valid clusters, with a total of 21 cases and 21 controls.
Writing report to out/6.cluster/aac1.cmh ... done.
--adjust: Genomic inflation est. lambda (based on median chisq) = 1.07656.
--adjust values (66852 variants) written to out/6.cluster/aac1.cmh.adjusted .
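The genomic inflation estimate in the log is based on the median chi-square statistic. A minimal sketch of that calculation divides the observed median by the theoretical median of a 1-df chi-square (about 0.4549). The constant PLINK uses and its rounding may differ slightly, and flooring lambda at 1 before correcting is a common convention that I am assuming here. The input values are made up.

```python
import statistics

# Theoretical median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chisq_values):
    """Lambda: observed median chi-square over the expected median."""
    return statistics.median(chisq_values) / CHI2_1DF_MEDIAN


def gc_correct(chisq_values, lam):
    """Divide each statistic by lambda (assumed floored at 1, by convention)."""
    lam = max(lam, 1.0)
    return [x / lam for x in chisq_values]


# Hypothetical chi-square statistics from an association scan.
chisq_values = [0.1, 0.3, 0.5, 0.9, 2.4, 6.8]
lam = genomic_inflation(chisq_values)
print(lam)
print(gc_correct(chisq_values, lam))
```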

Hardy-Weinberg equilibrium analysis

plink.exe --noweb --bfile out/1.bed/hapmap --hardy --out out/7.hardy/hardy
PLINK v1.90b6.10 64-bit (17 Jun 2019)          www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to out/7.hardy/hardy.log.
Options in effect:
  --bfile out/1.bed/hapmap
  --hardy
  --noweb
  --out out/7.hardy/hardy

Note: --noweb has no effect since no web check is implemented yet.
10125 MB RAM detected; reserving 5062 MB for main workspace.
83534 variants loaded from .bim file.
89 people (89 males, 0 females) loaded from .fam.
89 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 89 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.99441.
--hardy: Writing Hardy-Weinberg report (founders only) to out/7.hardy/hardy.hwe
... done.
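To show what a Hardy-Weinberg check compares, the sketch below derives expected genotype counts from the observed allele frequency and applies a 1-df chi-square test. PLINK's --hardy reports an exact test, so this chi-square version is a swapped-in simplification for illustration only, and the genotype counts are invented.

```python
import math


def hwe_chisq(n_aa, n_ab, n_bb):
    """Return (expected_counts, chisq, p) for a chi-square HWE test."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    pvalue = math.erfc(math.sqrt(chisq / 2.0))  # upper tail, 1 df
    return expected, chisq, pvalue


# Hypothetical genotype counts with an excess of homozygotes.
expected, chisq, pvalue = hwe_chisq(40, 20, 40)
print(expected, chisq, pvalue)
```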

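Known differences in PLINK 2.0

  1. --noweb is no longer needed.
  2. --file is no longer recognized; analysis starts directly from --bfile or --bgen, and file sets written by 1.9 for --bfile remain compatible.
  3. --assoc is replaced by --glm.
  4. --model is not recognized, so analyses that depend on its output are currently unavailable.
  5. --cluster is not recognized, so analyses that depend on its output are currently unavailable.
  6. Among the commands that are available, some produce different output (file format and content).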

Result files

V1.9&V2.0
