What opens a GFF file: an introduction to the GFF format, the PSL format, GBrowse, and other visualization tools
http://gmod.org/wiki/GFF
Introduction to GFF
GFF3: http://song.sourceforge.net/gff3.shtml
GFF: http://www.sanger.ac.uk/Software/formats/GFF/
GFF2PS: http://genome.imim.es/software/gfftools/GFF2PS.html
http://gmod.org/wiki/GBrowse
GBrowse
http://xerad.systemsbiology.net/MotifMogulServer/MotifMogulFAQ.html
PSL format
PSL lines represent alignments, and are typically taken from
files generated by BLAT or psLayout. See the BLAT documentation for more details. All of the
following fields are required on each data line within a PSL
file:
matches - Number of bases that match that aren't repeats
misMatches - Number of bases that don't match
repMatches - Number of bases that match but are part of repeats
nCount - Number of 'N' bases
qNumInsert - Number of inserts in query
qBaseInsert - Number of bases inserted in query
tNumInsert - Number of inserts in target
tBaseInsert - Number of bases inserted in target
strand - '+' or '-' for query strand. For translated alignments, second '+' or '-' is for genomic strand
qName - Query sequence name
qSize - Query sequence size
qStart - Alignment start position in query
qEnd - Alignment end position in query
tName - Target sequence name
tSize - Target sequence size
tStart - Alignment start position in target
tEnd - Alignment end position in target
blockCount - Number of blocks in the alignment (a block contains no gaps)
blockSizes - Comma-separated list of sizes of each block
qStarts - Comma-separated list of starting positions of each block in query
tStarts - Comma-separated list of starting positions of each block in target
Example:
Here is an example of an annotation track in PSL format. Note that
line breaks have been inserted into the PSL lines in this example
for documentation display purposes; in a real file each alignment
occupies a single line.
track name=fishBlats description="Fish BLAT" useScore=1
59 9 0 0 1 823 1 96 +- FS_CONTIG_48080_1 1955 171 1062 chr22
47748585 13073589 13073753 2 48,20, 171,1042,
34674832,34674976,
59 7 0 0 1 55 1 55 +- FS_CONTIG_26780_1 2825 2456 2577 chr22
47748585 13073626 13073747 2 21,45, 2456,2532,
34674838,34674914,
59 7 0 0 1 55 1 55 -+ FS_CONTIG_26780_1 2825 2455 2676 chr22
47748585 13073727 13073848 2 45,21, 249,349, 13073727,13073827,
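The 21 fields listed above map naturally onto a small record type. The following is a minimal sketch, not a reference parser: the names `PslRecord` and `parse_psl_line` are made up for illustration, and the code assumes that sequence names contain no whitespace, so splitting on any whitespace is safe. The sample line is the first record of the example above with its display line breaks removed.

```python
from dataclasses import dataclass
from typing import List

PSL_FIELD_COUNT = 21  # every field in the list above is required


@dataclass
class PslRecord:
    matches: int
    mis_matches: int
    rep_matches: int
    n_count: int
    q_num_insert: int
    q_base_insert: int
    t_num_insert: int
    t_base_insert: int
    strand: str
    q_name: str
    q_size: int
    q_start: int
    q_end: int
    t_name: str
    t_size: int
    t_start: int
    t_end: int
    block_count: int
    block_sizes: List[int]
    q_starts: List[int]
    t_starts: List[int]


def _int_list(text: str) -> List[int]:
    # PSL lists end with a trailing comma ("48,20,"), so skip empty items.
    return [int(item) for item in text.split(",") if item]


def parse_psl_line(line: str) -> PslRecord:
    f = line.split()
    if len(f) != PSL_FIELD_COUNT:
        raise ValueError(f"expected {PSL_FIELD_COUNT} fields, got {len(f)}")
    return PslRecord(
        *(int(x) for x in f[0:8]),           # matches .. tBaseInsert
        f[8], f[9],                          # strand, qName
        int(f[10]), int(f[11]), int(f[12]),  # qSize, qStart, qEnd
        f[13],                               # tName
        int(f[14]), int(f[15]), int(f[16]),  # tSize, tStart, tEnd
        int(f[17]),                          # blockCount
        _int_list(f[18]), _int_list(f[19]), _int_list(f[20]),
    )


sample = ("59 9 0 0 1 823 1 96 +- FS_CONTIG_48080_1 1955 171 1062 chr22 "
          "47748585 13073589 13073753 2 48,20, 171,1042, 34674832,34674976,")
record = parse_psl_line(sample)
print(record.q_name, record.strand, record.block_sizes)
```

A useful sanity check on any parsed record is that `block_count` equals the length of each of the three block lists.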
Be aware that the coordinates for a negative strand in a PSL
line are handled in a special way. In the qStart and
qEnd fields, the coordinates indicate the position where
the query matches from the point of view of the forward strand,
even when the match is on the reverse strand. However, in the
qStarts list, the coordinates are reversed.
Example:
Here is a 30-mer containing 2 blocks that align on the minus strand
and 2 blocks that align on the plus strand (this sometimes can
happen in response to assembly errors):
0         1         2         3 tens position in query
0123456789012345678901234567890 ones position in query
            ++++          +++++ plus strand alignment on query
    --------    ----------      minus strand alignment on query
Plus strand:
qStart=12
qEnd=31
blockSizes=4,5
qStarts=12,26
Minus strand:
qStart=4
qEnd=26
blockSizes=10,8
qStarts=5,19
Essentially, the minus strand blockSizes and
qStarts are what you would get if you reverse-complemented
the query. However, the qStart and qEnd are not
reversed. To convert one to the other:
qStart = qSize - revQEnd
qEnd = qSize - revQStart
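The conversion can be checked against the minus-strand numbers in the example. This is a sketch under two assumptions that the text does not state outright: intervals are treated as half-open (start included, end excluded), and qSize is taken to be 31, since the ruler in the example covers positions 0 through 30. The function name `minus_blocks_to_forward` is illustrative only.

```python
def minus_blocks_to_forward(q_size, block_sizes, rev_q_starts):
    """Map minus-strand blocks (reverse-complement coordinates) to
    forward-strand (start, end) pairs, sorted by forward start."""
    blocks = []
    for size, rev_start in zip(block_sizes, rev_q_starts):
        rev_end = rev_start + size
        # Same rule as in the text: start = qSize - revEnd, end = qSize - revStart
        blocks.append((q_size - rev_end, q_size - rev_start))
    return sorted(blocks)


Q_SIZE = 31  # assumption: inferred from the ruler (positions 0..30)
forward_blocks = minus_blocks_to_forward(Q_SIZE, [10, 8], [5, 19])
q_start = forward_blocks[0][0]
q_end = forward_blocks[-1][1]
print(forward_blocks, q_start, q_end)
```

With these inputs the forward-strand blocks come out as (4, 12) and (16, 26), which reproduces the qStart=4 and qEnd=26 given for the minus strand.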
GFF format
GFF (General Feature Format) lines are based on the GFF standard
file format. GFF lines have nine required fields that must
be tab-separated. If the fields are separated by spaces instead of
tabs, the track will not display correctly. For more information on
GFF format, refer to http://www.sanger.ac.uk/Software/formats/GFF.
Here is a brief description of the GFF fields:
seqname - The name of the sequence. Must be a chromosome or scaffold.
source - The program that generated this feature.
feature - The name of this type of feature. Some examples of standard feature types are "CDS", "start_codon", "stop_codon", and "exon".
start - The starting position of the feature in the sequence. The first base is numbered 1.
end - The ending position of the feature (inclusive).
score - A score between 0 and 1000. If the track line useScore attribute is set to 1 for this annotation data set, the score value will determine the level of gray in which this feature is displayed (higher numbers = darker gray). If there is no score value, enter ".".
strand - Valid entries include '+', '-', or '.' (for don't know/don't care).
frame - If the feature is a coding exon, frame should be a number between 0-2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
group - All lines with the same group are linked together into a single item.
Example:
Here's an example of a GFF-based track. A copy of this example that
can be pasted into the browser without editing is available at
http://genome.ucsc.edu/goldenPath/help/regulatory.txt. NOTE: Paste
operations on some operating systems will replace tabs with spaces,
which will result in an error when the GFF track is uploaded. You can
circumvent this problem by pasting the URL of the example instead of
the text itself into the custom annotation track text box.
track name=regulatory description="TeleGene(tm) Regulatory Regions"
chr22 TeleGene enhancer 1000000 1001000 500 + . touch1
chr22 TeleGene promoter 1010000 1010100 900 + . touch1
chr22 TeleGene promoter 1020000 1020000 800 - . touch2
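Because the nine fields must be tab-separated, a parser can catch the tabs-replaced-by-spaces problem early by splitting strictly on tabs. The sketch below is illustrative: `GffFeature` and `parse_gff_line` are made-up names, and the sample line is the first data line of the example, rebuilt with explicit tabs.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int               # 1-based
    end: int                 # inclusive
    score: Optional[float]   # None when the field is "."
    strand: str
    frame: Optional[int]     # None when the field is "."
    group: str


def parse_gff_line(line: str) -> GffFeature:
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        # Typical cause: tabs were replaced by spaces during copy and paste.
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, group = fields
    if strand not in ("+", "-", "."):
        raise ValueError(f"invalid strand: {strand!r}")
    return GffFeature(
        seqname, source, feature, int(start), int(end),
        None if score == "." else float(score),
        strand,
        None if frame == "." else int(frame),
        group,
    )


gff_line = "\t".join(
    ["chr22", "TeleGene", "enhancer", "1000000", "1001000", "500", "+", ".", "touch1"]
)
feat = parse_gff_line(gff_line)
# Coordinates are 1-based and inclusive, so the length is end - start + 1.
feature_length = feat.end - feat.start + 1
print(feat.feature, feature_length)
```

Passing the same line with spaces instead of tabs raises `ValueError`, which mirrors the upload error described in the note above.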
GTF format
GTF (Gene Transfer Format) is a
refinement to GFF that tightens the specification. The first eight
GTF fields are the same as GFF. The group field has been
expanded into a list of attributes. Each attribute
consists of a type/value pair. Attributes must end in a semi-colon,
and be separated from any following attribute by exactly one
space.
The attribute list must begin with the two mandatory
attributes:
gene_id value - A globally unique identifier for the genomic source of the sequence.
transcript_id value - A globally unique identifier for the predicted transcript.
Example:
Here is an example of the ninth field in a GTF data line:
gene_id "Em:U62317.C22.6.mRNA"; transcript_id "Em:U62317.C22.6.mRNA"; exon_number 1
For more information on this format, see http://genes.cse.wustl.edu/GTF2.html.
The Genome Browser groups together GTF lines that have the same
transcript_id value. It only looks at features of type
exon and CDS.
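The attribute list in the ninth field can be split into type/value pairs with a small regular expression. This is a rough sketch rather than a full GTF parser: the function names are hypothetical, all values are kept as strings, and it assumes attribute names consist of word characters and that quoted values contain no embedded double quotes.

```python
import re

# A name, whitespace, then either a double-quoted value or a bare token.
ATTR_RE = re.compile(r'(\w+)\s+(?:"([^"]*)"|([^\s;"]+))\s*;?')

MANDATORY = ("gene_id", "transcript_id")


def parse_gtf_attributes(text: str) -> dict:
    attrs = {}
    for match in ATTR_RE.finditer(text):
        key, quoted, bare = match.group(1), match.group(2), match.group(3)
        attrs[key] = quoted if quoted is not None else bare
    return attrs


def check_mandatory(attrs: dict) -> None:
    missing = [name for name in MANDATORY if name not in attrs]
    if missing:
        raise ValueError("missing mandatory attributes: " + ", ".join(missing))


ninth_field = ('gene_id "Em:U62317.C22.6.mRNA"; '
               'transcript_id "Em:U62317.C22.6.mRNA"; exon_number 1')
attributes = parse_gtf_attributes(ninth_field)
check_mandatory(attributes)
print(attributes["transcript_id"], attributes["exon_number"])
```

Grouping lines the way the Genome Browser does would then amount to collecting exon and CDS features in a dictionary keyed by `attributes["transcript_id"]`.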