http://gmod.org/wiki/GFF

Introduction to GFF

GFF3: http://song.sourceforge.net/gff3.shtml

GFF: http://www.sanger.ac.uk/Software/formats/GFF/

GFF2PS: http://genome.imim.es/software/gfftools/GFF2PS.html

http://gmod.org/wiki/GBrowse

GBrowse

http://xerad.systemsbiology.net/MotifMogulServer/MotifMogulFAQ.html

PSL format

PSL lines represent alignments, and are typically taken from files generated by BLAT or psLayout. See the BLAT documentation for more details. All of the following fields are required on each data line within a PSL file:

matches - Number of bases that match that aren't repeats

misMatches - Number of bases that don't match

repMatches - Number of bases that match but are part of repeats

nCount - Number of 'N' bases

qNumInsert - Number of inserts in query

qBaseInsert - Number of bases inserted in query

tNumInsert - Number of inserts in target

tBaseInsert - Number of bases inserted in target

strand - '+' or '-' for query strand. For translated alignments, second '+' or '-' is for genomic strand

qName - Query sequence name

qSize - Query sequence size

qStart - Alignment start position in query

qEnd - Alignment end position in query

tName - Target sequence name

tSize - Target sequence size

tStart - Alignment start position in target

tEnd - Alignment end position in target

blockCount - Number of blocks in the alignment (a block contains no gaps)

blockSizes - Comma-separated list of sizes of each block

qStarts - Comma-separated list of starting positions of each block in query

tStarts - Comma-separated list of starting positions of each block in target
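The 21 fields listed above map naturally onto a small parser. The following is a minimal sketch in Python, not an official reader: the function name `parse_psl_line` and the choice to return a plain dictionary are assumptions made for illustration, and the line is split on any whitespace so that both tab-separated and space-separated copies of a record are accepted.

```python
# Field names in the order given in the PSL field list.
PSL_FIELDS = [
    "matches", "misMatches", "repMatches", "nCount",
    "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
    "strand", "qName", "qSize", "qStart", "qEnd",
    "tName", "tSize", "tStart", "tEnd",
    "blockCount", "blockSizes", "qStarts", "tStarts",
]
STRING_FIELDS = {"strand", "qName", "tName"}
LIST_FIELDS = {"blockSizes", "qStarts", "tStarts"}


def parse_psl_line(line):
    """Parse one PSL data line into a dict (hypothetical helper)."""
    values = line.split()
    if len(values) != len(PSL_FIELDS):
        raise ValueError(
            "expected %d fields, got %d" % (len(PSL_FIELDS), len(values))
        )
    record = {}
    for name, value in zip(PSL_FIELDS, values):
        if name in STRING_FIELDS:
            record[name] = value
        elif name in LIST_FIELDS:
            # Lists carry a trailing comma, e.g. "48,20,", so skip empty items.
            record[name] = [int(item) for item in value.split(",") if item]
        else:
            record[name] = int(value)
    # Every per-block list should have exactly blockCount entries.
    for name in LIST_FIELDS:
        if len(record[name]) != record["blockCount"]:
            raise ValueError("%s does not match blockCount" % name)
    return record


# First record of the fishBlats example track, rejoined onto one line.
example = (
    "59 9 0 0 1 823 1 96 +- FS_CONTIG_48080_1 1955 171 1062 chr22 "
    "47748585 13073589 13073753 2 48,20, 171,1042, 34674832,34674976,"
)
record = parse_psl_line(example)
print(record["qName"], record["strand"], record["blockSizes"])
```

Track lines such as `track name=fishBlats ...` are not data lines and would need to be skipped before calling this function.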

Example:

Here is an example of an annotation track in PSL format. Note that line breaks have been inserted into the PSL lines in this example for documentation display purposes.

track name=fishBlats description="Fish BLAT" useScore=1

59 9 0 0 1 823 1 96 +- FS_CONTIG_48080_1 1955 171 1062 chr22

47748585 13073589 13073753 2 48,20, 171,1042,

34674832,34674976,

59 7 0 0 1 55 1 55 +- FS_CONTIG_26780_1 2825 2456 2577 chr22

47748585 13073626 13073747 2 21,45, 2456,2532,

34674838,34674914,

59 7 0 0 1 55 1 55 -+ FS_CONTIG_26780_1 2825 2455 2676 chr22

47748585 13073727 13073848 2 45,21, 249,349, 13073727,13073827,

Be aware that the coordinates for a negative strand in a PSL line are handled in a special way. In the qStart and qEnd fields, the coordinates indicate the position where the query matches from the point of view of the forward strand, even when the match is on the reverse strand. However, in the qStarts list, the coordinates are reversed.

Example:

Here is a 30-mer containing 2 blocks that align on the minus strand and 2 blocks that align on the plus strand (this sometimes can happen in response to assembly errors):

0         1         2         3 tens position in query
0123456789012345678901234567890 ones position in query
            ++++          +++++ plus strand alignment on query
    --------    ----------      minus strand alignment on query

Plus strand:

qStart=12

qEnd=31

blockSizes=4,5

qStarts=12,26

Minus strand:

qStart=4

qEnd=26

blockSizes=10,8

qStarts=5,19

Essentially, the minus strand blockSizes and qStarts are what you would get if you reverse-complemented the query. However, the qStart and qEnd are not reversed. To convert one to the other:

qStart = qSize - revQEnd

qEnd = qSize - revQStart
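The conversion can be checked against the numbers in the example. The sketch below assumes half-open block ranges (start included, end excluded) and a qSize of 31, which is what the ruler (positions 0 through 30) and the listed values imply even though the text calls the query a 30-mer. The function names are hypothetical.

```python
def forward_q_span(q_size, rev_q_start, rev_q_end):
    """Apply qStart = qSize - revQEnd and qEnd = qSize - revQStart."""
    return q_size - rev_q_end, q_size - rev_q_start


def forward_block_ranges(q_size, strand, block_sizes, q_starts):
    """Return each block as a (start, end) range in forward query coordinates.

    For a minus-strand query the qStarts values are counted from the end of
    the query, so every block is flipped with the same formula as above.
    """
    ranges = []
    for size, start in zip(block_sizes, q_starts):
        if strand.startswith("-"):
            ranges.append(forward_q_span(q_size, start, start + size))
        else:
            ranges.append((start, start + size))
    return sorted(ranges)


Q_SIZE = 31  # assumption inferred from the example, see the note above

minus_blocks = forward_block_ranges(Q_SIZE, "-", [10, 8], [5, 19])
plus_blocks = forward_block_ranges(Q_SIZE, "+", [4, 5], [12, 26])

# qStart and qEnd are the outer bounds of the forward-strand blocks.
print("minus:", minus_blocks[0][0], minus_blocks[-1][1], minus_blocks)
print("plus: ", plus_blocks[0][0], plus_blocks[-1][1], plus_blocks)
```

Under these assumptions the minus-strand blocks land on forward positions 4 to 12 and 16 to 26, which agrees with qStart=4 and qEnd=26 in the example.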

GFF format

GFF (General Feature Format) lines are based on the GFF standard file format. GFF lines have nine required fields that must be tab-separated. If the fields are separated by spaces instead of tabs, the track will not display correctly. For more information on GFF format, refer to http://www.sanger.ac.uk/Software/formats/GFF.

Here is a brief description of the GFF fields:

seqname - The name of the sequence. Must be a chromosome or scaffold.

source - The program that generated this feature.

feature - The name of this type of feature. Some examples of standard feature types are "CDS", "start_codon", "stop_codon", and "exon".

start - The starting position of the feature in the sequence. The first base is numbered 1.

end - The ending position of the feature (inclusive).

score - A score between 0 and 1000. If the track line useScore attribute is set to 1 for this annotation data set, the score value will determine the level of gray in which this feature is displayed (higher numbers = darker gray). If there is no score value, enter ".".

strand - Valid entries include '+', '-', or '.' (for don't know/don't care).

frame - If the feature is a coding exon, frame should be a number between 0-2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.

group - All lines with the same group are linked together into a single item.
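The field rules above can be turned into a small validating parser. This is a sketch under stated assumptions rather than a reference implementation: `parse_gff_line` is a hypothetical name, and the checks only cover what the field descriptions say (nine tab-separated fields, 1-based inclusive coordinates, a score from 0 to 1000 or ".", and the listed strand and frame values).

```python
GFF_FIELDS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "group",
]


def parse_gff_line(line):
    """Parse and validate one GFF data line (hypothetical helper)."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(GFF_FIELDS):
        # Typical cause: tabs were replaced by spaces during copy and paste.
        raise ValueError(
            "expected 9 tab-separated fields, got %d" % len(values)
        )
    record = dict(zip(GFF_FIELDS, values))

    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    if record["start"] < 1 or record["end"] < record["start"]:
        raise ValueError("start must be >= 1 and end must be >= start")

    if record["score"] == ".":
        record["score"] = None  # no score given
    else:
        record["score"] = float(record["score"])
        if not 0 <= record["score"] <= 1000:
            raise ValueError("score must be between 0 and 1000")

    if record["strand"] not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    if record["frame"] not in ("0", "1", "2", "."):
        raise ValueError("frame must be 0, 1, 2 or '.'")
    return record


# One data line built with real tab characters.
example = "\t".join(
    ["chr22", "TeleGene", "enhancer", "1000000", "1001000",
     "500", "+", ".", "touch1"]
)
record = parse_gff_line(example)
print(record["feature"], record["start"], record["end"], record["group"])
```

Passing the same record with spaces instead of tabs raises `ValueError`, which mirrors the display problem described for space-separated files.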

Example:

Here's an example of a GFF-based track. NOTE: Paste operations on some operating systems will replace tabs with spaces, which will result in an error when the GFF track is uploaded. You can circumvent this problem by pasting the URL of the example (http://genome.ucsc.edu/goldenPath/help/regulatory.txt) instead of the text itself into the custom annotation track text box.

track name=regulatory description="TeleGene(tm) Regulatory Regions"

chr22 TeleGene enhancer 1000000 1001000 500 + . touch1

chr22 TeleGene promoter 1010000 1010100 900 + . touch1

chr22 TeleGene promoter 1020000 1020000 800 - . touch2

GTF format

GTF (Gene Transfer Format) is a refinement to GFF that tightens the specification. The first eight GTF fields are the same as GFF. The group field has been expanded into a list of attributes. Each attribute consists of a type/value pair. Attributes must end in a semi-colon, and be separated from any following attribute by exactly one space.

The attribute list must begin with the two mandatory attributes:

gene_id value - A globally unique identifier for the genomic source of the sequence.

transcript_id value - A globally unique identifier for the predicted transcript.

Example:

Here is an example of the ninth field in a GTF data line:

gene_id "Em:U62317.C22.6.mRNA"; transcript_id "Em:U62317.C22.6.mRNA"; exon_number 1
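A ninth field like the one above can be split into its type/value pairs with a few lines of Python. This is a minimal sketch: `parse_gtf_attributes` is a hypothetical name, and the code assumes that attribute values never contain a semi-colon, which holds for this example but is not guaranteed for every file.

```python
def parse_gtf_attributes(field):
    """Split a GTF attribute field into an ordered dict of type -> value."""
    attributes = {}
    for part in field.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after a trailing semi-colon
        key, _, value = part.partition(" ")
        # Text values are quoted, numeric ones are not; drop the quotes.
        attributes[key] = value.strip().strip('"')
    # The list must begin with the two mandatory attributes, in this order.
    if list(attributes)[:2] != ["gene_id", "transcript_id"]:
        raise ValueError("attributes must start with gene_id, transcript_id")
    return attributes


example = (
    'gene_id "Em:U62317.C22.6.mRNA"; '
    'transcript_id "Em:U62317.C22.6.mRNA"; exon_number 1'
)
attributes = parse_gtf_attributes(example)
print(attributes["transcript_id"], attributes["exon_number"])
```

All values are kept as strings here, so `exon_number` would need an explicit `int()` conversion if it is used numerically.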

For more information on this format, see http://genes.cse.wustl.edu/GTF2.html.

The Genome Browser groups together GTF lines that have the same transcript_id value. It only looks at features of type exon and CDS.
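The grouping rule can be imitated in a few lines. The sketch below is only an illustration of the described behaviour, not the Genome Browser's own code: the function name `group_by_transcript`, the regular expression, and the sample lines (gene `g1`, transcripts `t1` and `t2`) are all made up for the demonstration.

```python
import re
from collections import defaultdict

# Pulls the quoted transcript_id value out of the ninth field.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]*)"')


def group_by_transcript(lines):
    """Group exon and CDS features by transcript_id (hypothetical helper)."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed data line
        if fields[2] not in ("exon", "CDS"):
            continue  # other feature types are ignored
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        groups[match.group(1)].append(
            (fields[2], int(fields[3]), int(fields[4]))
        )
    return dict(groups)


def make_line(feature, start, end, transcript):
    """Build a tab-separated sample line with made-up values."""
    attributes = 'gene_id "g1"; transcript_id "%s";' % transcript
    return "\t".join(
        ["chr22", "demo", feature, str(start), str(end),
         ".", "+", ".", attributes]
    )


sample = [
    make_line("exon", 100, 200, "t1"),
    make_line("CDS", 120, 200, "t1"),
    make_line("start_codon", 120, 122, "t1"),  # ignored by the grouping
    make_line("exon", 500, 600, "t2"),
]
for transcript, features in group_by_transcript(sample).items():
    print(transcript, features)
```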
