Topological sorting of gff features#

It can be critical to have a GFF/GTF file properly sorted:

  • If the file is not properly sorted, a genome browser may fail or display features incorrectly.
  • Some tools require files sorted in a particular way (e.g. the tabix tool from htslib needs a GFF sorted by chromosome and position).
  • A sorted file is easier for a human to read.
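
As a minimal illustration of the "sorted by chromosomes and positions" requirement, the sketch below sorts tab-separated GFF feature lines by sequence ID, then start, then end. The function name `coordinate_sort` is hypothetical and this is not how any of the tools tested on this page are implemented; it also does nothing to keep parent features ahead of their children.

```python
def coordinate_sort(gff_text):
    """Sort feature lines by sequence ID, start and end (illustrative sketch).

    Header/comment lines are kept at the top in their original order.
    '###' directives are dropped because they lose their meaning once
    the lines are reordered.
    """
    headers, features = [], []
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("###"):
            continue
        if line.startswith("#"):
            headers.append(line)
        else:
            features.append(line.split("\t"))
    # Python's sort is stable: lines with identical keys keep their input order.
    features.sort(key=lambda cols: (cols[0], int(cols[3]), int(cols[4])))
    return "\n".join(headers + ["\t".join(cols) for cols in features])


demo = (
    "##gff-version 3\n"
    "chr2\tsrc\tgene\t5\t9\t.\t+\t.\tID=g2\n"
    "chr1\tsrc\texon\t20\t30\t.\t+\t.\tParent=t1\n"
    "chr1\tsrc\tgene\t1\t50\t.\t+\t.\tID=g1\n"
)
print(coordinate_sort(demo))
```

Sequence IDs are compared as plain strings here, so `chr10` would sort before `chr2`; a real pipeline may want a different chromosome order.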

Zhigang Lu has made a nice post about his experience trying to find a way to get a correct topological sorting. See here.

Tests summary#

| tool | option in command line | Type of sorting | Comment |
|---|---|---|---|
| AGAT | --tabix | by chromosomes, by gene position, by type (mRNAs then exon, then CDS then alphabetical feature types; then mRNA2 then exon2, then CDS2 then alphabetical feature2 types) | Fix GFF/GTF if needed |
| GenomeTools | -sortlines -tidy -retainids | by chromosomes and positions then random feature type | Lines with the same chromosome and start position are placed randomly, so parent feature lines may sometimes be placed after their children lines. |
| GenomeTools | -retainids | by chromosomes, by gene position, by type (mRNA then children; then mRNA2 then children2), by position (children are sorted by positions) | |
| GFF3sort | --precise | by chromosomes and positions, then by attribute: lines with "Parent=" attributes (case insensitive) are moved behind lines without "Parent=" attributes | The goal of GFF3sort is not to obtain a topological sorting but rather to produce something that can be indexed optimally by third-party tools. |
| gffread | | By default, chromosomes are kept in the order they were found. With the --sort-alpha parameter the chromosomes (reference sequences) are sorted alphabetically | /!\ Some feature types are lost, e.g. gene, three_prime_UTR, five_prime_UTR, etc. |
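
To make "topological sorting" concrete: a file is topologically sorted when every feature appears after the feature named in its `Parent` attribute. The sketch below reports lines that break this rule. The function name `parent_before_child_violations` is hypothetical, the input is assumed to be tab-separated GFF3, and a `Parent` that never appears in the file is reported as well.

```python
def parent_before_child_violations(lines):
    """Return feature lines whose Parent ID was not seen on an earlier line."""
    seen_ids = set()
    violations = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete feature line
        # Column 9 holds 'key=value' pairs separated by ';'
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # GFF3 allows several comma-separated parents
        parents = attrs["Parent"].split(",") if "Parent" in attrs else []
        if any(p not in seen_ids for p in parents):
            violations.append(line)
        if "ID" in attrs:
            seen_ids.add(attrs["ID"])
    return violations


demo = [
    "chr1\tsrc\texon\t1\t10\t.\t+\t.\tParent=t1",
    "chr1\tsrc\tgene\t1\t50\t.\t+\t.\tID=g1",
    "chr1\tsrc\tmRNA\t1\t50\t.\t+\t.\tID=t1;Parent=g1",
    "chr1\tsrc\tCDS\t3\t10\t.\t+\t0\tParent=t1",
]
for bad in parent_before_child_violations(demo):
    print("child before parent:", bad)
```

In this small demo only the first exon line is reported, because it precedes the mRNA it refers to.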

Example 1#

This test is based on the file used by Zhigang Lu.

The GFF file to sort#

##gff-version 3
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    gene    103403  151162  0.12    -   .   ID=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.02    -   .   ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.1 -   .   ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.93    -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.96    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 105920  106144  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    105920  106144  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 106876  107159  0.93    -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    106876  107159  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 140582  140849  0.85    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    140582  140849  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.2

Results#

AGAT#

AGAT v1.0.0

  • default sorting

agat_convert_sp_gxf2gxf.pl --gff test.gff

##gff-version 3
SM_V7_1 AUGUSTUS    gene    103403  151162  0.12    -   .   ID=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.02    -   .   ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   ID=exon-1;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   ID=exon-6;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   ID=exon-8;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   ID=exon-10;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.93    -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   ID=five_prime_utr-1;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   ID=three_prime_utr-1;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.1 -   .   ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   ID=exon-2;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    105920  106144  .   -   .   ID=exon-3;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    106876  107159  .   -   .   ID=exon-4;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    140582  140849  .   -   .   ID=exon-5;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   ID=exon-7;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   ID=exon-9;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   ID=exon-11;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.96    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 105920  106144  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 106876  107159  0.93    -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 140582  140849  0.85    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   ID=five_prime_utr-2;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   ID=three_prime_utr-2;Parent=Smp_315690.2
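
The order of the children of each mRNA in the output above (exons, then CDS, then the remaining feature types alphabetically, each sorted by start position) can be reproduced with a simple sort key. This is only a sketch of the observed ordering, not AGAT's actual code; `PRIORITY` and `child_sort_key` are hypothetical names.

```python
# Feature types that come first; every other type is sorted alphabetically after them.
PRIORITY = {"exon": 0, "cds": 1}


def child_sort_key(feature):
    """Key for ordering the children of one transcript: type rank, type name, start."""
    ftype = feature["type"].lower()
    return (PRIORITY.get(ftype, 2), ftype, feature["start"])


# Some children of Smp_315690.1, in the order they appear in the input file
children = [
    {"type": "three_prime_UTR", "start": 103403},
    {"type": "exon", "start": 103403},
    {"type": "CDS", "start": 103441},
    {"type": "CDS", "start": 142981},
    {"type": "exon", "start": 142981},
    {"type": "five_prime_UTR", "start": 151133},
]
for feature in sorted(children, key=child_sort_key):
    print(feature["type"], feature["start"])
```

With this key, `five_prime_UTR` comes before `three_prime_UTR` simply because of alphabetical order, as in the AGAT output.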

  • Tabix sorting

agat config --expose --tabix
agat_convert_sp_gxf2gxf.pl --gff test.gff

##gff-version 3
SM_V7_1 AUGUSTUS  gene  103403  151162  0.12  - . ID=Smp_315690
SM_V7_1 AUGUSTUS  mRNA  103403  151162  0.02  - . ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1 AUGUSTUS  mRNA  103403  151162  0.1 - . ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1 AUGUSTUS  exon  103403  103770  . - . ID=exon-1;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  three_prime_UTR 103403  103440  . - . ID=three_prime_utr-1;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  exon  103403  103770  . - . ID=exon-2;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  three_prime_UTR 103403  103440  . - . ID=three_prime_utr-2;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  CDS 103441  103770  0.93  - 0 ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  CDS 103441  103770  0.96  - 0 ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  exon  105920  106144  . - . ID=exon-3;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  CDS 105920  106144  1 - 0 ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  exon  106876  107159  . - . ID=exon-4;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  CDS 106876  107159  0.93  - 2 ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  exon  140582  140849  . - . ID=exon-5;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  CDS 140582  140849  0.85  - 0 ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  exon  142981  143205  . - . ID=exon-6;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  CDS 142981  143205  1 - 0 ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  exon  142981  143205  . - . ID=exon-7;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  CDS 142981  143205  1 - 0 ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  exon  145395  145678  . - . ID=exon-8;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  CDS 145395  145678  1 - 2 ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  exon  145395  145678  . - . ID=exon-9;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  CDS 145395  145678  1 - 2 ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  exon  151075  151162  . - . ID=exon-10;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  CDS 151075  151132  1 - 0 ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  exon  151075  151162  . - . ID=exon-11;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  CDS 151075  151132  1 - 0 ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS  five_prime_UTR  151133  151162  . - . ID=five_prime_utr-1;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS  five_prime_UTR  151133  151162  . - . ID=five_prime_utr-2;Parent=Smp_315690.2
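
A position-based sort stays topologically valid if ties on the start position are broken by depth in the feature hierarchy (gene, then mRNA, then exon/CDS). The sketch below shows that idea under simplifying assumptions: tab-separated GFF3, only the first parent is followed when a feature has several, and the function name `position_then_depth_sort` is hypothetical. It does not try to reproduce every tie-break visible in the AGAT output above.

```python
def position_then_depth_sort(lines):
    """Sort by sequence ID and start; on ties, shallower features come first."""
    records = []
    parent_of = {}  # feature ID -> ID of its (first) parent
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs["Parent"].split(",")[0] if "Parent" in attrs else None
        if "ID" in attrs and parent is not None:
            parent_of[attrs["ID"]] = parent
        records.append((cols, parent))

    def depth(parent):
        # Number of ancestors, found by walking up the Parent chain.
        level, visited = 0, set()
        while parent is not None and parent not in visited:
            visited.add(parent)  # guards against cyclic Parent references
            level += 1
            parent = parent_of.get(parent)
        return level

    records.sort(key=lambda r: (r[0][0], int(r[0][3]), depth(r[1])))
    return ["\t".join(cols) for cols, _ in records]


demo = [
    "chr1\tsrc\texon\t100\t200\t.\t-\t.\tParent=t1",
    "chr1\tsrc\tmRNA\t100\t900\t.\t-\t.\tID=t1;Parent=g1",
    "chr1\tsrc\tCDS\t120\t200\t.\t-\t0\tParent=t1",
    "chr1\tsrc\tgene\t100\t900\t.\t-\t.\tID=g1",
]
for line in position_then_depth_sort(demo):
    print(line)
```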

GenomeTools#

GenomeTools 1.6.1

gt gff3 -sortlines -tidy -retainids test.gff

##gff-version 3
##sequence-region   SM_V7_1 103403 151162
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    gene    103403  151162  0.12    -   .   ID=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.02    -   .   ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.1 -   .   ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.96    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.93    -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 105920  106144  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    105920  106144  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 106876  107159  0.93    -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    106876  107159  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 140582  140849  0.85    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    140582  140849  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.2

gt gff3 -retainids test.gff

##gff-version 3
##sequence-region   SM_V7_1 103403 151162
SM_V7_1 AUGUSTUS    gene    103403  151162  0.12    -   .   ID=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.02    -   .   ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.93    -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.1 -   .   ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.96    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 105920  106144  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    105920  106144  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 106876  107159  0.93    -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    106876  107159  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 140582  140849  0.85    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    140582  140849  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.2
###

GFF3sort#

GFF3sort 0.1.a1a2bc9

gff3sort.pl --precise test.gff

##gff-version 3
SM_V7_1 AUGUSTUS    gene    103403  151162  0.12    -   .   ID=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.02    -   .   ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.1 -   .   ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.93    -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.96    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 105920  106144  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    105920  106144  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 106876  107159  0.93    -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    106876  107159  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 140582  140849  0.85    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    140582  140849  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.2

gffread#

gffread v0.11.4

gffread test.gff

# gffread test.gff
# gffread v0.11.4
##gff-version 3
SM_V7_1 AUGUSTUS    mRNA    103403  151162  .   -   .   ID=Smp_315690.1;geneID=Smp_315690
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 103441  103770  .   -   0   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 142981  143205  .   -   0   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 145395  145678  .   -   2   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 151075  151132  .   -   0   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    mRNA    103403  151162  .   -   .   ID=Smp_315690.2;geneID=Smp_315690
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    105920  106144  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    106876  107159  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    140582  140849  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 103441  103770  .   -   0   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 105920  106144  .   -   0   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 106876  107159  .   -   2   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 140582  140849  .   -   0   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 142981  143205  .   -   0   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 145395  145678  .   -   2   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 151075  151132  .   -   0   Parent=Smp_315690.2
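
The gffread output above no longer contains the gene, three_prime_UTR and five_prime_UTR lines. A quick way to spot this kind of loss is to compare the feature types (column 3) present before and after sorting. The sketch below assumes tab-separated GFF text; `feature_type_counts` and `lost_feature_types` are hypothetical helper names.

```python
from collections import Counter


def feature_type_counts(gff_text):
    """Count how many lines of each feature type (column 3) a GFF text contains."""
    counts = Counter()
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts


def lost_feature_types(before, after):
    """Feature types present in `before` but completely absent from `after`."""
    return sorted(set(feature_type_counts(before)) - set(feature_type_counts(after)))


before = (
    "chr1\tsrc\tgene\t1\t50\t.\t+\t.\tID=g1\n"
    "chr1\tsrc\tmRNA\t1\t50\t.\t+\t.\tID=t1;Parent=g1\n"
    "chr1\tsrc\texon\t1\t50\t.\t+\t.\tParent=t1\n"
    "chr1\tsrc\tthree_prime_UTR\t40\t50\t.\t+\t.\tParent=t1\n"
)
after = (
    "chr1\tsrc\tmRNA\t1\t50\t.\t+\t.\tID=t1\n"
    "chr1\tsrc\texon\t1\t50\t.\t+\t.\tParent=t1\n"
)
print(lost_feature_types(before, after))
```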

Example 2#

This test is based on the file used by GFF3sort.

The GFF file to sort#

##gff-version 3
###
A01 Cufflinks   mRNA    473 6154    .   -   .   ID=XLOC_001154.41;description=Novel: Intergenic transcript
A01 Cufflinks   exon    473 814 .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    1626    2574    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    2695    2721    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    3637    3726    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    5329    5408    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    5994    6154    .   -   .   Parent=XLOC_001154.41
###
A01 Cufflinks   mRNA    473 6386    .   -   .   ID=XLOC_001154.42;description=Novel: Intergenic transcript
A01 Cufflinks   exon    473 2024    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    2615    2721    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    3637    3726    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    5329    6386    .   -   .   Parent=XLOC_001154.42

Results#

AGAT#

AGAT v0.9.0

  • default sorting

agat_convert_sp_gxf2gxf.pl --gff test2.gff --merge_loci

##gff-version 3
###
A01 Cufflinks   gene    473 6386    .   -   .   ID=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01 Cufflinks   mRNA    473 6154    .   -   .   ID=XLOC_001154.41;Parent=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01 Cufflinks   exon    473 814 .   -   .   ID=exon-1;Parent=XLOC_001154.41
A01 Cufflinks   exon    1626    2574    .   -   .   ID=exon-2;Parent=XLOC_001154.41
A01 Cufflinks   exon    2695    2721    .   -   .   ID=exon-3;Parent=XLOC_001154.41
A01 Cufflinks   exon    3637    3726    .   -   .   ID=exon-4;Parent=XLOC_001154.41
A01 Cufflinks   exon    5329    5408    .   -   .   ID=exon-5;Parent=XLOC_001154.41
A01 Cufflinks   exon    5994    6154    .   -   .   ID=exon-6;Parent=XLOC_001154.41
A01 Cufflinks   mRNA    473 6386    .   -   .   ID=XLOC_001154.42;Parent=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01 Cufflinks   exon    473 2024    .   -   .   ID=exon-7;Parent=XLOC_001154.42
A01 Cufflinks   exon    2615    2721    .   -   .   ID=exon-8;Parent=XLOC_001154.42
A01 Cufflinks   exon    3637    3726    .   -   .   ID=exon-9;Parent=XLOC_001154.42
A01 Cufflinks   exon    5329    6386    .   -   .   ID=exon-10;Parent=XLOC_001154.42

  • Tabix sorting

agat_convert_sp_gxf2gxf.pl --gff test2.gff --merge_loci --tabix

##gff-version 3
###
A01 Cufflinks gene  473 6386  . - . ID=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01 Cufflinks mRNA  473 6154  . - . ID=XLOC_001154.41;Parent=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01 Cufflinks mRNA  473 6386  . - . ID=XLOC_001154.42;Parent=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01 Cufflinks exon  473 814 . - . ID=exon-1;Parent=XLOC_001154.41
A01 Cufflinks exon  473 2024  . - . ID=exon-7;Parent=XLOC_001154.42
A01 Cufflinks exon  1626  2574  . - . ID=exon-2;Parent=XLOC_001154.41
A01 Cufflinks exon  2615  2721  . - . ID=exon-8;Parent=XLOC_001154.42
A01 Cufflinks exon  2695  2721  . - . ID=exon-3;Parent=XLOC_001154.41
A01 Cufflinks exon  3637  3726  . - . ID=exon-4;Parent=XLOC_001154.41
A01 Cufflinks exon  3637  3726  . - . ID=exon-9;Parent=XLOC_001154.42
A01 Cufflinks exon  5329  5408  . - . ID=exon-5;Parent=XLOC_001154.41
A01 Cufflinks exon  5329  6386  . - . ID=exon-10;Parent=XLOC_001154.42
A01 Cufflinks exon  5994  6154  . - . ID=exon-6;Parent=XLOC_001154.41

GenomeTools#

GenomeTools 1.6.1

gt gff3 -sortlines -tidy -retainids test2.gff

##gff-version 3
##sequence-region   A01 473 6386
A01 Cufflinks   exon    473 814 .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    473 2024    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   mRNA    473 6154    .   -   .   ID=XLOC_001154.41;description=Novel: Intergenic transcript
A01 Cufflinks   mRNA    473 6386    .   -   .   ID=XLOC_001154.42;description=Novel: Intergenic transcript
A01 Cufflinks   exon    1626    2574    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    2615    2721    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    2695    2721    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    3637    3726    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    3637    3726    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    5329    5408    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    5329    6386    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    5994    6154    .   -   .   Parent=XLOC_001154.41
###

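gt gff3 -retainids test2.gff

/!\ The output shown below lists the SM_V7_1 features of Example 1, not the A01 features of test2.gff; it appears to have been copied from the Example 1 results.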

##gff-version 3
##sequence-region   SM_V7_1 103403 151162
SM_V7_1 AUGUSTUS    gene    103403  151162  0.12    -   .   ID=Smp_315690
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.02    -   .   ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.93    -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.1
SM_V7_1 AUGUSTUS    mRNA    103403  151162  0.1 -   .   ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1 AUGUSTUS    three_prime_UTR 103403  103440  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    103403  103770  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 103441  103770  0.96    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 105920  106144  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    105920  106144  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 106876  107159  0.93    -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    106876  107159  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 140582  140849  0.85    -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    140582  140849  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 142981  143205  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    142981  143205  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 145395  145678  1   -   2   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    145395  145678  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    CDS 151075  151132  1   -   0   ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    exon    151075  151162  .   -   .   Parent=Smp_315690.2
SM_V7_1 AUGUSTUS    five_prime_UTR  151133  151162  .   -   .   Parent=Smp_315690.2
###

GFF3sort#

GFF3sort 0.1.a1a2bc9

gff3sort.pl --precise test2.gff

##gff-version 3
A01 Cufflinks   mRNA    473 6154    .   -   .   ID=XLOC_001154.41;description=Novel: Intergenic transcript
A01 Cufflinks   mRNA    473 6386    .   -   .   ID=XLOC_001154.42;description=Novel: Intergenic transcript
A01 Cufflinks   exon    473 2024    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    473 814 .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    1626    2574    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    2615    2721    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    2695    2721    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    3637    3726    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    3637    3726    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    5329    6386    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    5329    5408    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    5994    6154    .   -   .   Parent=XLOC_001154.41

gffread#

gffread v0.11.4

gffread test2.gff

# gffread test2.gff
# gffread v0.11.4
##gff-version 3
A01 Cufflinks   mRNA    473 6154    .   -   .   ID=XLOC_001154.41
A01 Cufflinks   exon    473 814 .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    1626    2574    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    2695    2721    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    3637    3726    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    5329    5408    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   exon    5994    6154    .   -   .   Parent=XLOC_001154.41
A01 Cufflinks   mRNA    473 6386    .   -   .   ID=XLOC_001154.42
A01 Cufflinks   exon    473 2024    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    2615    2721    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    3637    3726    .   -   .   Parent=XLOC_001154.42
A01 Cufflinks   exon    5329    6386    .   -   .   Parent=XLOC_001154.42