Using awk on Linux to exclude pseudogene records from a GFF file

Published: 2023-06-08 00:37:13  Author: 小鲨鱼2018

 

001.

[b20223040323@admin2 test]$ ls
a.gff
[b20223040323@admin2 test]$ cat a.gff
region      1
pseudogene  2
transcript  3
exon        4
pseudogene  5
transcript  6
exon        7
gene        8
miRNA       9
exon        10
pseudogene  11
pseudogene  12
mRNA        13
exon        14
pseudogene  15
gene        16
mRNA        17
gene        18
exon        19
gene        20
lnc_RNA     21
[b20223040323@admin2 test]$ awk 'BEGIN{tag = "yes"}{if($1 == "pseudogene") {tag = "no"}; if($1 == "gene") {tag = "yes"}; if(tag == "yes") {print $0}}'  a.gff  ## filter out pseudogene records and their child features
region      1
gene        8
miRNA       9
exon        10
gene        16
mRNA        17
gene        18
exon        19
gene        20
lnc_RNA     21
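
The command above works as a simple state machine: a flag is switched off at every pseudogene line, switched back on at every gene line, and a line is printed only while the flag is on. That is why the transcript and exon lines following a pseudogene disappear along with it. The same idea can be sketched more compactly with awk pattern-action rules. The function name `filter_pseudogenes`, the variable `keep`, the `col` parameter, and the temporary sample file are illustrative assumptions, not part of the original post. In a standard GFF3 file the feature type sits in column 3 rather than column 1, so the column is left configurable here.

```shell
#!/bin/sh
# Recreate the two-column sample from the session above in a temporary file
# (hypothetical stand-in for a.gff).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
region      1
pseudogene  2
transcript  3
exon        4
pseudogene  5
transcript  6
exon        7
gene        8
miRNA       9
exon        10
pseudogene  11
pseudogene  12
mRNA        13
exon        14
pseudogene  15
gene        16
mRNA        17
gene        18
exon        19
gene        20
lnc_RNA     21
EOF

# filter_pseudogenes FILE [COLUMN]
# Prints every line except pseudogene lines and whatever follows them,
# up to (but not including) the next gene line.
# COLUMN is the field holding the feature type; it defaults to 1 to match
# the sample, and would be 3 for a standard GFF3 file.
filter_pseudogenes() {
    awk -v col="${2:-1}" '
        BEGIN { keep = 1 }                  # print everything until told otherwise
        $col == "pseudogene" { keep = 0 }   # stop printing at a pseudogene
        $col == "gene"       { keep = 1 }   # resume printing at the next gene
        keep                                # bare pattern: print when keep is non-zero
    ' "$1"
}

filter_pseudogenes "$sample"
```

Run on the sample, this prints the same ten lines as the original command. Like the original, it resumes output only at a gene line, so any other top-level feature that directly follows a pseudogene block (a later region line, for example) would also be dropped.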