To start with, the FASTA header entries look like this:
>F13C5.1 CE19383 WBGene00017422 status:Partially_confirmed UniProt:O76564 protein_id:AAC64611.1
Run this vim command (note that the pattern uses ":" as the delimiter and expects a tab after the sequence name):
:%s:>\(\S\{4,}\)\t.*UniProt\:\(\S\{6,}\).*$:>\1_CAEEL__\2:g
And now they look like this:
>geneName_OrgID__UniProtAccNo
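
If you would rather do this outside of vim, here is a rough Python sketch of the same substitution. The function name rename_header and the org_id parameter are my own inventions for illustration. It also matches any whitespace after the sequence name, not only a tab, so it is slightly more lenient than the vim pattern:

```python
import re

# Mirrors the vim pattern: a name of 4+ non-space characters, a separator,
# anything up to "UniProt:", then an accession of 6+ non-space characters.
# Assumption: \s is used instead of \t so space-separated headers also match.
HEADER_RE = re.compile(r"^>(\S{4,})\s.*UniProt:(\S{6,}).*$")

def rename_header(line, org_id="CAEEL"):
    """Return the shortened header, or the line unchanged if it does not match."""
    return HEADER_RE.sub(
        lambda m: f">{m.group(1)}_{org_id}__{m.group(2)}", line
    )

header = (">F13C5.1 CE19383 WBGene00017422 status:Partially_confirmed "
          "UniProt:O76564 protein_id:AAC64611.1")
print(rename_header(header))  # >F13C5.1_CAEEL__O76564
```

Sequence lines and headers without a UniProt field pass through untouched, so it should be safe to run over every line of the file.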