Convert text to table

Posted by Quattro on Super User
Published on 2012-10-10T14:46:20Z, indexed on 2012/10/10 15:40 UTC

I would like to convert text into a table. Here is a link to the text:

http://www.tcdb.org/public/tcdb

Short example:

>gnl|TC-DB|A0CIB0|1.A.17.3.1 Chromosome undetermined scaffold_19, whole genome shotgun sequence OS=Paramecium tetraurelia GN=GSPATT00007662001 PE=4 SV=1
MDDQNQPILQEQPKPKQKKPLLNTKMVKKQKMQNKKEENLREILNFYTNQVDARKFLQKM
KAVVDSNQQEKKYQDDFLNPNEYNEMQDIYEDYNMGDLVIVFPNPDADGVKNPPITYKEA
PLTKTNFYSKIGNVSYENDIDELCVDEMEYLRNMRNVDGEHMDQDHVKEEI
>gnl|TC-DB|A0CS82|9.B.82.1.5 Chromosome undetermined scaffold_26, whole genome shotgun sequence - Paramecium tetraurelia.
MIIEEQIEEKMIYKAIHRVKVNYQKKIDRYILYKKSRWFFNLLLMLLYAYRIQNIGGFYI
VTYIYCVYQLQLLIDYFTPLGLPPVNLEDEEEDDDQFQNDFSELPTTLSNKNELNDKEFR
PLLRTTSEFKVWQKSVFSVIFAYFCTYIPIWDIPVYWPFLFCYFFVIVGMSIRKYIKHMK
KYGYTILDFTKKK

I want the columns to be separated by a delimiter, for example a pipe | or a semicolon ;

|>gnl|TC-DB|A0CIB0|1.A.17.3.1| Chromosome undetermined scaffold_19, whole genome shotgun sequence OS=Paramecium tetraurelia GN=GSPATT00007662001 PE=4 SV=1|
MDDQNQPILQEQPKPKQKKPLLNTKMVKKQKMQNKKEENLREILNFYTNQVDARKFLQKM
KAVVDSNQQEKKYQDDFLNPNEYNEMQDIYEDYNMGDLVIVFPNPDADGVKNPPITYKEA
PLTKTNFYSKIGNVSYENDIDELCVDEMEYLRNMRNVDGEHMDQDHVKEEI

I am working on Windows and I don't know how to do it.

All I know is:

  • every record starts with >
  • I want to replace the first whitespace in a header row with a delimiter like | or ;
  • I also want a delimiter at the first newline of each record, that is, at the end of the header row
  • everything between that first newline and the next > should go into a new column (it is the sequence of a protein)
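The rules above can be sketched in a few lines of Python, which runs on Windows. This is only a minimal sketch, not a definitive solution: the function names `fasta_to_rows` and `convert` are made up for illustration, the sample is a shortened version of the records shown earlier, and it assumes that the leading > can be dropped and that the sequence lines of a record should be joined into one string. The default delimiter is ; because the identifiers already contain |, which would make a pipe-delimited file ambiguous.

```python
def _split_record(header, chunks):
    # Split the header at the first run of whitespace only:
    # identifier on the left, free-text description on the right.
    parts = header.split(None, 1)
    ident = parts[0] if parts else ""
    desc = parts[1] if len(parts) > 1 else ""
    # Join the wrapped sequence lines into a single string.
    return ident, desc, "".join(chunks)


def fasta_to_rows(lines):
    """Yield (identifier, description, sequence) for each record."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new record begins; emit the previous one, if any.
            if header is not None:
                yield _split_record(header, chunks)
            header = line[1:]  # assumption: drop the leading ">"
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _split_record(header, chunks)


def convert(lines, delimiter=";"):
    """Return one delimited text line per record."""
    return [delimiter.join(row) for row in fasta_to_rows(lines)]


# Shortened sample based on the records in the question.
sample = """>gnl|TC-DB|A0CIB0|1.A.17.3.1 Chromosome undetermined scaffold_19
MDDQNQ
PILQEQ
>gnl|TC-DB|A0CS82|9.B.82.1.5 Chromosome undetermined scaffold_26
MIIEEQ
"""

for row in convert(sample.splitlines()):
    print(row)
```

For the real data, the lines could come from a file opened with `open()` instead of the sample string, and the returned lines could be written to a new file. If a description may itself contain a semicolon, passing `delimiter="\t"` is one way around that.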

© Super User or respective owner
