Arun Seetharam

Bioinformatician

Sometimes it is necessary to split a large file containing several sequences (FASTA format) into individual files. I do this with a simple ‘awk’ command: every line that matches the header pattern (a line starting with ‘>’) starts a new output file, and each sequence is written to a sequentially numbered file (1.fasta, 2.fasta, and so on). It is easy and quick!

awk '/^>/{if (s) close(s); s=++d".fasta"} {print > s}' <inputFile>
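
To see what the command produces, here is a small sketch that runs it on a made-up two-record file. The file name sample.fasta, its contents, and the scratch directory are assumptions for illustration only. The close(s) call releases each finished file before the next one is opened, which helps on awk implementations that limit how many files can be open at once.

```shell
# Work in a scratch directory so the numbered files do not mix with anything else.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: two short records (names and sequences are invented).
printf '>seq1 first record\nACGT\nTTGA\n>seq2 second record\nGGCC\n' > sample.fasta

# Each header line closes the previous output file (if any) and starts the next
# numbered file; every line, header included, is written to the current file.
awk '/^>/{if (s) close(s); s=++d".fasta"} {print > s}' sample.fasta

# Sanity check: the number of header lines should equal the number of files written.
records=$(grep -c '^>' sample.fasta)
files=$(ls [0-9]*.fasta | wc -l | tr -d ' ')
echo "records=$records files=$files"
```

With this input the check prints records=2 files=2, and 1.fasta holds the first record (header plus both sequence lines) while 2.fasta holds the second. Note that any lines appearing before the first header would have no output file to go to, so the input is assumed to begin with a header line.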