awk: print every other record (or every Nth record) in a fasta file

This specific line of awk doesn't have much general utility, but it pulls out every other sequence record in a .fasta file. It can pull out every Nth record instead if you change the divisor in the modulo test (for example, i%3 == 0 keeps every third record, starting with the first). It only works on .fasta files in which each sequence string sits on a single line rather than being wrapped across several lines.

Here it is in its one-liner form:
awk 'BEGIN{i=0} (substr($0,1,1) == ">") { if (i%2 == 0) {print $0; getline; print $0} i++}' test.fa
And it makes a bit more sense when formatted:
awk 'BEGIN{i=0} (substr($0,1,1) == ">") {
 if (i%2 == 0) {
  print $0
  getline
  print $0
 }
 i++
}' test.fa
This assumes each record in the .fasta file is a header line beginning with ">" followed by the sequence string, and that the sequence string is contained entirely on one line.
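As a rough sketch of the every-Nth-record variant, the divisor can be passed in with awk's -v option instead of being hard-coded. The function name every_nth, the variable n, and the sample.fa file with its record names and sequences are all made up for illustration; the same one-line-per-sequence assumption still applies.

```shell
# every_nth N FILE: print every Nth record of FILE, starting with the first.
# "every_nth" and "n" are hypothetical names chosen for this sketch.
every_nth() {
  awk -v n="$1" '
    /^>/ {                      # header line of a record
      if (i++ % n == 0) {       # i starts at 0, so the first record is kept
        print                   # print the header
        getline                 # advance to the (single) sequence line
        print                   # print the sequence
      }
    }' "$2"
}

# Invented sample data: six records, one sequence line each.
printf '>seq1\nACGT\n>seq2\nTTGA\n>seq3\nGGCC\n>seq4\nAATT\n>seq5\nCCGG\n>seq6\nTGCA\n' > sample.fa

every_nth 3 sample.fa
```

With this sample file, every_nth 3 prints the seq1 and seq4 records, and every_nth 2 prints seq1, seq3 and seq5, which matches the original i%2 version.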

