Journal Club: short ORFs and gene prediction

This week's journal club paper: "Small open reading frames associated with morphogenesis are hidden in plant genomes" by Hanada et al. (2013).

The implications of the idea that many functional, small open reading frames (sORFs) exist in eukaryotic genomes are pretty exciting. The paper didn't entirely convince me that this was the case, but I was sufficiently persuaded that it merits further investigation.

I didn't see any major red flags with the paper. I would have preferred biological replicates for their sORF arrays over technical replicates. Additionally, overexpression mutants can cause all kinds of wacky effects that aren't necessarily a consequence of the functional properties of the overexpressed genes. I'm not sure the conclusion that overexpression mutants of sORFs cause phenotypic consequences more frequently than randomly chosen genes is going to hold up to further scrutiny (maybe sORF products accumulate more effectively than regular protein products and cause strange side effects). However, again, I think it's sufficient to merit further investment in the topic, such as the knock-downs/knockouts they propose.

My comment:

"I think it's worth continuing to pursue study of these small ORFs as the authors propose. They may have already shown this in one of their previous publications, but I'm interested in the properties of these small ORFs in the context of gene prediction. They said they used hexamer composition bias for prediction; are there other properties that might be useful for prediction, like CpG island promoters?"
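
To make the hexamer idea concrete for myself, here is a minimal sketch of how a hexamer composition score could work in general. This is not the authors' pipeline: the function names, the pseudocount, and the choice of a simple log-odds ratio averaged over overlapping hexamers are all my own assumptions for illustration. The idea is to count hexamers in known coding sequences and in background (non-coding) sequences, then score a candidate sORF by how much its hexamers look like the coding set.

```python
import math
from collections import Counter
from itertools import product

K = 6  # hexamers
ALL_HEXAMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def count_hexamers(seqs):
    """Count overlapping hexamers in a collection of DNA sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - K + 1):
            kmer = seq[i:i + K]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def log_odds_table(coding_seqs, background_seqs, pseudocount=1.0):
    """Per-hexamer log-odds of coding vs. background frequency.

    The pseudocount (an assumption, not taken from the paper) keeps
    hexamers unseen in either set from producing log(0).
    """
    coding = count_hexamers(coding_seqs)
    background = count_hexamers(background_seqs)
    c_total = sum(coding.values()) + pseudocount * len(ALL_HEXAMERS)
    b_total = sum(background.values()) + pseudocount * len(ALL_HEXAMERS)
    return {
        h: math.log(((coding[h] + pseudocount) / c_total)
                    / ((background[h] + pseudocount) / b_total))
        for h in ALL_HEXAMERS
    }


def hexamer_score(seq, table):
    """Mean log-odds over the overlapping hexamers of a candidate sequence.

    Positive values mean the sequence looks more like the coding set,
    negative values more like the background set.
    """
    seq = seq.upper()
    scores = [table[seq[i:i + K]]
              for i in range(len(seq) - K + 1)
              if seq[i:i + K] in table]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Toy training data, made up purely for illustration.
    coding = ["ATGGCCGCCGCCGCCTAA"]
    background = ["ATATATATATATATATAT"]
    table = log_odds_table(coding, background)
    print(hexamer_score("GCCGCCGCC", table))
    print(hexamer_score("ATATATATAT", table))
```

A real predictor would train on many annotated coding sequences and intergenic regions, and would probably count hexamers in frame rather than at every offset. Other features, like the CpG island promoters I asked about, would have to be added as separate signals alongside a score like this one.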
