Friday, September 6, 2013

Questions about models to analyze selective pressures

We attend a weekly journal club where we read an article and watch a presentation given by the student who chose the paper. This is my second semester in a journal club, and I want to start documenting my thoughts and questions on certain papers.

Today's paper is "Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato" by Koenig et al. (2013). This paper presented some new-to-me analyses of gene expression, and in particular I hope to learn more about the models of evolution that they used in their analysis of selective pressures.

Currently, I'm having a difficult time understanding the conclusions they draw with respect to their results of fitting three models of evolution: Brownian motion single rate, Ornstein-Uhlenbeck, and Brownian motion two rate model. As a disclaimer, I'm not all that familiar with these models, or their analyses, so my concern may be totally off-base.  

My interpretation: After fitting the models and comparing them with the Akaike information criterion (AIC), they found the genes that best fit the Brownian motion two rate model, and for those genes:
  •  S. pennellii branch had the largest number of genes 
  •  S. lycopersicum branch had the largest proportion of unique genes 
(Like I said, I'm unclear on how those were calculated.) So, on the off chance that I understood that correctly, my question is why they only hypothesized that the "rapid divergence in gene expression that has occurred in S. pennellii can be explained by neutral process."
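For my own reference, here is a minimal sketch of how AIC-based model comparison works in general. It is not the authors' pipeline: the log-likelihoods are made-up numbers for a single hypothetical gene, and the parameter counts (2 for single-rate BM, 3 for OU and two-rate BM) are my assumption about how such models are usually parameterized.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits for one gene (invented values, for illustration only).
fits = {
    "BM single rate": {"log_lik": -12.4, "n_params": 2},
    "Ornstein-Uhlenbeck": {"log_lik": -11.9, "n_params": 3},
    "BM two rate": {"log_lik": -8.1, "n_params": 3},
}

# Score every model, then pick the one with the smallest AIC.
scores = {name: aic(f["log_lik"], f["n_params"]) for name, f in fits.items()}
best = min(scores, key=scores.get)

# Delta-AIC shows how far each model is from the best one.
delta = {name: score - scores[best] for name, score in scores.items()}

for name in fits:
    print(f"{name}: AIC = {scores[name]:.1f}, delta = {delta[name]:.1f}")
print("Best-supported model:", best)
```

Note that AIC ranks models against each other rather than testing a hypothesis, so a gene "best fitting" the two-rate model only means that model had the lowest score among the three.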

My confusion:
  • Divergence is assessed by comparing two objects, in this case branches. So shouldn't there be an equally reasonable argument for evolution on the other branch, e.g. that human selection (domestication) on the other branches led to the "rapid divergence"? 
  • And why do they choose the neutral process? My limited understanding of Brownian motion is that it does not only describe random drift; other mechanisms also produce this pattern, such as randomly changing selective regimes and continued change in many independent additive factors of small effect. 
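To convince myself of the second point, here is a toy simulation, written under my own assumptions rather than taken from the paper. One process adds a small Gaussian step each generation (a drift-like random walk). The other pushes the trait by a fixed amount in a randomly chosen direction each generation (a crude stand-in for randomly changing selection). The function names and parameter values are hypothetical.

```python
import random
import statistics

def simulate_drift(n_steps, sigma, rng):
    """Trait value after n_steps of independent Gaussian increments."""
    x = 0.0
    for _ in range(n_steps):
        x += rng.gauss(0.0, sigma)
    return x

def simulate_fluctuating_selection(n_steps, shift, rng):
    """Trait value when each step moves the trait by a fixed amount
    in a direction chosen at random (toy model of shifting selection)."""
    x = 0.0
    for _ in range(n_steps):
        x += shift if rng.random() < 0.5 else -shift
    return x

rng = random.Random(42)  # fixed seed so the run is repeatable
n_lineages, n_steps, sigma = 2000, 100, 0.1

drift = [simulate_drift(n_steps, sigma, rng) for _ in range(n_lineages)]
selection = [simulate_fluctuating_selection(n_steps, sigma, rng)
             for _ in range(n_lineages)]

# Under Brownian motion, variance among lineages grows as sigma^2 * time.
expected_var = sigma ** 2 * n_steps
drift_var = statistics.variance(drift)
selection_var = statistics.variance(selection)

print(f"expected variance:      {expected_var:.2f}")
print(f"drift-like process:     {drift_var:.2f}")
print(f"fluctuating selection:  {selection_var:.2f}")
```

Both processes should end up with roughly the same among-lineage variance, which is the sense in which a good Brownian motion fit cannot, by itself, tell drift apart from randomly fluctuating selection.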

Another reason I am aware that I could be way off-base is that in their discussion, given what I believe are results of the same type of analysis, they draw conclusions the way I would expect:

"The most extensive network rewiring that we discovered in S. lycopersicum relates to light responsiveness. Loss of connectivity in this network may reflect selection for reduced light response in S. lycopersicum or may reflect a more robust response in the desert-adapted S. pennellii..."

Their acknowledgement that either of those changes may be what we are seeing is reasonable, whereas their earlier, seemingly one-sided conclusion is not.

Halp, please.

TL;DR (comment for journal club)

This paper presents a lot of new-to-me analyses. In particular I think it was innovative to use models of evolution on gene expression data, since I believe they are more commonly used on traditional phenotype data like physical characteristics. However, their interpretation of the results with respect to fitting the models caused some confusion for me:

My limited understanding of the Brownian motion model is that it does not only describe random drift. Other mechanisms follow this trend such as randomly changing selective regimes or continued change in independent additive factors of small effects. However, of the multiple interpretations of the Brownian motion model, the authors only chose the neutral process interpretation without defending it.
