446262 Pentatricopeptide repeat proteins from 1121 species.

The RNA-binding pentatricopeptide repeat (PPR) family comprises hundreds to thousands of genes in most plants, but only a few dozen in algae, evidence of massive gene expansions during land plant evolution. The nature and timing of these expansions have not been well defined because of the sparse sequence data available from early-diverging land plant lineages. We exploit the comprehensive OneKP dataset of over 1000 transcriptomes from diverse plants and algae to establish a clear picture of the evolution of this massive gene family, focusing on the proteins typically associated with RNA editing, which show the most spectacular variation in numbers and domain composition across the plant kingdom.