Plants are evolutionary champions, dominating Earth’s ecosystems for more than a billion years and making the planet habitable for countless other life forms, including us. Now, scientists have completed a nine-year genetic quest to shine a light on the long, complex history of land plants and green algae, revealing the plot twists and furious pace of the rise of this super group of organisms.
The project, known as the One Thousand Plant Transcriptomes Initiative (1KP), brought together nearly 200 plant biologists to sequence and analyze genes from more than 1,100 plant species spanning the green tree of life. A summary of the team’s findings was published today in Nature.
“In the tree of life, everything is interrelated,” says Gane Ka-Shu Wong, lead investigator of 1KP and professor in the University of Alberta’s department of biological sciences. “And if we want to understand how the tree of life works, we need to examine the relationships between species. That’s where genetic sequencing comes in.”
Much of plant research has focused on crops and a few model species, obscuring the evolutionary backstory of a clade that is nearly half a million species strong.
To get a bird’s-eye view of plant evolution, the 1KP team sequenced transcriptomes, the sets of genes that are actively expressed, to illuminate the genetic underpinnings of green algae, mosses, ferns, conifers, flowering plants and all other lineages of green plants.
“This gives a much broader perspective than what you could get by just looking at crops, which are all concentrated in one little part of the evolutionary tree,” says study co-author Pamela Soltis, University of Florida (UF) distinguished professor and Florida Museum of Natural History curator. “By having this bigger picture, you can understand how changes occurred in the genome, which then allows you to investigate changes in physical characteristics, chemistry or any other feature you’re interested in.”
One challenge was the project’s sheer size, says study co-author Douglas Soltis, UF distinguished professor and Florida Museum curator.
“To look at that many genomes is unparalleled,” he says. “It’s not a jump in technology as much as a jump in scale.”
Sequencing transcriptomes requires freshly collected tissues, which is how Soltis found himself trekking through Gainesville’s greenery with containers of liquid nitrogen. Back at the laboratory, a team extracted genetic material from the frozen plant clippings and shipped the extractions to China for sequencing. All over the world, their colleagues followed suit.
Analyzing the sequences also required reworking existing software, which wasn’t designed to handle such an unprecedented volume of genetic data. And without funding for the analysis, the researchers chipped away at the data whenever they had spare time.
But the labor was worth it, Pamela Soltis says.
“The plant community got more than 1,000 sets of sequences,” says Soltis, who also directs the UF Biodiversity Institute. “Who could argue with that? All these branches of the plant tree of life have been filled in.”
One hallmark of plant evolution, and a feature rarely seen in animals, is the frequency of genome duplication. Over and over again, lineages doubled, tripled or even quadrupled their entire set of genes, resulting in massive genome sizes. While the purpose of whole genome duplication is still unclear, scientists suspect that it may drive evolutionary innovation: If you have two copies of genes, one copy can gradually evolve a new function.
Addressing the frequency of whole genome duplication in plants was one of 1KP’s goals, Douglas Soltis says. While flowering plants and ferns were already famous for genome duplication, Soltis says 1KP uncovered a number of previously unknown duplication events in these groups, as well as in the gymnosperms, the group of plants that includes conifers.
Other plant lineages took a different route, expanding certain gene families rather than copying their entire genome. This, too, is thought to provide new avenues for evolutionary development, and not surprisingly, the research team uncovered a major expansion of genes just before the appearance of vascular plants, land plants with xylem and phloem – special cells for transporting water and nutrients.
But Douglas Soltis says gene expansions did not always correspond to major plant evolutionary milestones.
“There’s not much of an expansion before seed plants appear or for flowering plants,” he says. “In fact, flowering plants actually shrank certain gene families, which may be a sign that they just co-opted existing genes for new functions.”
Another surprise finding was that mosses, liverworts and hornworts form a single related group, confirming a centuries-old hypothesis that had been reversed in recent decades.
“We’d done a partial analysis in 2014 that suggested these plants were close relatives, but a lot of people didn’t believe it. These results underscore those findings,” Pamela Soltis says. “It’s going to rock the moss world.”
While the project refines our understanding of plant evolution and relationships between lineages, these data are also invaluable tools for advancing crop science, medicine and other fields, the researchers say.
Identifying genes that have been duplicated in flowering plants could help scientists better understand their function, which could lead to crop improvements, Pamela Soltis says.
And because many plants have medicinal benefits, the genetic data offered by the 1KP project could lead to new discoveries that improve human health.
“We focused on getting a lot of wild samples collected from plant lineages known to have important chemistry in hopes that people could mine this material for new compounds,” Douglas Soltis says.
The sequences generated by the 1KP team are publicly accessible through the CyVerse Data Commons.
“Probably hundreds of papers have used the data in ways we don’t even know about,” Pamela Soltis says. “That is a super cool aspect of this study.”
But the 1KP team has little time to celebrate its achievement. The next goal? Sequencing 10,000 genomes.