A cutting-edge Artificial Intelligence (AI) model capable of deciphering the sequences and structural patterns that form the genetic “language” of plants has been introduced by a research collaboration.
PlantRNA-FM, considered the first AI model of its kind, has been developed through a collaboration between plant researchers at the John Innes Centre and computer scientists at the University of Exeter.
Described by its creators as a significant technological breakthrough, the model has the potential to drive innovation and discovery in plant science and extend its applications to the study of invertebrates and bacteria.
RNA, like its more familiar counterpart DNA, is a vital molecule in all living organisms. It carries genetic information through sequences and structures composed of nucleotides—the building blocks of RNA—arranged in patterns akin to how letters form words and sentences in human language.
Professor Yiliang Ding’s team at the John Innes Centre focuses on RNA structures, a critical component of RNA molecules. These structures enable RNAs to fold into complex forms that regulate essential biological functions, such as plant growth and stress responses.
To deepen their understanding of RNA’s intricate roles, Professor Ding’s group partnered with Dr Ke Li’s team at the University of Exeter. Together, they developed PlantRNA-FM, an AI model trained on an extensive dataset of 54 billion RNA sequences representing the genetic alphabets of 1,124 plant species.
The researchers employed methodologies similar to those used in training AI models like ChatGPT to comprehend human language. By analyzing RNA data from plant species worldwide, PlantRNA-FM gained a comprehensive understanding of RNA’s role across the plant kingdom.
Similar to how ChatGPT understands and responds to human language, PlantRNA-FM has been trained to grasp the grammar and logic of RNA sequences and structures.
The researchers have already leveraged the model to make accurate predictions about RNA functions and to uncover specific functional RNA structural patterns within transcriptomes. These predictions have been experimentally validated, confirming that the RNA structures identified by PlantRNA-FM play a role in regulating the efficiency of translating genetic information into proteins.
“While RNA sequences may appear random to the human eye, our AI model has learned to decode the hidden patterns within them,” says Dr Haopeng Yu, a postdoctoral researcher in Professor Yiliang Ding’s group at the John Innes Centre.
Scientists from Northeast Normal University and the Chinese Academy of Sciences in China also contributed to this work.
Professor Ding said: “Our PlantRNA-FM is just the beginning. We are working closely with Dr Li’s group to develop more advanced AI approaches to understand the hidden DNA and RNA languages in nature. This breakthrough opens new possibilities for understanding and potentially programming plants, which could have profound implications for crop improvement and the next generation of AI-based gene design. AI is increasingly instrumental in helping plant scientists tackle challenges, from feeding a global population to developing crops that can thrive in a changing climate.”
The study, “An Interpretable RNA Foundation Model for Exploring Functional RNA Motifs in Plants,” appears in Nature Machine Intelligence.