Analysis of the structures of thousands of plant proteins by Tetsuya Sakurai and colleagues from the RIKEN Center for Sustainable Resource Science in Japan has led to the construction of a database that will help scientists identify the functions of more plant genes.
Although complete genomes have been sequenced for a number of plants and their genes identified, the functions of many of these genes remain unknown. Arabidopsis thaliana, also known as thale cress, is one such plant: its genome is fully sequenced and, because that genome is small, it is used routinely as a model in plant research.
Using published information to select only non-redundant protein sequences from the complete genomes of Arabidopsis and five other plants (soybean, poplar, rice, moss and alga), the researchers used computational modeling to predict the physicochemical and structural properties of these proteins.
Further analysis of the protein structures allowed the team to identify the regions most likely to be functional: more than 52,000 such regions in proteins from the six plants. The results formed the basis for a new RIKEN database, called the Plant Protein Annotation Suite, or Plant-PrAS.
“Protein structural research in plants lags behind that in animals and bacteria with respect to the structural analysis of individual proteins and gene functional annotation,” Sakurai says. “We developed the Plant-PrAS database to resolve such problems. It houses unique information about the plant proteome, which is downloadable and extensively searchable.”