Advances in genomic research have unveiled alternative transcription initiation sites in thousands of soybean genes, reshaping our understanding of gene expression and its implications for crop improvement.
Over 70 years after the groundbreaking discovery of DNA’s structure by Rosalind Franklin, James Watson, and Francis Crick, scientists like Jianxin Ma are uncovering new ways to interpret the genetic blueprint of life. In 2010, Ma, a professor of agronomy at Purdue University, spearheaded the creation of the first soybean reference genome, based on the widely studied Williams 82 variety. This resource has been instrumental for researchers and plant breeders investigating traits like seed protein content, disease resistance, and abiotic stress tolerance.
Filling Gaps in the Soybean Genome
Ma, who holds the Indiana Soybean Alliance Inc. Endowed Chair in Soybean Improvement, has continued to refine this foundational genetic tool. His latest study, published in The Plant Cell, fills critical gaps in the original soybean genome by identifying transcription initiation sites — the starting points for the creation of mRNA, which cells use to produce proteins.
“The reference genome was like a dictionary when we announced it,” Ma explained in a recent Purdue news release. “Each gene was like a single word. However, there was a piece of critical information lacking: transcription initiation sites for individual genes.”
These sites are key to understanding gene expression, as they regulate when, where, and how often proteins are made. Traditionally, scientists believed that each gene had a single initiation site, typically located near a specific DNA sequence called the TATA box. Ma’s research challenges this assumption.
Alternative Sites and Their Role in Soybean Biology
“There is a set of predicted transcription start sites for over 50,000 genes in soy, but based on our new study, less than 3% of those predicted transcription initiation sites actually are correct,” Ma said.
Using STRIPE-seq (Survey of TRanscription Initiation at Promoter Elements with high-throughput sequencing), Ma’s lab mapped transcription initiation sites across eight soybean tissues, including roots, leaves, and nodules. The analysis revealed that many genes have alternative initiation sites, some of them located within the gene’s own coding region. Transcripts that start at these alternative sites can yield different proteins from the same gene, potentially adding to a plant’s adaptability and complexity.
For example, Ma’s team discovered tissue-specific transcription initiation sites in root nodules, which house nitrogen-fixing rhizobia bacteria. This symbiotic relationship allows legumes to grow in nitrogen-deficient soils without synthetic fertilizers.
“We found these particular transcription initiating sites in nodules, but not in the roots or any other tissues, suggesting they are for tissue-specific transcription and associated with nodule-specific function,” Ma said.
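The comparison Ma describes, sites seen in nodules but in no other tissue, can be illustrated with a small set operation. The sketch below is only an illustration, not the study's pipeline: the gene names, coordinates, and the `tissue_specific_sites` helper are all hypothetical, and a real analysis would work from sequencing read counts with detection thresholds rather than simple presence or absence.

```python
# Hypothetical data: tissue -> set of (gene, position) initiation sites detected there.
# Gene names and coordinates are invented for illustration only.
sites_by_tissue = {
    "nodule": {("geneA", 1200), ("geneA", 1850), ("geneB", 400)},
    "root": {("geneA", 1200), ("geneB", 400)},
    "leaf": {("geneA", 1200)},
}


def tissue_specific_sites(sites_by_tissue, tissue):
    """Return the sites detected in `tissue` and in no other tissue."""
    seen_elsewhere = set()
    for name, sites in sites_by_tissue.items():
        if name != tissue:
            seen_elsewhere |= sites
    return sites_by_tissue[tissue] - seen_elsewhere


nodule_only = tissue_specific_sites(sites_by_tissue, "nodule")
print(sorted(nodule_only))  # [('geneA', 1850)]
```

In this toy data set, only the second site on the invented `geneA` is unique to nodules; the other sites also appear in roots or leaves.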
Linking Epigenetics and Gene Expression
Ma’s research also highlights the interplay between transcription initiation and chromatin structure — the way DNA is packed around histone proteins. Epigenetic markers on these histones can make DNA more or less accessible for transcription, influencing which initiation sites are used.
“We have found nearly 7,000 genes that have the alternative transcription initiation within the coding sequences. These alternative transcription initiation sites tend to be tissue-specific and associated with histone modifications,” Ma noted.
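To make "within the coding sequences" concrete, the following sketch flags observed initiation sites that fall inside a gene's coding interval. It is a simplified illustration under stated assumptions, not the method used in the study: the `Gene` record, its coordinates, and the `sites_within_cds` helper are hypothetical, the coding sequence is treated as one continuous forward-strand interval, and introns and strand orientation are ignored.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    cds_start: int  # first coding base (1-based, inclusive); hypothetical value
    cds_end: int    # last coding base (1-based, inclusive); hypothetical value


def sites_within_cds(gene, site_positions):
    """Return the observed initiation sites that lie inside the coding interval."""
    return [pos for pos in site_positions if gene.cds_start <= pos <= gene.cds_end]


# Invented example: one site upstream of the coding sequence, two inside it.
gene = Gene("geneA", cds_start=1000, cds_end=3000)
observed = [850, 1400, 2600]
internal = sites_within_cds(gene, observed)
print(internal)  # [1400, 2600]
```

A transcript that begins at one of the internal positions would lack the start of the full-length coding sequence, which is one way a single gene could give rise to more than one protein.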
These findings suggest that alternative initiation sites may confer an adaptive advantage, allowing soybeans to fine-tune gene expression for different functions and environments.
A New Resource for Researchers
To share these data, Ma is collaborating with USDA Agricultural Research Service scientists Rex Nelson and Jacqueline Campbell to integrate the findings into SoyBase, a comprehensive online database for soybean research.
“Having even a potential transcription start site will aid in the analysis of soybean gene promoter regions,” said Nelson, SoyBase curator. “This may shed light on the proteins that interact with promoters and induce transcription.”
Campbell, co-curator of SoyBase, added, “The identification of transcription factors that bind promoter regions will allow researchers to identify gene regulatory interaction networks involved in the complex regulation of genes in agronomically important phenotypes.”
By making these data publicly available, Ma hopes to accelerate research into gene functions and regulatory mechanisms, paving the way for improved soybean varieties.
“The database serves as an important resource for both basic and applied research,” Ma said. “As we better understand how these alternative transcription sites affect particular traits, the hope is to see this lead to better soybean varieties.”