Computomics’ xSeedScore is a must-have for plant breeders.
Plant breeders need timely, accurate information to find new commercial varieties. Breeding can be a numbers game: to obtain the most accurate performance predictions for a breeding program, algorithms need to take into account not only genetic information but also the effect of the environment and the interactions between the two.
This is where xSeedScore comes in.
xSeedScore is a machine-learning-based algorithm created by the innovative bioinformatics team at Computomics. It allows plant breeders to identify their top 1% of crosses with higher prediction accuracy than any standard statistical method.
Dr. Sebastian Schultheiss, Managing Director of Computomics, started the company to put the power of data into the hands of breeders. He wanted to provide a service that would allow them to make accurate, data-driven decisions about the plants they are working with.
Schultheiss started working with different types of data, such as genome sequencing, large-scale phenotyping, climate, and weather data, and realized the potential that lies in integrating them.
“While a lot of people are using statistical models and regression to do that, we developed proprietary machine learning based algorithms, or artificial intelligence (AI) algorithms,” he says. “Our technology xSeedScore is an AI toolset that enables us to really model these different sources of data, put them into relative perspective, and make sure that we use them appropriately: to benefit prediction accuracy and avoid common mistakes such as overfitting.”
With these tools, Computomics helps breeders draw the shortest possible line from population to commercial product while taking into account the effect of environment, location, or soil microbiome.
How It Works
“The simplest explanation why our product is really superior to, let’s say, statistical methods, is that we are able to capture these higher-order interactions between different elements,” explains Schultheiss. “Elements can include precipitation during the growing season, day lengths, etc. All of these things together are what our prediction models are actually using.”
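To make the idea of an interaction concrete, here is a minimal sketch of the simplest case: one variety factor and one environment factor. It is not xSeedScore, whose algorithms are proprietary; the variety names, environments, and yield figures are invented purely for illustration. The sketch fits a purely additive model (variety effect plus environment effect) and shows what that model misses.

```python
from statistics import mean

# Hypothetical yields (t/ha) for two varieties in two environments.
# All names and numbers are made up for illustration.
yields = {
    ("variety_A", "dry"): 6.0,
    ("variety_A", "wet"): 9.0,
    ("variety_B", "dry"): 7.0,
    ("variety_B", "wet"): 7.0,
}

varieties = sorted({v for v, _ in yields})
environments = sorted({e for _, e in yields})

# Main effects: how far each variety / environment mean sits from the grand mean.
grand_mean = mean(yields.values())
variety_effect = {
    v: mean(yields[v, e] for e in environments) - grand_mean for v in varieties
}
environment_effect = {
    e: mean(yields[v, e] for v in varieties) - grand_mean for e in environments
}


def additive_prediction(variety, environment):
    """Prediction from main effects only, with no interaction term."""
    return grand_mean + variety_effect[variety] + environment_effect[environment]


def observed(variety, environment):
    """Look up the observed yield for a variety in an environment."""
    return yields[variety, environment]


# Interaction: the part of each yield the additive model cannot explain.
interaction = {
    (v, e): yields[v, e] - additive_prediction(v, e) for v, e in yields
}


def best_variety(score, environment):
    """Return the variety with the highest score in the given environment."""
    return max(varieties, key=lambda v: score(v, environment))


for env in environments:
    print(env, "additive:", best_variety(additive_prediction, env),
          "observed:", best_variety(observed, env))
```

In this toy data the additive model recommends the same variety everywhere, while the observed yields favor a different variety in the dry environment. That gap is the interaction term. Models that capture interactions among many factors at once extend the same idea beyond two factors.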
On the biology side, xSeedScore provides new insights. According to Schultheiss, it can show whether a given seed variety needs less fertilizer at certain times of the year, or whether the seeds are also suitable for another region where they have not yet been marketed.
“These kinds of insights are possible with the technology that we’ve developed. And that’s what makes it unique,” he says.
Breeders who accumulate data sets in their plant breeding programs will have a competitive advantage when they want to accelerate those programs, adapt to a changing climate, or adapt growing conditions to specific seeds, for example in indoor farming.
Storing Data
Computomics has a strong focus on data security. Each client’s data is kept separate from every other project, so that it remains the breeder’s property and can be used for that breeder’s predictions. Computomics uses its own data center and hardware, housed in an industry-grade, state-of-the-art facility with security measures that prevent unauthorized access to the data sets.
“Often, plant breeders have a very limited amount of time and we can guarantee turnaround times that fit into their breeding schedules,” Schultheiss says. The company is also very sensitive to the specific needs of each client and works within the confines of their breeding programs and goals.
Results show that Computomics’ algorithms are able to predict the genetic potential of crops with an increase in accuracy of typically 50% or more compared to state-of-the-art methods, which means breeders and growers can make decisions with more confidence. Generating millions of new opportunities to find a commercial hybrid, versus the usual 2,000 to 3,000, with increased predictive value, and identifying lines that outperform testers, can have monumental effects on breeding programs.
xSeedScore can deliver over twice the prediction accuracy of standard statistical methods. Accurately predicting all virtual crosses, then identifying and deselecting low performers before they are planted, significantly decreases testing time. Together, this results in double the genetic gain, a shorter breeding cycle, and increased efficiency.
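The workflow of scoring every virtual cross and keeping only the best can be sketched in a few lines. This is an illustration under stated assumptions, not Computomics’ method: the 50 parent lines, their values, and the `predict_cross` function (a simple mid-parent average) are hypothetical stand-ins for a real prediction model.

```python
import math
import random
from itertools import combinations

# Hypothetical parent lines with made-up breeding values (illustration only).
rng = random.Random(42)
parent_value = {f"line_{i:02d}": rng.gauss(0.0, 1.0) for i in range(50)}


def predict_cross(parent_a, parent_b):
    """Placeholder predictor: the mid-parent value.

    A real prediction model would replace this function.
    """
    return (parent_value[parent_a] + parent_value[parent_b]) / 2.0


# Enumerate every possible pairwise cross "virtually", without planting anything.
virtual_crosses = list(combinations(sorted(parent_value), 2))

# Rank all crosses by predicted performance, best first.
scored = sorted(virtual_crosses, key=lambda c: predict_cross(*c), reverse=True)

# Keep the top 1% for field testing and deselect the rest before planting.
n_keep = max(1, math.ceil(0.01 * len(scored)))
selected = scored[:n_keep]
deselected = scored[n_keep:]

print(f"{len(virtual_crosses)} virtual crosses, {len(selected)} selected")
```

Even with only 50 parents there are 1,225 possible crosses, which shows how quickly the candidate pool outgrows what can be tested in the field and why ranking before planting saves testing time.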
For more information, go to http://www.computomics.com/