The seed sector doesn’t escape the global trend: more and more data are generated every year in plant breeding. This raises the question of how to get the most out of this mass of information. Euralis began addressing this question as early as the 1990s and developed its own platform, Helix, for exactly this purpose.
It was built “brick by brick” by our SIR service (Statistics, Informatics, Research and Development). This group of seven people is in charge of IT, statistical and methodological developments, as well as training and support for the 190 users of the Helix platform, spread over 16 sites. The development of the platform would not have been possible without close collaboration with the breeding teams of our Euralis Seeds division.
The platform is designed to manage data, business processes and data analysis at all phases of the field crop improvement cycle. It covers the following crops: corn, sunflower, rapeseed, sorghum and soybean. To stay up to date, the platform is continually enriched and adapted to follow changes in organizations and disciplines (agronomy, biotechnology, computer science). In addition, it provides mobile solutions for data entry in nurseries, greenhouses and trials using laptops and tablets, as well as software embedded in harvesters (PlotX), which collects data while research trials are being harvested.
The platform manages, in an integrated and centralized way, the data flows related to the business processes, both in the field (from seedling preparation to harvesting, data collection and validation) and in the molecular analysis laboratory (from the sampling of leaves or seeds to the validated molecular data).
In recent years, the volumes of data to be managed have changed significantly. For example:
- Field data: each year, the databases are enriched with results from several thousand research trials, corresponding to several hundred thousand plots, supplemented by the results of several hundred development trials and several thousand commercial trials.
- Genotypic data: from a few hundred thousand data points per year five years ago, the laboratory now generates more than 200 million per year thanks to high-density chips.
In addition, the platform offers a wide variety of tools for R&D teams to explore and analyze their data, for example:
- Visualization, checking and cleaning of data.
- Summary reports combining different sources of information.
- Integrated statistical data analysis pipelines, which can combine phenotypic data (related to the observable traits of an organism) and genotypic data (linked to the genetic information of an individual).
- Decision support tools for selection programs.
- Interfaces with complementary analysis software.
For all these analyses, the platform relies on a server cluster dedicated to intensive computing.
The platform serves as the repository for all Euralis genetics, holding more than 20 years of results. It guarantees the integration, storage and integrity of the data. It is hosted in secure data centers and gives all the teams at the various research stations across Europe a global, cross-functional view.