A Large Agricultural Company

Utilizing cloud to process genetic sequences at lightning speed

With almost 100 years of experience in creating agricultural hybrids, the client was looking for ways to optimize their processes. Working closely with Sogeti, they realized there was a significant opportunity to improve by leveraging cloud and analytics.

The process is complex: mapping the genome of different strains of seeds to help their clients make informed decisions on how to best make use of farmland. Their previous on-premises infrastructure limited the number of genetic sequences they could capture, process, and store; they wanted to utilize the scalability of the cloud to make this process more efficient.

This was the largest native development and re-platforming project for Sogeti USA on the AWS platform, but we had a plan. To approach this major project, Sogeti:

  • Re-platformed three genomics applications into the AWS cloud and redesigned them to be almost serverless.
  • Deployed cloud resources and services via automated deployment.
  • Modernized antiquated Perl and Java code into Python and Spark.
  • Built out a fully automated CI/CD pipeline utilizing a hybrid cloud model in AWS.

As a result, Sogeti helped the client:

  • Retire 200 servers in their on-premises Hadoop cluster and avoid a tech refresh.
  • Process 10x more genomes per year by utilizing the scalability of the AWS cloud. This process used to take 8 days to complete; we made it happen in 8 hours.
  • Decrease the cost of processing and storing genetic sequence data by 50%.
  • Process individual genetic sequences 5x faster than before.
  • Gain full transparency into their system by using open-source standards.

Sogeti came into this project with an existing relationship with this client. After this project, that relationship became stronger. In fact, AWS asked the client and Sogeti to co-present this story at the AWS re:Invent conference in 2018.