OES
A platform for discovering and curating machine learning-ready, standardized omics data.
Overview
The Omics Engine Service (OES) is a discovery and curation platform designed to simplify access to machine learning-ready, curated, and standardized omics data. It addresses the challenges scientists face with raw public omics data, which are often heterogeneous, inconsistently formatted, and scattered across many online sources.
By cataloging raw omics data and curating metadata into a consistent, high-resolution format, OES enables researchers to select the right data for their experiments efficiently. Bioinformaticians and scientists can easily search and download data using an API, with options for three levels of curation: raw, basic, and premium.
OES encompasses over 1,750 curated samples, more than 250 datasets, over 10 omics types, and 450+ publications and BioProjects, offering high-quality metadata that researchers can trust.
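The search workflow described above, with its three curation levels, can be illustrated with a small local sketch. This is not the OES API: the names `CurationLevel`, `DatasetRecord`, and `search`, the record fields, and the dataset IDs are all hypothetical and exist only to show how a catalog might be filtered by omics type and curation level.

```python
from dataclasses import dataclass
from enum import Enum


class CurationLevel(Enum):
    """The three curation levels named in the overview."""
    RAW = "raw"
    BASIC = "basic"
    PREMIUM = "premium"


@dataclass(frozen=True)
class DatasetRecord:
    # Hypothetical record shape; the real OES schema is not described here.
    dataset_id: str
    omics_type: str
    curation: CurationLevel


def search(catalog, omics_type, curation):
    """Return records matching the omics type (case-insensitive) and curation level."""
    level = CurationLevel(curation)  # raises ValueError for an unknown level
    wanted = omics_type.lower()
    return [
        record for record in catalog
        if record.omics_type.lower() == wanted and record.curation is level
    ]


# Made-up placeholder records, not real OES datasets.
catalog = [
    DatasetRecord("DS-001", "transcriptomics", CurationLevel.RAW),
    DatasetRecord("DS-002", "transcriptomics", CurationLevel.PREMIUM),
    DatasetRecord("DS-003", "proteomics", CurationLevel.PREMIUM),
]

hits = search(catalog, "Transcriptomics", "premium")
hit_ids = [record.dataset_id for record in hits]
```

Modeling the levels as an enumeration means a misspelled level fails immediately instead of silently returning no results.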
Features and Benefits
- Reduce curation time by over 75%: For conditions available in OES, access quality datasets and metadata in minutes rather than the months that expert curation would otherwise require.
- Use consistent, high-quality data: Access FAIR-standardized data necessary for your discoveries, minimizing integration errors and associated costs.
- Select appropriate curation depth: Choose from three curation depths (raw, basic, or premium), and add custom precision curation where datasets need finer granularity.
- Integrate proprietary data: Combine your proprietary datasets with high-quality public data for enhanced research outcomes.
Through automated pipelines, monthly updates, manual quality control, and internally developed tools for biocurators, OES keeps data up to date and reliable, offering a comprehensive data universe for downstream analysis.
OES integrates seamlessly with LEAP™, an AI scientific discovery application, enhancing your analysis by linking with over 28 million scientific papers, 60+ knowledge bases, and bioinformatics data, forming a knowledge graph with 211 million connections.
The platform is developed by scientists for scientists to ensure relevance and usability. By automating the curation and discovery of omics data, it accelerates research and reduces experimental risk.

