Data Management

Use all your data to gain a competitive advantage

Structured, proprietary data gives you an advantage over your competitors. Your data is an organizational asset; it powers new insights, warns against potential pitfalls, and guides innovative research directions.

Benefits of Good Data Management

  • Prevent repeated experiments
  • Quickly find the materials closest to hitting new specs
  • Gain new insights and verify expert intuition
  • Power AI-driven analysis

WITH THE CITRINE PLATFORM, RESEARCHERS CAN:

  • Quickly understand what data is available across an organization
  • Find data related to a new project
  • Find materials meeting or almost meeting customer requirements
  • Extract trends in data
  • Compare the properties of different materials
  • Harness data for AI models

Data Ingestion

Materials and chemicals data is complex. It comes in many formats and from many different sources.

The Citrine Platform has a flexible Python interface that enables automated data ingestion. Inspection tools help researchers find outliers and clean and correct the data.
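
As an illustration of the kind of check an inspection step might run, the sketch below flags outliers with the interquartile-range (Tukey fence) rule. It uses only the Python standard library and is not the Citrine Platform's API; the function name `flag_outliers` and the sample readings are hypothetical.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the indices of values outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical tensile-strength readings in MPa (assumed data, for illustration only).
# The 4100.0 entry looks like a typing or unit slip and should be reviewed.
tensile_mpa = [402.0, 398.5, 410.2, 405.1, 4100.0, 399.8, 407.3]
suspect = flag_outliers(tensile_mpa)

for i in suspect:
    print(f"Row {i}: {tensile_mpa[i]} MPa is outside the expected range")
```

Flagged rows are reported rather than dropped, so a researcher can decide whether each one is an error to correct or a genuine result.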

Citrine’s data engineers partner with customers to ingest historical data and enable smooth capture of data going forward.

The Platform includes a descriptor library developed specifically for materials and chemicals data. The library converts information (e.g. a chemical formula) into data (e.g. atomic weight).
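
To make the idea of a descriptor concrete, here is a minimal sketch that turns a chemical formula into numeric values. It is not Citrine's descriptor library: the function names, the small atomic-weight table, and the restriction to flat formulas (no parentheses or hydrates) are all assumptions made for illustration.

```python
import re

# Rounded standard atomic weights in g/mol; a small illustrative subset.
ATOMIC_WEIGHT = {
    "H": 1.008, "C": 12.011, "O": 15.999, "Na": 22.990, "Al": 26.982,
    "Si": 28.085, "Cl": 35.45, "Ti": 47.867, "Fe": 55.845,
}


def parse_formula(formula):
    """Count atoms in a flat formula such as 'Al2O3'."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"Unsupported formula: {formula!r}")
    counts = {}
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts


def formula_descriptors(formula):
    """Derive simple numeric descriptors from a formula.

    Raises KeyError for elements missing from the illustrative table.
    """
    counts = parse_formula(formula)
    molar_mass = sum(ATOMIC_WEIGHT[el] * n for el, n in counts.items())
    n_atoms = sum(counts.values())
    return {
        "molar_mass_g_mol": molar_mass,
        "mean_atomic_weight": molar_mass / n_atoms,
        "n_elements": len(counts),
    }


print(formula_descriptors("Al2O3"))
```

Numeric descriptors like these are what allow a text field such as a formula to be used as input to a model.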

Data Structure

The Citrine Platform uses a data format called GEMD (Graphical Expression of Materials Data). This data model captures and contextualizes materials data from procurement through to processing and characterization. Its visual nature helps researchers to see related data. The data is also available in a tabular interface for filtering and analysis.
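
The relationship between a graph of material history and a tabular view can be sketched in a few lines. The example below uses plain Python dataclasses and does not reproduce the GEMD schema or any Citrine API; the `Node` class, the `flatten` function, and all sample names and values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One step in a material's history (illustrative only, not the GEMD schema)."""
    name: str
    kind: str  # "material", "process", or "measurement"
    attributes: dict = field(default_factory=dict)
    inputs: list = field(default_factory=list)  # upstream Node objects


def flatten(node, row=None):
    """Walk upstream from a node and collect every attribute into one flat row."""
    row = {} if row is None else row
    for upstream in node.inputs:
        flatten(upstream, row)
    for key, value in node.attributes.items():
        row[f"{node.name}: {key}"] = value
    return row


# Assumed example history: procurement -> processing -> characterization.
powder = Node("Alumina powder", "material", {"supplier": "Vendor A"})
sinter = Node("Sintering", "process", {"temperature_C": 1600}, inputs=[powder])
pellet = Node("Sintered pellet", "material", inputs=[sinter])
density = Node("Density test", "measurement", {"density_g_cm3": 3.9}, inputs=[pellet])

row = flatten(density)
for column, value in row.items():
    print(f"{column} = {value}")
```

Each column name carries the step it came from, so a measured value in the table stays linked to the process conditions and source material that produced it.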

Key points on Citrine’s approach to data:

  • New fields can be easily specified
  • Real measured values are kept separately from intended recipe values
  • Data consistency is encouraged through templates and naming conventions
  • Data uncertainty is specified systematically
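
The key points above can be illustrated with a short sketch: a template constrains units and range, the intended recipe value is stored separately from the measured value, and the measured value carries an explicit uncertainty. This is an assumption-laden illustration in plain Python, not the Citrine Platform's data model; every class, field, and number below is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Value:
    mean: float
    std: Optional[float]  # None means no uncertainty was reported
    units: str


@dataclass(frozen=True)
class Template:
    """Shared definition that keeps a field consistent across records."""
    name: str
    units: str
    low: float
    high: float

    def check(self, value):
        if value.units != self.units:
            raise ValueError(f"{self.name}: expected {self.units}, got {value.units}")
        if not (self.low <= value.mean <= self.high):
            raise ValueError(f"{self.name}: {value.mean} is outside [{self.low}, {self.high}]")


@dataclass
class StepRecord:
    template: Template
    intended: Value  # what the recipe called for
    measured: Value  # what actually happened

    def __post_init__(self):
        self.template.check(self.intended)
        self.template.check(self.measured)

    def deviation_in_sigma(self):
        """Measured minus intended, in units of the measurement uncertainty."""
        if not self.measured.std:
            return None
        return (self.measured.mean - self.intended.mean) / self.measured.std


# Assumed example: a furnace set to 1600 degC that actually reached 1588 +/- 5 degC.
furnace_temp = Template("Sintering temperature", "degC", 0.0, 2000.0)
step = StepRecord(
    template=furnace_temp,
    intended=Value(1600.0, None, "degC"),
    measured=Value(1588.0, 5.0, "degC"),
)
print(step.deviation_in_sigma())
```

Keeping the two values side by side makes it possible to ask how far a run drifted from its recipe, instead of silently overwriting one with the other.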