What We've Learned

Over the last seven years of working with leading chemicals and materials companies, we have learned a thing or two about what matters in a Materials Informatics System (MIS). If you are evaluating different approaches and deciding on an implementation plan, here are seven things to look for:

1. Segregated and Encrypted Data

First things first, your product and materials data is one of your most valuable assets, and it needs to be secure. Make sure any vendors or partners you work with implement best practices in both physical and operational security, and can readily share information regarding their security program, such as industry standard security certifications and encryption protocols.

Segregation and encryption of customer data

2. Interfaces for Scientists, Data Managers, and Data Scientists

Curating and ingesting data to train AI models requires many detailed steps. A scalable Python API lets data managers and data scientists automate data ingestion and model-building workflows efficiently. However, an intuitive graphical user interface (GUI) is equally important, because it makes AI accessible to subject matter experts like research scientists and application engineers.

User interface, Python and graphical

3. A Flexible Data Model that Captures the Whole Context

Data from procurement through to characterization

Materials and chemicals data is complex. The properties of a material depend on a whole chain of variables from procurement to characterization to scale-up. In order for other researchers to trust and reuse that data, your MIS’s data model needs to capture both the metadata (e.g. the test conditions) and the materials’ history – input ingredients, processing steps, and measurements at each stage. Additionally, this data model needs to be flexible and adapt as your team pursues new research directions, purchases new equipment, or expands to new product lines.
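As a rough illustration of what "capturing the whole context" can mean, here is a minimal sketch of a record type that links each material to its ingredients, processing step, and measurements. Every class name, field name, and value below is hypothetical and invented for this example; it is not the data model of any particular MIS.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Measurement:
    """A measured property plus the conditions (metadata) of the test."""
    name: str
    value: float
    units: str
    conditions: Dict[str, str] = field(default_factory=dict)


@dataclass
class MaterialRecord:
    """One material at one stage of its history (hypothetical schema)."""
    name: str
    # Step that produced this material; None for a purchased raw material.
    process_step: Optional[str] = None
    # Free-form parameters so new equipment or settings need no schema change.
    parameters: Dict[str, str] = field(default_factory=dict)
    ingredients: List["MaterialRecord"] = field(default_factory=list)
    measurements: List[Measurement] = field(default_factory=list)

    def history(self, depth: int = 0) -> List[str]:
        """Walk back through the ingredients and return an indented lineage."""
        if self.process_step is None:
            label = self.name
        else:
            label = f"{self.name} <- {self.process_step}"
        lines = ["  " * depth + label]
        for ingredient in self.ingredients:
            lines.extend(ingredient.history(depth + 1))
        return lines


# Illustrative values only; they are not real measurements.
powder = MaterialRecord(name="alloy powder", parameters={"supplier_lot": "LOT-001"})
printed = MaterialRecord(
    name="printed bar",
    process_step="laser powder bed fusion",
    parameters={"laser_power": "200 W"},
    ingredients=[powder],
)
annealed = MaterialRecord(
    name="annealed bar",
    process_step="anneal",
    parameters={"temperature": "800 C", "time": "2 h"},
    ingredients=[printed],
    measurements=[
        Measurement("yield strength", 950.0, "MPa", {"test_temperature": "23 C"})
    ],
)

lineage = annealed.history()
print("\n".join(lineage))
```

Keeping parameters and test conditions as open key-value maps is one simple way to let the model adapt to new equipment or product lines; a production system would likely add units handling and validation on top.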

4. Methods to Convert Materials-Specific Information to Machine-Readable Data

The ability to translate information and domain knowledge into machine-readable data is especially important in materials and chemicals, where data is often sparse and generating more requires expensive testing and experimentation.

Take a chemical formula, for example. The letters and numbers can be unpacked to deliver far more information than which elements are present in what quantities. Interatomic distance, atomic weight, and so on can be calculated, and it may be these characteristics that influence the final properties of a material. An MIS needs a library of tools to convert materials-specific information, such as chemical formulas, micrographs, and X-ray diffraction images, into machine-readable data for incorporation into an AI workflow.

Converting micrographs and chemical formulas
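To make the chemical formula example concrete, here is a minimal sketch that unpacks a simple formula into a few numeric features. It assumes flat formulas with no parentheses or hydrates, and it uses a small hand-entered table of standard atomic weights; the function names and feature names are our own invention for illustration, not part of any specific library.

```python
import re
from typing import Dict

# Standard atomic weights in g/mol (rounded); a tiny subset for illustration.
ATOMIC_WEIGHT = {
    "H": 1.008, "C": 12.011, "O": 15.999, "Na": 22.990, "Al": 26.982,
    "Si": 28.085, "Cl": 35.45, "Ti": 47.867, "Fe": 55.845,
}

# An element symbol followed by an optional integer or decimal amount.
_TOKEN = r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?"


def parse_formula(formula: str) -> Dict[str, float]:
    """Return element -> amount for a flat formula such as 'Fe2O3'."""
    if not re.fullmatch(f"(?:{_TOKEN})+", formula):
        raise ValueError(f"Unsupported formula: {formula!r}")
    counts: Dict[str, float] = {}
    for symbol, amount in re.findall(_TOKEN, formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts


def featurize(formula: str) -> Dict[str, float]:
    """Turn a formula into numeric features a model can consume."""
    counts = parse_formula(formula)
    unknown = [el for el in counts if el not in ATOMIC_WEIGHT]
    if unknown:
        raise ValueError(f"No atomic weight on file for: {unknown}")
    total_atoms = sum(counts.values())
    molar_mass = sum(ATOMIC_WEIGHT[el] * n for el, n in counts.items())
    features = {
        "molar_mass": molar_mass,
        "mean_atomic_weight": molar_mass / total_atoms,
    }
    for el, n in counts.items():
        features[f"atomic_fraction_{el}"] = n / total_atoms
        features[f"mass_fraction_{el}"] = ATOMIC_WEIGHT[el] * n / molar_mass
    return features


features = featurize("Fe2O3")
for key, value in features.items():
    print(f"{key}: {value:.4f}")
```

Real featurization libraries go much further (for example, statistics over electronegativity or atomic radius), but the principle is the same: the formula string becomes a fixed set of numbers that reflect domain knowledge.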

5. Intuitive Data Visualizations

“A picture is worth a thousand words” is an aphorism that is doubly true in this context. Trawling through data tables can be tedious, and it doesn’t help you see the big picture. An MIS should have easy ways to visualize data to help researchers and engineers identify interesting trends and product development opportunities.

Materials process history diagram in the Citrine Platform

6. Ways to Build Off Existing Domain Knowledge

The laws of physics don’t need to be repeatedly discovered by AI. By using what you already know about how composition, processing parameters, and other inputs relate to final material properties, you can give an AI model a head start. The model can then focus on figuring out the unknowns. Two things help with this:

  1. A graphical, modular approach to machine learning models. By mapping out the connections between input parameters and final target properties, researchers can easily collaborate with data scientists. If these models can be re-used across projects, future projects can leverage your team’s historical work.
  2. The ability to incorporate known physics and analytical relationships into the model-building process.

Domain knowledge integration into AI models
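One common way to incorporate a known analytical relationship is to let the physics supply a baseline and fit a data-driven model only to what the baseline misses. The sketch below assumes a rule-of-mixtures baseline for a two-component blend and a simple linear correction for processing temperature. The function names, property values, and synthetic data are all hypothetical; this illustrates the general idea and is not how any particular platform implements it.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple


def rule_of_mixtures(fraction_a: float, prop_a: float = 10.0, prop_b: float = 2.0) -> float:
    """Known physics: the blend property is a weighted average of its components."""
    return fraction_a * prop_a + (1.0 - fraction_a) * prop_b


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_hybrid(
    fractions: Sequence[float],
    temperatures: Sequence[float],
    measured: Sequence[float],
) -> Callable[[float, float], float]:
    """Fit a correction to the residual that the physics baseline leaves behind."""
    residuals = [m - rule_of_mixtures(f) for f, m in zip(fractions, measured)]
    slope, intercept = fit_line(temperatures, residuals)

    def predict(fraction_a: float, temperature: float) -> float:
        # Physics baseline plus the learned correction.
        return rule_of_mixtures(fraction_a) + slope * temperature + intercept

    return predict


# Synthetic data: the baseline plus an unknown temperature effect to be learned.
fractions = [0.2, 0.4, 0.6, 0.8]
temperatures = [100.0, 150.0, 200.0, 250.0]
measured = [rule_of_mixtures(f) + 0.01 * t - 1.0 for f, t in zip(fractions, temperatures)]

predict = fit_hybrid(fractions, temperatures, measured)
prediction = predict(0.5, 300.0)
print(f"Predicted property at fraction 0.5 and 300 degrees: {prediction:.3f}")
```

Because the composition dependence comes from the known relationship, the four data points are spent entirely on learning the temperature effect, which is the "head start" described above. In practice the correction would usually be a more flexible model than a straight line.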

7. Systematic Calculation of Uncertainty

AI predictions are not useful in materials and chemicals if you don’t know their associated uncertainty. If a researcher needs a solution now, they want to home in on materials that are likely to hit their targets. On the other hand, if your team is exploring a more open-ended R&D project, you may want to explore areas of chemistry or processing parameters that haven’t been tested before. AI models that incorporate uncertainty calculations will be able to highlight these more “exploratory” regions.

Uncertainty calculations help you answer questions like:

  • Which of these research directions has the highest chance of hitting target properties?
  • Is the cost of this new ingredient justified by better performance?
  • If we restrict our ingredient set to be lead-free, will we still get the properties we need?
  • Which research project has the highest chance of success?

Design space comparison
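To show how an uncertainty estimate feeds into questions like these, here is a minimal sketch that ranks candidates two ways: by the probability of meeting a target, and by how uncertain the prediction is. It assumes each prediction can be treated as a Gaussian with a known mean and a positive standard deviation, which is a simplification. The candidate names and numbers are invented for illustration.

```python
from dataclasses import dataclass
from statistics import NormalDist
from typing import List


@dataclass
class Prediction:
    """A model prediction for one candidate (hypothetical structure)."""
    name: str
    mean: float
    std: float  # predictive standard deviation; must be greater than zero


def probability_of_meeting(pred: Prediction, target: float) -> float:
    """P(property >= target), assuming a Gaussian predictive distribution."""
    return 1.0 - NormalDist(pred.mean, pred.std).cdf(target)


# Invented candidates: A is well characterized, B is promising but uncertain.
candidates: List[Prediction] = [
    Prediction("A", mean=105.0, std=2.0),
    Prediction("B", mean=110.0, std=15.0),
    Prediction("C", mean=95.0, std=3.0),
]
target = 100.0

# "Need a solution now": prefer candidates most likely to hit the target.
by_success = sorted(
    candidates, key=lambda p: probability_of_meeting(p, target), reverse=True
)

# "Open-ended exploration": prefer candidates the model knows least about.
by_uncertainty = sorted(candidates, key=lambda p: p.std, reverse=True)

for p in by_success:
    print(f"{p.name}: P(meets target) = {probability_of_meeting(p, target):.3f}")
print("Most exploratory candidate:", by_uncertainty[0].name)
```

Note that candidate B has the highest predicted mean but is not the safest choice, because its uncertainty is large. Without the standard deviation, the two ranking strategies could not be told apart.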


There are many other aspects of an MIS that help a company get the most business value from its materials and chemicals data. You can read an in-depth white paper on the topic here.

You can also watch a recent webinar on the topic here.