What is Materials and Chemicals Informatics?
Materials and Chemicals Informatics is the application of data science, materials science, and machine learning to the materials and chemicals space.
What is Machine Learning?
People are good at spotting patterns, but computers can spot patterns faster and across many more dimensions.
Machine learning finds the correlations among many inputs and outputs and builds a model that can predict the outputs for new inputs.
Machine learning can evaluate many different scenarios quickly and cost-effectively, which makes it more likely that novel outcomes are found.
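As a minimal sketch of the idea of fitting a model to known inputs and outputs and then predicting the output for a new input, the Python example below fits a straight line by least squares. The data values and the function names fit_line and predict are invented for illustration; real materials models typically use many more inputs and more flexible model types, and this is not a description of how the Citrine Platform builds its models.

```python
# Toy training data, invented for illustration only:
# each x is a hypothetical process setting, each y a hypothetical measured property.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the output for a new input using the fitted line."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(train_x, train_y)   # learn from known inputs and outputs
prediction = predict(model, 4.0)     # predict the output for an unseen input
print(model, prediction)
```

The same two steps, fitting on existing data and predicting for new inputs, apply whatever model type is used.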
How much data is needed for Materials Informatics?
Data fuels machine learning. More high-quality data leads to accurate results more quickly. Because testing is expensive and time-consuming, materials and chemicals data sets tend to be small and sparse.
The Citrine Platform is specially developed to deal with complex, small, and sparse data sets. Starting with only a few hundred data points is common. Citrine has tackled projects with fewer than 30 initial data points!
Materials data is highly complex, and context matters. It is important to capture the whole context of the data so that researchers can confidently reuse the data in the future.
What is Sequential Learning?
Sequential Learning is a way for your materials scientists and your data scientists to work together iteratively to improve results in the fewest steps possible.
Not all experiments are equally useful. Some experiments provide more information than others. With Sequential Learning, each new experiment is chosen to maximize the amount of useful information it provides.
The results from experiments are used to update the machine learning models, and the process is repeated. In this way the process efficiently drives toward improvements.
Through Sequential Learning you can easily ignore dead ends, avoid experiments whose outcome you can already predict, and focus on promising new opportunities.
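The loop described above (choose the most informative experiment, run it, update the model, repeat) can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: run_experiment is a hypothetical stand-in for a real lab measurement, the "model" is a simple nearest-neighbor estimate, and the scoring rule (predicted value plus a bonus for distance from existing data) is one common way to balance promising and unexplored candidates. It is not a description of the method used by the Citrine Platform.

```python
def run_experiment(x):
    """Hypothetical stand-in for a lab measurement (best result near x = 0.7)."""
    return -(x - 0.7) ** 2


def nearest_measured(x, measured):
    """Return (distance, value) of the measured point closest to x."""
    return min((abs(x - mx), my) for mx, my in measured.items())


def score(x, measured, kappa=1.0):
    """Predicted value plus an exploration bonus for poorly covered regions."""
    distance, value = nearest_measured(x, measured)
    return value + kappa * distance


def sequential_learning(candidates, measured, rounds):
    """Repeatedly pick the best-scoring untested candidate and measure it."""
    measured = dict(measured)  # copy so the caller's data is not modified
    for _ in range(rounds):
        untested = [x for x in candidates if x not in measured]
        if not untested:
            break
        next_x = max(untested, key=lambda x: score(x, measured))
        measured[next_x] = run_experiment(next_x)  # new result updates the model
    return measured


# Candidate settings to choose from, and two initial measurements.
candidates = [i / 20 for i in range(21)]
initial = {0.0: run_experiment(0.0), 1.0: run_experiment(1.0)}

results = sequential_learning(candidates, initial, rounds=6)
best_x = max(results, key=results.get)
best_y = results[best_x]
print(len(results), best_x, best_y)
```

In a real project the scientist, not the loop, decides which of the proposed experiments to run, and the surrogate model would be trained on the full context of the data rather than a single input.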
Will AI Replace Scientists?
The short answer is no. AI is a tool that scientists use to help make better decisions, and it is driven by the scientists' domain knowledge and business decision-making:
- They set the targets
- They decide which data is useful
- They make the models efficient by supplying known relationships between properties
- They decide which of the proposed experiments to perform
- They make trade-offs between performance, cost, and other factors
AI doesn’t replace scientists; it saves them time. With fewer unproductive experiments to analyze, they can use their expertise to work on more successful projects.