An AI-powered chatbot can predict chemical reactions

According to researchers, a machine learning system like ChatGPT could, with a little fine-tuning, become surprisingly good at answering chemistry research questions. General-purpose AI systems of this kind can predict the properties of molecules and materials, or the outcomes of chemical reactions, as well as or better than more specialized models, and they require fewer modifications to do so.

Chatbots trained in a similar way to ChatGPT could make machine learning in chemistry much easier to use, giving chemistry laboratories with limited resources a powerful new research tool.

Chemistry training

Large language models are artificial neural networks that learn from huge collections of text. They generate answers to questions or assignments by statistically predicting which words are likely to come next. To find out what these systems could mean for chemistry, a research team from Friedrich Schiller University examined GPT-3, an early version of the “brain” behind the chatbot ChatGPT.

The researchers first collected information about chemical compounds and substances similar to those they wanted to ask about. They rewrote that information as a set of at most thirty questions and answers, then sent the data to OpenAI, the company behind ChatGPT, to be added to GPT-3's training set.
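As a rough illustration of the data preparation step described above, the sketch below turns a handful of records into question-and-answer pairs and writes them out one JSON object per line. The article does not describe the format the researchers used, so the `prompt`/`completion` field names, the question template, the helper names `to_qa_pairs` and `to_jsonl`, and the example records are all assumptions made for illustration.

```python
import json


def to_qa_pairs(records, property_name, max_pairs=30):
    """Turn (name, value) records into question/answer pairs.

    The cap of 30 mirrors the "at most thirty" figure in the article;
    the question wording is a hypothetical template.
    """
    pairs = []
    for name, value in records[:max_pairs]:
        pairs.append({
            "prompt": f"What is the {property_name} of {name}?",
            "completion": str(value),
        })
    return pairs


def to_jsonl(pairs):
    """Serialize the pairs as one JSON object per line."""
    return "\n".join(json.dumps(pair) for pair in pairs)


# Placeholder records, not real measurements.
records = [
    ("EXAMPLE_ALLOY_1", "single phase"),
    ("EXAMPLE_ALLOY_2", "multiple phases"),
]

pairs = to_qa_pairs(records, "phase structure")
jsonl_text = to_jsonl(pairs)
print(jsonl_text)
```

Each line of the output is a separate training example. The fine-tuned model would later be asked a question in the same template about a substance that does not appear in the file.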

The fine-tuned system could answer predictive questions about the compounds and substances the researchers wanted to study, even though these were not explicitly included in the input data. So GPT-3 can predict things correctly without needing explicit knowledge.

For example, the researchers tested the system’s suitability for answering questions about high-entropy alloys (entropy being a measure of disorder, here in how the atoms are arranged). These alloys consist of approximately equal amounts of two or more metals. Much is known about ordinary alloys such as steel, but this is not the case for high-entropy alloys. Nevertheless, the fine-tuned language model was able to predict successfully how the metals in one of these alloys are arranged.

Lowering the barrier

When the researchers asked questions about substances not included in the training data, the accuracy was similar to that of more specialized machine learning tools in chemistry, and even to that of explicitly programmed computer simulations.

The researchers also showed they could achieve similar results when they fine-tuned GPT-J, an open-source alternative to GPT-3. This means that labs with a small budget can develop their own version without having to pay for, or seek help from, a commercial provider.

Currently, humans are still needed to collect the information and prepare the inputs for the language models, but the researchers are now designing future versions that can perform this step automatically by extracting text from existing literature.

This article previously appeared on Nature News.

Translation: June Rolls

Megan Vasquez