Combining data-driven science and computational chemistry can significantly accelerate materials discovery

(Nanowerk Spotlight) Siloxanes – a class of manufactured silicon derivatives, also known as silicones – are widely used (with an annual volume of 2.8 million tonnes in 2018) in medicine and industrial applications, but mostly in cosmetics and personal care products such as deodorants, shampoos, skin creams and hair styling products. These are the substances that carry the pleasant smell and make the contents smooth and easy to apply.
However, siloxanes can also be organic contaminants. When we use siloxane-containing products, the siloxanes either evaporate or are washed off and flushed down the drain, thus ending up in the environment. Due to their high vapor pressure, siloxanes are persistent and prone to bioaccumulation, making it challenging to remove them from various environmental media. In closed-volume applications, such as water reclamation activities in space missions, these siloxanes accumulate in water to concentrations that are toxic to humans. Trace amounts of siloxanes also present a major challenge for analytical chemists.
Researchers have been aware since the 1990s that some siloxanes can induce toxic effects in several aquatic organisms. Siloxanes are classified into linear and cyclic compounds and can differ widely in particle size, molecular weight, shape and chemical groups. These differences determine the physico-chemical properties that directly affect the safety or risk of their use.
Research has shown that the release of some siloxanes could have severe impacts and potential toxic effects on animal organisms. Examples are oestrogen mimicking, connective tissue disorder, adverse immunologic effects, and eventually fatal liver or lung damage. Even worse, siloxanes could mask the presence of other contaminants in the detection systems, which hinders the effective removal of other pollutants.
Developing suitable sorbents is a cost-effective solution for the removal of siloxanes. Pure-silica zeolites (PSZs) exhibit outstanding structural advantages as adsorbent materials. As a type of microporous material consisting of only silicon and oxygen atoms, PSZs are hydrophobic and without any acid sites. Moreover, PSZs are thermally stable and can be easily regenerated when their pores are blocked. These unique features make PSZs potential sorbents for siloxane removal.
However, there are millions of possible PSZs, and screening these PSZs one by one to determine if they are promising candidates for siloxane removal is simply not possible.
"Machine Learning offers us a powerful tool to solve such complex problems," Prof. Zhongfang Chen from the Department of Chemistry, University of Puerto Rico, tells Nanowerk. "Different from rules-based systems that require much experience, time, and efforts, Machine Learning generates mathematical models from experimental and computational data at speeds and scales that far exceed human capabilities."
Publishing their findings in Journal of Materials Chemistry A ("Machine-learning-assisted screening of pure-silica zeolites for effective removal of linear siloxanes and derivatives"), a team led by Chen and Prof. Arturo J. Hernández-Maldonado (University of Puerto Rico) proposes a two-step computational framework (Figure 1) combining Grand Canonical Monte Carlo (GCMC) simulations and Machine Learning (ML) methods to investigate the adsorption performances of pure-silica zeolites.
This work highlights the promise of combining data-driven modelling with traditional computations to predict the performance of complex zeolite systems. It can provide guidance for future computational and experimental studies of pure-silica zeolites for the removal of problematic compounds.
Figure 1. Flow chart showing two-step computational screening to achieve prominent zeolites for adsorbing four linear siloxanes and derivatives. (Image: University of Puerto Rico)
Using this model, it becomes possible to screen promising pure-silica zeolites with excellent adsorption performance towards four linear siloxanes and derivatives on a short time scale.
"We obtained essential features and screened out 230 preeminent zeolites from 50 959 hypothetical 16-layers pure-silica zeolites – a picking ratio of ∼0.45% – with excellent adsorption performance regarding our set of four problematic compounds," says Shiru Lin, the first author of the paper. "Further GCMC simulations verified that all the 20 randomly chosen ML-recommended PSZs have excellent adsorption performance towards our four targets."
The team randomly chose 500 zeolites from the 50 959 PSZs and computed their average adsorption loading (mol) and adsorption energy (kcal/mol) with GCMC simulations. Because adsorption energies towards linear siloxanes are commonly low, they used the adsorption energy as the criterion to classify PSZs in this work. Zeolites with adsorption energies in the top 20% are classified as class-1 (great zeolites, triangle points in Figure 2), while the rest are classified as class-0 (bad zeolites, square points in Figure 2).
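The labelling rule described above can be illustrated with a minimal Python sketch. The function name and the energy values are made up for the example, and it assumes that energies are given as magnitudes, so that a larger value means stronger adsorption; the article does not state the sign convention the team used.

```python
def label_top_fraction(energies, fraction=0.2):
    """Label the strongest `fraction` of entries 1 (class-1) and the rest 0 (class-0).

    Assumption: larger values mean stronger adsorption (energies as magnitudes).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_top = max(1, round(len(energies) * fraction))
    # Indices sorted from strongest to weakest adsorption.
    ranked = sorted(range(len(energies)), key=lambda i: energies[i], reverse=True)
    top = set(ranked[:n_top])
    return [1 if i in top else 0 for i in range(len(energies))]


# Made-up adsorption energies (kcal/mol) for ten hypothetical zeolites.
energies = [12.1, 9.4, 15.3, 10.0, 8.7, 14.8, 11.2, 9.9, 13.0, 10.5]
labels = label_top_fraction(energies)
print(labels)  # the two largest values (15.3 and 14.8) get label 1
```

Applied to a training set of 500 zeolites, the same rule would put 100 of them in class-1 and 400 in class-0.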
Figure 2. The average adsorption energy and loading values of 500 PSZs towards four problematic compounds, where triangles represent the top 20% zeolites (class-1) ranked by adsorption energy, while the square points are the other 80% zeolites (class-0), and the light yellow sections denote the top 90% class-1 zeolites ranked by adsorption loading. (Image: University of Puerto Rico)
"The adsorption energies of these 500 PSZs towards each problematic compound cover a relatively big range," notes Chen. "In detail, the differences between the highest and the lowest adsorption energies for trimethylsilanol (TMS) (7.72 kcal/mol) and monomethylsilanetriol (MMST) (6.90 kcal/mol) are around two times larger than those for dimethylsulfone (DMSO2) (4.21 kcal/mol) and dimethylsilanediol (DMSD) (3.03 kcal/mol). Such adsorption energy differences strongly suggest that the structures of zeolite frameworks can significantly influence their adsorption performances, especially for TMS and MMST."
In this study, the team selected five relevant features for their model, namely three crystal parameters (a, b and c /Å), pore diameter (p /Å), and probe-accessible surface area (s /Å² per unit cell). Pore diameters and probe-accessible surface areas describe the size and the area of common adsorption locations, and the crystal parameters (a, b and c) provide additional information on the overall shape of the zeolites.
After training the models, they achieved scores of 0.99 for the training sets, and scores of 0.91, 0.90, 0.91 and 0.89 for DMSO2, TMS, DMSD and MMST, respectively, in the test sets.
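As a rough illustration of this training step, the sketch below fits a Random Forest classifier (the model type the team reports using) on five features for one target compound. It uses scikit-learn for convenience; the article does not say which software the team used. The feature values and class labels are synthetic, and the 80/20 split, the number of trees and the helper names are assumptions made for the example, not details taken from the paper.

```python
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Feature order: cell lengths a, b, c; pore diameter p; accessible surface area s.
FEATURES = ["a", "b", "c", "p", "s"]


def make_synthetic_zeolites(n, seed=0):
    """Return (X, y) with made-up feature values; NOT real zeolite data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        a, b, c = (rng.uniform(5.0, 40.0) for _ in range(3))
        p = rng.uniform(2.0, 12.0)
        s = rng.uniform(0.0, 2000.0)
        X.append([a, b, c, p, s])
        # Arbitrary toy rule so the classes are learnable:
        # mid-sized pores are labelled class-1 (roughly 20% of samples).
        y.append(1 if 5.0 <= p <= 7.0 else 0)
    return X, y


def train_and_score(X, y, seed=0):
    """Fit a Random Forest and return (model, train_score, test_score)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    # For classifiers, .score() returns the mean accuracy.
    return model, model.score(X_train, y_train), model.score(X_test, y_test)


X, y = make_synthetic_zeolites(500)
model, train_score, test_score = train_and_score(X, y)
print(f"train score: {train_score:.2f}, test score: {test_score:.2f}")
```

In a setup like the one described, one such model would be trained per target compound (DMSO2, TMS, DMSD and MMST), each with its own class labels.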
"The high training and test scores demonstrate that these Random Forest models with selected five features can well describe the effects of structural parameters of PSZs on adsorption performances, and these models are expected to have outstanding predictive power to classify the adsorption performances of much more PSZs towards the four PCs under consideration," explains Hernández-Maldonado.
"We also checked the confusion matrices and found that predictive accuracy for class-1 is higher than that for class-0, which guarantees that our ML models would not miss promising zeolites," he adds.
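To make the confusion-matrix check concrete, here is a small Python sketch that counts predictions per true class and derives a per-class accuracy (recall). Reading "predictive accuracy for class-1" as the recall of class-1 is an assumption, and the labels below are invented for illustration.

```python
def confusion_matrix(y_true, y_pred):
    """Return a 2x2 matrix m where m[t][p] counts samples of true class t predicted as p."""
    m = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m


def per_class_recall(m):
    """Fraction of each true class that was predicted correctly."""
    return [m[k][k] / sum(m[k]) if sum(m[k]) else 0.0 for k in (0, 1)]


# Made-up labels for ten hypothetical test-set zeolites.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
m = confusion_matrix(y_true, y_pred)
recall_0, recall_1 = per_class_recall(m)
print(m, recall_0, recall_1)
```

In this toy case every true class-1 zeolite is recovered, while two class-0 zeolites are wrongly flagged as promising. For a screening task that is the cheaper kind of error: a false positive costs one extra follow-up simulation, whereas a missed class-1 zeolite is lost for good.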
"To reconfirm the accuracy of ML models and the superior adsorption performance of the selected 230 PSZs, we randomly chose two sets – 10 each – of zeolites from these 230 four-class-1 zeolites, and computed their average adsorption energies and adsorption loading towards four problematic compounds by GCMC simulations," explains Lin. "These 20 zeolites all have class-1 adsorption energies and average loading values higher or close to the top 30% zeolites in the training data of 500 zeolites."
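The final screening and spot-check steps can be sketched in a few lines of Python. The sketch assumes that a "four-class-1" zeolite is one predicted as class-1 by all four per-compound models and that the two validation sets do not overlap; the function names, zeolite IDs and predictions are hypothetical.

```python
import random


def screen_all_targets(predictions):
    """Return IDs predicted class-1 by every per-compound model.

    `predictions` maps compound name -> {zeolite_id: predicted class (0 or 1)}.
    """
    per_compound = [
        {zid for zid, cls in preds.items() if cls == 1}
        for preds in predictions.values()
    ]
    return sorted(set.intersection(*per_compound)) if per_compound else []


def draw_validation_sets(candidates, n_sets=2, set_size=10, seed=0):
    """Randomly draw non-overlapping sets of candidates for follow-up simulation."""
    rng = random.Random(seed)
    picked = rng.sample(candidates, n_sets * set_size)
    return [picked[i * set_size:(i + 1) * set_size] for i in range(n_sets)]


# Toy predictions for four hypothetical zeolites.
predictions = {
    "DMSO2": {"Z1": 1, "Z2": 1, "Z3": 0, "Z4": 1},
    "TMS": {"Z1": 1, "Z2": 0, "Z3": 1, "Z4": 1},
    "DMSD": {"Z1": 1, "Z2": 1, "Z3": 1, "Z4": 1},
    "MMST": {"Z1": 1, "Z2": 1, "Z3": 0, "Z4": 1},
}
selected = screen_all_targets(predictions)

# Picking ratio from the numbers reported in the article.
picking_ratio = 230 / 50959
print(selected, f"{picking_ratio:.2%}")

# Two sets of 10 drawn from 230 placeholder candidate IDs.
validation_sets = draw_validation_sets(list(range(230)))
```

Only Z1 and Z4 pass all four models in the toy data. With the reported figures, 230 out of 50 959 corresponds to a picking ratio of about 0.45%.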
"Our work vividly demonstrates that the collocation of data-driven science and computational chemistry can significantly accelerate materials discovery, and help solve the most challenging separation problems in environmental science," Chen concludes. "In our work, the screening process was accelerated by ML methodology, and the candidate list of PSZs was dramatically reduced, which provides good guidance for future experimental and theoretical investigations on developing potent materials for siloxane removal. Our methodology can be extended to other sorbents such as Al-containing zeolites, metal-doped zeolites, and metal-organic frameworks. In addition, the contaminants can also be other difficult to remove organic/inorganic compounds."
By Michael Berger – Michael is author of three books by the Royal Society of Chemistry:
Nano-Society: Pushing the Boundaries of Technology,
Nanotechnology: The Future is Tiny, and
Nanoengineering: The Skills and Tools Making Technology Invisible
Copyright © Nanowerk LLC
