Interpreting a spectrum is not always simple and requires considerable expertise. A spectrum can be compared with a database containing numerous reference material properties, but features of unknown materials that are absent from the database are problematic and often have to be interpreted using spectral simulations and theoretical calculations. In addition, modern spectroscopy instruments can generate tens of thousands of spectra from a single experiment, which places considerable strain on conventional human-driven interpretation methods and calls for a more data-driven approach.
Big data analysis techniques have been attracting attention in materials science, and researchers at The University of Tokyo Institute of Industrial Science realised that such techniques could be used to interpret much larger numbers of spectra than traditional approaches allow. “We developed a data-driven approach based on machine learning techniques using a combination of the layer clustering and decision tree methods”, states co-corresponding author Teruyasu Mizoguchi.
The team used theoretical calculations to construct a spectral database in which each spectrum had a one-to-one correspondence with its atomic structure and where all spectra contained the same parameters. Combining the two machine learning methods yielded both a spectral interpretation method and a spectral prediction method, the latter for cases where a material’s atomic configuration is known. The approach was successfully applied to the interpretation of complex spectra from two core-electron loss spectroscopy methods, energy-loss near-edge structure (ELNES) and X-ray absorption near-edge structure (XANES), and was also used to predict spectral features when material information was provided. “Our approach has the potential to provide information about a material that cannot be determined manually and can predict a spectrum from the material’s geometric information alone”, says lead author Shin Kiyohara.
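To make the combination of clustering and a decision tree more concrete, the following is a minimal sketch on toy data, not the team's implementation. It assumes that "layer clustering" refers to hierarchical (agglomerative) clustering, uses a single hypothetical structural descriptor (`bond_length`), invents six single-peak spectra for illustration, and reduces the decision tree to one split. Spectra are grouped by shape, the tree learns which descriptor values lead to which group, and the same two pieces then serve for prediction (descriptor to spectrum) and interpretation (spectrum to group).

```python
import math

def gaussian_spectrum(center, grid, width=1.0):
    """Toy single-peak spectrum sampled on an energy grid."""
    return [math.exp(-((e - center) ** 2) / (2 * width ** 2)) for e in grid]

def distance(a, b):
    """Euclidean distance between two spectra on the same grid."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_spectrum(spectra):
    return [sum(col) / len(spectra) for col in zip(*spectra)]

def agglomerative(spectra, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per spectrum."""
    clusters = [[i] for i in range(len(spectra))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(distance(spectra[a], spectra[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair of clusters
    labels = [0] * len(spectra)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    return 1.0 - sum((labels.count(c) / len(labels)) ** 2 for c in set(labels))

def majority(labels):
    return max(set(labels), key=labels.count)

def fit_stump(values, labels):
    """One-split decision tree: the threshold minimising weighted Gini impurity."""
    order = sorted(set(values))
    best = None
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, t, majority(left), majority(right))
    return best[1:]

# Hypothetical database: (bond_length, peak position) pairs, invented for illustration.
grid = [0.5 * k for k in range(31)]
samples = [(1.8, 4.95), (1.9, 5.00), (2.0, 5.05),
           (2.4, 9.95), (2.5, 10.00), (2.6, 10.05)]
bond_lengths = [b for b, _ in samples]
spectra = [gaussian_spectrum(c, grid) for _, c in samples]

# Step 1: group spectra by shape. Step 2: learn which descriptor values give which group.
labels = agglomerative(spectra, n_clusters=2)
threshold, left_label, right_label = fit_stump(bond_lengths, labels)
centroids = {c: mean_spectrum([s for s, l in zip(spectra, labels) if l == c])
             for c in set(labels)}

def predict_spectrum(bond_length):
    """Prediction: descriptor -> tree -> representative spectrum of that cluster."""
    label = left_label if bond_length <= threshold else right_label
    return centroids[label]

def interpret(spectrum):
    """Interpretation: spectrum -> nearest cluster, whose tree rule describes the structure."""
    return min(centroids, key=lambda c: distance(spectrum, centroids[c]))

print("cluster labels:", labels)
print("learned split on bond_length at", threshold)
```

In this sketch, `predict_spectrum(2.5)` returns the mean spectrum of the cluster the tree assigns to that descriptor value, while `interpret(...)` maps an unseen spectrum to its nearest cluster, and the tree's split then indicates which descriptor range that cluster corresponds to. A realistic version would use many descriptors, a deeper tree, and calculated ELNES/XANES spectra rather than toy Gaussians.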
The proposed machine learning method is not restricted to ELNES/XANES spectra, however, and can be used to analyse any spectral data quickly and accurately without the need for specialist expertise. As a result, the method is expected to have wide applicability.
The work is described in Scientific Reports.