CSFK researchers use artificial intelligence to study the process of star and planet formation within the framework of the international NEMESIS project


The Konkoly Thege Miklós Astronomical Institute at the ELKH Research Centre for Astronomy & Earth Sciences, together with the University of Vienna and the University of Geneva, is attempting to redefine the classification of young stars and the early stages of stellar and planetary evolution. The NEMESIS (Novel Evolutionary Model for the Early Stars with Intelligent Systems) project, funded through the European Union's Horizon 2020 research and innovation framework program, has the primary goal of creating the largest database of young stars to date. In addition, the researchers plan to use artificial intelligence to build a model of star formation that goes beyond current theories and can provide a more complete explanation of the phenomena observed with modern astronomical instruments.

Artificial intelligence has been a staple of science fiction novels for decades, but until recently it was rarely encountered in everyday life. That situation has changed dramatically: today, artificial intelligence pervades our lives, from self-driving cars to facial recognition on phones and websites and personalized advertising on the Internet. For artificial intelligence and machine learning to work reliably, however, large amounts of data are needed to identify trends and patterns. Over recent decades, more and more data have become available to astronomers, and this body of data has grown to the point where it can no longer be handled by conventional methods, making it necessary to apply big data, machine learning and deep learning methods to astronomy.

In the case of face recognition, the various algorithms translate the face itself into the language of mathematics, i.e. numbers that give, for example, the distance between the left and right edges of the face, the distance between the chin and the top of the head, the ratio between the two, the distance between the eyes and the ears, etc.

"The patterns used in astronomy are similar in the language of mathematics, but are derived from measurable characteristics of stars, such as the brightness they emit at different wavelengths, the proportion of these, the chemical elements they contain, and the characteristics of their environment. Young stars, for example, can be found in environments where there is a lot of interstellar dust and gas, since they were formed from these not so long previously, of course, if we measure time on a cosmic timescale," says Dr Gábor Marton, a research fellow at the Astronomical Institute and the national coordinator of the NEMESIS project.

A systematic classification of the different stages of star formation only became possible in the 1980s, thanks to the first infrared observations and theoretical calculations. Today, more than 25 years after the classification of young stars was first interpreted in a coherent evolutionary context, much more and better data are available. In addition, researchers have much more advanced computational tools and methods to reassess initial hypotheses, and define new criteria and assumptions.
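
As a rough illustration of how measurable characteristics such as brightness at different wavelengths and the ratios between them can be turned into numbers, the sketch below builds a small feature vector for a single star. It is not the NEMESIS pipeline: the function names are hypothetical, the wavelengths are the approximate band centres of the 2MASS Ks band and the four WISE bands, and the flux values are made up for the example. The slope it computes is the kind of infrared spectral slope used in traditional classification schemes, fitted here with a plain least-squares line.

```python
import math

def spectral_slope(wavelengths_um, flux_densities):
    """Least-squares slope of log10(lambda * F_lambda) against log10(lambda)."""
    xs = [math.log10(w) for w in wavelengths_um]
    ys = [math.log10(w * f) for w, f in zip(wavelengths_um, flux_densities)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def star_features(wavelengths_um, flux_densities):
    """Feature vector: ratios of neighbouring bands, followed by the spectral slope."""
    ratios = [
        flux_densities[i + 1] / flux_densities[i]
        for i in range(len(flux_densities) - 1)
    ]
    return ratios + [spectral_slope(wavelengths_um, flux_densities)]

# Approximate band centres in micrometres (2MASS Ks, WISE W1 to W4).
bands_um = [2.2, 3.4, 4.6, 12.0, 22.0]
# Made-up flux densities per unit wavelength, for illustration only.
example_fluxes = [1.0e-13, 6.0e-14, 4.0e-14, 2.0e-14, 1.5e-14]

print(star_features(bands_um, example_fluxes))
```

A star whose emission rises towards longer wavelengths gives a positive slope, which is what one would expect from an object still deeply embedded in cold dust.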


The Rosette Nebula in far-infrared light (at 70, 160 and 250 μm wavelengths) as seen by the Herschel space telescope. Each bright speck embedded in the diffuse dust is a young star in the process of forming. The bluish parts of the cloud are colder, while the reddish parts are warmer. Dense, cold interstellar clouds like the Rosette are the prime targets of the NEMESIS project, as they contain the youngest stellar cores (Photo: ESA https://www.esa.int/Science_Exploration/Space_Science/Herschel/Baby_stars_in_the_Rosette_cloud)

Dr Odysseas Dionatos, researcher at the University of Vienna and coordinator of the consortium, said: "The latest evidence suggests that planets start to form at the same time as stars, so that star and planet formation are not two successive stages but a rapid, simultaneous event. The classification based on the wavelength of the stellar emission has been of great help in determining the all-encompassing parameters of young stars, but there is tremendous uncertainty concerning the specific evolutionary timescales. The research will reinterpret the current classification scheme and the characteristic outlier time scales. Supervised and unsupervised machine learning methods will be used to process the available data in order to answer the most relevant questions about stellar and planetary evolution."
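
The difference between the two families of methods mentioned in the quote can be shown with a toy example. The sketch below is only an illustration under stated assumptions, not the project's method: the two-dimensional feature values, the labels "young" and "evolved", and all function names are invented for the example. The supervised part learns one average position per known label and assigns a new object to the closest one (a nearest-centroid classifier); the unsupervised part uses k-means to find groups without being given any labels.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

# Supervised: learn one centroid per known label, then classify new objects.
def fit_nearest_centroid(points, labels):
    groups = {}
    for p, label in zip(points, labels):
        groups.setdefault(label, []).append(p)
    return {label: centroid(members) for label, members in groups.items()}

def predict(model, point):
    names = list(model)
    return names[nearest(point, [model[name] for name in names])]

# Unsupervised: k-means groups the objects without using any labels.
def k_means(points, initial_centers, iterations=20):
    centers = list(initial_centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, [nearest(p, centers) for p in points]

# Invented (feature_1, feature_2) values, for illustration only.
points = [(0.1, -2.0), (0.2, -1.8), (0.15, -2.1), (1.5, 0.5), (1.7, 0.4), (1.6, 0.6)]
labels = ["evolved", "evolved", "evolved", "young", "young", "young"]

model = fit_nearest_centroid(points, labels)
print(predict(model, (1.4, 0.3)))

centers, assignments = k_means(points, [points[0], points[3]])
print(assignments)
```

In the supervised case the quality of the result depends on how trustworthy the existing labels are; in the unsupervised case the groups emerge from the data alone and still have to be interpreted afterwards.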


The figure shows the NGC 1333 star formation region at different wavelengths of light: (a) visible, (b) mid-infrared, (c) far-infrared, (d) and (e) sub-millimeter. Images at different wavelengths can help distinguish young objects of different ages. The younger an object, the colder it is, and the more visible it is in longer wavelength images. The observed wavelength increases from panel (a) to panel (e), so increasingly cold regions are shown.
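
The link between temperature and wavelength described in the caption follows from Wien's displacement law, which says that the wavelength at which an idealized thermal emitter is brightest is inversely proportional to its temperature. The short sketch below applies it to a few temperatures; the function name and the example temperatures are assumptions chosen for illustration, and real young stars surrounded by dust are not perfect blackbodies.

```python
# Wien's displacement constant, in micrometre-kelvin.
WIEN_CONSTANT_UM_K = 2897.77

def peak_wavelength_um(temperature_k):
    """Wavelength (in micrometres) at which a blackbody of this temperature peaks."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return WIEN_CONSTANT_UM_K / temperature_k

# Illustrative temperatures: a Sun-like surface, warm dust, cold dust.
for temperature_k in (5800.0, 300.0, 20.0):
    print(temperature_k, "K ->", round(peak_wavelength_um(temperature_k), 2), "micrometres")
```

Under these assumptions, material at a few tens of kelvin is brightest in the far-infrared, which is why the coldest and youngest objects stand out in the longest-wavelength images.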

Why is this relevant now? – "One reason was the lack of large-scale optical and infrared sky surveys. This has changed over the past decade, thanks to all-sky surveys such as Gaia, 2MASS and WISE. Population statistics are needed to describe the different evolutionary timescales, and for these a large sample size is essential. The Gaia space telescope has detected 1.8 billion objects in its lifetime, a large number of which may be young stellar candidates," explained Dr Marc Audard, a researcher at the University of Geneva.

The NEMESIS project was launched in March 2021 and has been awarded over €1.6 million in funding for the next four years, of which €407,384 can be spent on research in Hungary. More information (in English) is available on the project website, which is currently being populated with content.