CRIC Information System to integrate computing centres for processing SPD NICA data

News, 05 September 2024

Researchers from the Budker Institute of Nuclear Physics of the Siberian Branch of the Russian Academy of Sciences (INP SB RAS, Novosibirsk), with the participation of JINR employees, have developed the CRIC (Computing Resource Information Catalog) Information System, one of the key components for building a distributed system for processing experimental data from the SPD Experiment at the NICA Accelerator Complex. Since 2024, the INP SB RAS has been an active participant in the SPD Collaboration, developing the IT infrastructure of the experiment, taking part in the physics programme, and engineering elements of the SPD Facility. The information system makes it possible to integrate computing centres for processing large volumes of experimental data.

The INP SB RAS researchers started developing the CRIC Information System in 2016 for the experiments of the Large Hadron Collider (LHC) at CERN. CRIC is based on the previous system operating at CERN, AGIS (ATLAS Grid Information System), which was also created by the INP SB RAS team. Employees of the Laboratory of Information Technologies at JINR participated in engineering some parts of the AGIS System and refining its interfaces as part of joint work for the ATLAS Experiment.

“In 2010-2011, we were developing and partially deploying AGIS. We were faced with the task of creating an information system for the computing infrastructure of the ATLAS Experiment’s distributed computing network. We solved it and immediately began to introduce the system into production at CERN step by step. Step-by-step implementation is very convenient, because it allows the experiment to adapt smoothly to the new solutions, gradually connecting software services and new users,” explained Alexey Anisenkov, an INP SB RAS researcher and CRIC Project Coordinator. “By 2011-2012, the system was in full operation. It helped solve real tasks and configure and organize the distributed computing environment. Even then, it made it possible to determine which resources were working and which were temporarily disabled, and thus to configure a large grid infrastructure of hundreds of computing clusters effectively, ultimately ensuring the operability of the entire experimental data processing system.”

The successful use of the AGIS System in the ATLAS Experiment led the leadership of CERN and the INP SB RAS to decide to create an expanded version of it, CRIC, for other experiments. The LHC experiments began gradually switching to it in 2020. The distributed data processing environment of the LHC experiments includes over 170 large computing centres around the world. CRIC helped configure and coordinate a distributed infrastructure for storing and processing one exabyte of ATLAS data, obtained over 15 years of operation of the experiment.

In 2019-2020, the idea emerged of applying the CRIC System to the physics data processing systems of the NICA Accelerator Complex. In 2022, work began on implementing the system as part of the creation of a distributed data processing system for the planned SPD Experiment.

Figure: SPD Detector diagram

“The number of recorded SPD Detector events is measured in tens of thousands per second. This places quite high demands on the data processing system and IT infrastructure,” said Danila Oleinik, an MLIT JINR senior researcher, Deputy Computing and Software Coordinator of the SPD Experiment, and Candidate of Technical Sciences.

He noted that since the INP SB RAS joined the SPD Collaboration in 2024, the institute’s specialists have been actively involved not only in creating the experimental facility, but also in developing tools for data processing. “Alexey Anisenkov, a researcher and the lead developer of the CRIC System, supervises its development and ongoing maintenance in the Software & Computing Project of SPD in accordance with the needs of our experiment,” Danila Oleinik stressed.

The CRIC System will make it possible to configure the components of the experimental data processing system and will provide a description of the topology of the distributed computing infrastructure. The information system serves as a link between services and infrastructure. In addition, it provides certain information to monitoring and resource accounting systems.
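The role of such a catalog can be illustrated with a minimal Python sketch. It is not CRIC's actual data model or API: the classes `Site` and `ComputeQueue`, the `Status` values, the function `active_queues`, and all site, queue, and endpoint names are hypothetical and chosen only to show how a topology description lets services find out which resources are currently usable.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """Hypothetical resource states; a real catalog may define more."""
    ACTIVE = "active"
    DISABLED = "disabled"


@dataclass
class ComputeQueue:
    name: str
    status: Status = Status.ACTIVE


@dataclass
class Site:
    """One computing centre with its compute queues and storage endpoints."""
    name: str
    queues: list[ComputeQueue] = field(default_factory=list)
    storage_endpoints: list[str] = field(default_factory=list)


def active_queues(sites):
    """Return (site name, queue name) pairs for queues that are active."""
    return [
        (site.name, queue.name)
        for site in sites
        for queue in site.queues
        if queue.status is Status.ACTIVE
    ]


# Made-up topology used only for illustration.
topology = [
    Site(
        "SITE_A",
        queues=[
            ComputeQueue("SITE_A_batch"),
            ComputeQueue("SITE_A_test", Status.DISABLED),
        ],
        storage_endpoints=["root://storage.site-a.example"],
    ),
    Site("SITE_B", queues=[ComputeQueue("SITE_B_batch")]),
]

print(active_queues(topology))
```

In this sketch a workload or data management service would query the catalog instead of keeping its own list of sites, so that disabling a queue in one place is seen by every consumer.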

“The expected volume of experimental data is measured in tens of petabytes per year, which puts this experiment on a par with experiments at the LHC. Naturally, the processing of experimental data, including modelling of physical events, is carried out in a geographically distributed computing environment, with CRIC as one of its key systems,” Danila Oleinik commented.
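To give a sense of scale, the short Python calculation below converts an annual data volume into an average sustained rate. The volumes of 10, 20, and 50 petabytes are illustrative values for "tens of petabytes", not figures from the experiment, and the calculation assumes decimal petabytes and data spread evenly over a full year, which real data taking does not do.

```python
PETABYTE = 10**15  # bytes, decimal convention (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_mb_per_s(petabytes_per_year):
    """Average rate in MB/s if the yearly volume were spread evenly over a year."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / 10**6


# Illustrative volumes only; the article says just "tens of petabytes per year".
for volume in (10, 20, 50):
    print(f"{volume} PB/year is about {average_rate_mb_per_s(volume):.0f} MB/s on average")
```

Under these assumptions, 10 petabytes per year corresponds to roughly 317 MB/s sustained around the clock, before any replication or reprocessing.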

The scientist clarified that CRIC is one component of the integrated distributed data processing system of the SPD Experiment. For their part, MLIT JINR specialists are making a major contribution to the refinement and commissioning of the PanDA Workload Management System and the Rucio Data Management System for the needs of SPD. The Konstantinov Petersburg Nuclear Physics Institute of the Kurchatov Institute National Research Centre (Gatchina) actively participates in the development of the geographically distributed data processing infrastructure.

Danila Oleinik explained that organizing distributed data processing for modern experiments involves a variety of specialised software systems, with almost every product being individually refined and adapted to the tasks of a specific experiment. “Drawing on the experience, knowledge, and expertise of the MLIT team, we can not only use existing solutions, but also develop next-generation systems,” the scientist noted.