Distributed heterogeneous computing infrastructure for NICA data processing

Laboratory of Information Technologies

Joint Laboratory Seminar

Date and Time: Tuesday, 24 December 2024, at 3:00 PM

Venue: Conference Hall, Meshcheryakov Laboratory of Information Technologies, and online via Webinar

Seminar topic: “Distributed heterogeneous computing infrastructure for NICA data processing”

Speaker: Igor Pelevanyuk

Abstract:

One of the key components in the implementation of the NICA Project, alongside the accelerator and the BM@N, MPD, and SPD detector facilities, is the computing infrastructure needed to process, analyse, store, and transfer large volumes of experimental data. Since 2019, a distributed heterogeneous computing infrastructure based on the DIRAC Interware has been developed at MLIT. It includes the Tier1 and Tier2 clusters, the Govorun Supercomputer, the NICA cluster, the DDC cluster, clouds of the JINR Member States, the UNAM cluster in Mexico, and the IMDT cluster in Mongolia. To improve operational efficiency, job and network monitoring was developed and deployed. A fundamentally new approach to task performance analysis has been developed that makes it possible to analyse hundreds of thousands of jobs and identify inefficiently operating resources. A methodology for modelling task behaviour in a distributed heterogeneous environment was also developed; it makes it possible to forecast how large batches of tasks will execute. The resulting infrastructure and tools are used to run jobs of the MPD, BM@N, and SPD experiments. In total, more than 3.5 million jobs have been completed, with an average execution time of 8 hours.

(Based on the PhD thesis).