PIC: Port d'Informació Científica

Gonzalo Merino


Port d’Informació Científica (PIC) is a scientific-technological center, operated through a collaboration agreement between CIEMAT and IFAE, that specializes in developing advanced tools and methods for scientific data analysis. We participate in research projects from multiple disciplines with challenging data environments, including particle physics, astrophysics, biology, materials science and others. Our main objective is to accelerate research by making data analysis more effective through the use of Machine Learning, High-Throughput Computing and Big Data techniques on distributed computing infrastructures that support collaborative research.

HIGHLIGHT: PIC Expands WLCG Capabilities to the Barcelona Supercomputing Centre

Access to existing compute resources beyond those pledged by the Worldwide LHC Computing Grid (WLCG) sites has been explored as a way to increase the capacity available for LHC computing. The PIC team led the negotiations with the Barcelona Supercomputing Centre (BSC) for the recognition of LHC computing as a strategic project. As a result of the agreement, preferential access to a fraction of the CPU resources at the BSC has been granted since 2020. Since then, the research team at PIC has invested significant effort to integrate and exploit these resources. The BSC compute nodes lack the outbound network connectivity needed to interface them with the workload management systems of the LHC experiments, which required R&D work to circumvent this limitation. New services were developed and deployed at PIC to interface the BSC with WLCG and to handle job submission and data flow. Monte Carlo simulation workflows from the ATLAS, CMS and LHCb experiments are now routinely executed at BSC. Since 2020, a total of 125M hours have been allocated and consumed at BSC; in 2023 alone, 38.5M hours were consumed for LHC activities through the gateway services at PIC.
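
As an illustration of this gateway pattern, the sketch below drives job submission and data movement entirely from a service that can reach both the WLCG and the HPC login node, so that the compute nodes themselves never need outbound connectivity. The host names, paths, use of scp/SSH and the SLURM submission are assumptions for the example, not a description of the actual PIC services.

```python
# Hypothetical sketch of a "gateway" workflow for compute nodes without
# outbound connectivity: all data movement and job submission is driven
# from a service at PIC that can reach both WLCG storage and the HPC
# login node. Host names, paths and the SLURM usage are illustrative only.
import subprocess

BSC_LOGIN = "gateway.example.org"          # hypothetical login/transfer node
REMOTE_SCRATCH = "/scratch/lhc/jobs"       # hypothetical shared filesystem path

def stage_in(local_files, job_id):
    """Copy input files from PIC storage to the HPC shared filesystem."""
    dest = f"{BSC_LOGIN}:{REMOTE_SCRATCH}/{job_id}/"
    subprocess.run(["scp", *local_files, dest], check=True)

def submit(job_id, batch_script):
    """Submit the job to the remote batch system over SSH; the compute
    nodes themselves never open outbound connections."""
    remote_script = f"{REMOTE_SCRATCH}/{job_id}/{batch_script}"
    out = subprocess.run(
        ["ssh", BSC_LOGIN, "sbatch", "--parsable", remote_script],
        check=True, capture_output=True, text=True)
    return out.stdout.strip()              # SLURM job id

def stage_out(job_id, local_dir):
    """Retrieve job outputs back to PIC for injection into WLCG storage."""
    src = f"{BSC_LOGIN}:{REMOTE_SCRATCH}/{job_id}/output/"
    subprocess.run(["scp", "-r", src, local_dir], check=True)
```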


Data center infrastructure

During 2023, PIC experienced a significant increase in its resources. The IBM TS4500 tape library was expanded by 20 PB of capacity, which is already allocated to the experiments we support. Additionally, the disk capacity available for projects was increased by 7 PB, distributed across different platforms such as Ceph and dCache. Computing resources have grown proportionally to this expansion, with the addition of 2,500 computing slots to the existing cluster. Also worth noting are the launch of a Kubernetes cluster, where various applications have been deployed to manage data transfers, and an oVirt platform designed to host approximately 200 virtual machines. At the infrastructure level, there has been a significant improvement with the renovation of the fire detection and suppression system; the current system is based on a highly sensitive aspirating smoke detection system and suppression using Novec 1230.


Data Management and Analysis Services

Hadoop

In 2023, the PIC Big Data Common Service deployed a new and much improved platform for its storage and computing requirements. This platform consists of a custom, in-house developed Hadoop distribution that runs on top of a brand-new hardware cluster. The Hadoop distribution, code-named Shepherd, has been developed to avoid vendor lock-in and to allow its components to be migrated to more recent versions; it uses Docker along with GitLab CI/CD to simplify and automate both testing and deployment. The new cluster is composed of 20 nodes that collectively provide 480 processing cores and 2 PB of net storage. An expansion with 10 additional nodes is planned for 2024. Both CosmoHub and the production of mock galaxy catalogs for Euclid (among other surveys) were migrated to this new platform in February 2024.
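
As an example of the kind of automated check such a CI/CD pipeline can run against a freshly deployed cluster, the sketch below performs a simple HDFS round-trip test. The paths and the use of the hdfs command-line tool are assumptions for the example; the actual Shepherd test suite may differ.

```python
# Minimal sketch of a deployment smoke test: write a small file to HDFS,
# read it back and compare. Paths and the use of the `hdfs` CLI are
# assumptions for the example, not the actual Shepherd test code.
import subprocess
import uuid

def run(*cmd):
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

def hdfs_roundtrip_test(tmp_dir="/tmp/ci-smoke"):
    payload = f"shepherd-smoke-{uuid.uuid4()}"
    local = f"/tmp/{payload}.txt"
    remote = f"{tmp_dir}/{payload}.txt"
    with open(local, "w") as f:
        f.write(payload)
    run("hdfs", "dfs", "-mkdir", "-p", tmp_dir)      # ensure the test dir exists
    run("hdfs", "dfs", "-put", "-f", local, remote)  # write to HDFS
    assert run("hdfs", "dfs", "-cat", remote).strip() == payload
    run("hdfs", "dfs", "-rm", "-skipTrash", remote)  # clean up

if __name__ == "__main__":
    hdfs_roundtrip_test()
    print("HDFS round-trip OK")
```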

CosmoHub

CosmoHub also saw several improvements in 2023, most of them on the backend side, such as the addition of Parquet as a download format for custom catalogs. We have also developed and integrated two new sets of user-defined functions (UDFs): one implementing aggregation over array columns, such as spectra or probability density functions, and another implementing operations on spherical geometries, as part of the ongoing effort to implement the ADQL standard. PIC participated in the meeting of the Red de Infraestructuras de Astronomía to present CosmoHub as a potential system to provide data hosting services for this community.
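
To illustrate what aggregation over array columns means, the following toy example (with made-up column names, and using plain pandas/NumPy rather than the production Hive UDFs) computes an element-wise mean spectrum per group:

```python
# Illustration (not the production UDF code) of aggregation over array
# columns: element-wise averaging of per-object spectra within each group,
# e.g. to obtain a mean spectrum per redshift bin.
import numpy as np
import pandas as pd

catalog = pd.DataFrame({
    "z_bin": [0, 0, 1, 1],
    "spectrum": [np.array([1.0, 2.0, 3.0]),
                 np.array([3.0, 2.0, 1.0]),
                 np.array([0.5, 0.5, 0.5]),
                 np.array([1.5, 1.5, 1.5])],
})

# Element-wise mean of the array column within each group.
mean_spectra = (catalog.groupby("z_bin")["spectrum"]
                       .apply(lambda arrays: np.mean(np.vstack(list(arrays)), axis=0)))
print(mean_spectra)
# z_bin
# 0    [2.0, 2.0, 2.0]
# 1    [1.0, 1.0, 1.0]
```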

Rucio

We have successfully reimplemented the deployment of the Rucio data management software using Helm, a package manager for Kubernetes. With a single YAML configuration file, we can fully deploy Rucio, including the server, daemons, PostgreSQL, and the database schema. During 2023, we deployed a Rucio instance to automate file transfers for the MAGIC telescopes, and we are planning to expand its use to several other experiments.
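
Once such an instance is in place, transfers can be automated from the Rucio Python client. The sketch below registers already-catalogued files into a dataset and creates a replication rule; the scope, dataset and RSE names are hypothetical and only illustrate the general workflow.

```python
# Minimal sketch of automating a transfer with the Rucio Python client:
# group files into a dataset and create a replication rule so Rucio
# schedules and monitors the transfers to the destination storage.
# Scope, dataset and RSE names below are hypothetical.
from rucio.client import Client

client = Client()

scope, dataset = "magic", "raw_2023_11"
client.add_dataset(scope=scope, name=dataset)

# Attach already-registered files (DIDs) to the dataset.
files = [{"scope": scope, "name": f"run_{i:06d}.root"} for i in range(3)]
client.attach_dids(scope=scope, name=dataset, dids=files)

# A single rule is enough: Rucio takes care of the transfers needed to
# keep one copy of the dataset at the destination RSE.
client.add_replication_rule(
    dids=[{"scope": scope, "name": dataset}],
    copies=1,
    rse_expression="PIC_TAPE",   # hypothetical destination RSE
)
```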

Particle Physics - The LHC

At the end of 2023, the resources deployed by PIC for LHC computing were ~115 kHS06 (corresponding to about 9,000 CPU cores), ~11.5 PB of disk storage, and about 27 PB of tape storage. One of the main characteristics of Tier-1 centres, beyond their very large storage and computing capacity, is the ability to provide these resources through services that need to be extremely reliable; hence, the critical services at a Tier-1 operate in 24x7 mode. The PIC Tier-1 was at the top of the stability and reliability rankings in WLCG for 2023.

In addition to contributing computing resources to WLCG, the team has also been actively involved in the R&D activities of the LHC experiments, which are necessary for the evolution of the infrastructure to cope with the ever-increasing scale and complexity of the LHC scientific program and to prepare for the HL-LHC phase; this work resulted in significant contributions to conferences and computing-specific publications. In particular, the group has been actively involved in integrating HPC resources, testing new services to deploy an Analysis Facility, and studying the benefits of including data caches in WLCG.

The computing centres within the WLCG are anticipated to handle wide area network (WAN) throughputs of tens of terabits per second during the HL-LHC era (2029+), prompting significant upgrades to the WAN infrastructure at major centres like the Spanish LHC Tier-1 at PIC. Initial evaluations indicate that PIC would need network upgrades in 2026 and 2029, with target speeds of 300 Gbps and 600 Gbps, respectively. This underscores the critical need for network upgrades at PIC and at the national network providers to ensure seamless, enhanced connectivity for scientific research. These needs have been communicated to both CSUC and RedIRIS.
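
For scale, a quick back-of-the-envelope conversion shows what these target link speeds would mean in sustained daily data volume (assuming fully saturated links, which real traffic averages will not reach):

```python
# Back-of-the-envelope conversion of the WAN targets into daily data
# volumes, assuming 100% sustained utilisation of the link.
def gbps_to_pb_per_day(gbps: float) -> float:
    bits_per_day = gbps * 1e9 * 86400
    return bits_per_day / 8 / 1e15        # bits -> bytes -> petabytes

for target in (300, 600):
    print(f"{target} Gbps ~ {gbps_to_pb_per_day(target):.1f} PB/day sustained")
# 300 Gbps ~ 3.2 PB/day sustained
# 600 Gbps ~ 6.5 PB/day sustained
```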

Cosmology

Virtual galaxy catalogs - Throughout 2023, the Scientific Pipeline at PIC (SciPIC), dedicated to efficiently generating massive virtual galaxy catalogs, has undergone several improvements. Several releases have been deployed for use by the Euclid Organization Unit Simulation Data (OU-SIM) to generate simulated images for Science Performance Verification 3 (SPV3). SPV3 holds particular significance for the Euclid mission, as it aims to verify the expected performance of the mission and provides valuable insights into its critical aspects.
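
As a purely illustrative sketch of the kind of transformation such a pipeline performs, the toy example below maps dark-matter halo masses to galaxy magnitudes with an invented relation; the real SciPIC recipes are calibrated and run distributed on the Hadoop platform described above.

```python
# Purely illustrative sketch of one conceptual step in producing a virtual
# galaxy catalogue: mapping dark-matter halo masses to galaxy magnitudes.
# The functional form and parameters are made up for the example.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Toy halo catalogue: masses in solar masses.
halos = pd.DataFrame({"halo_mass": 10 ** rng.uniform(11, 14, size=1_000)})

def absolute_magnitude(halo_mass, scatter=0.3):
    """Toy mass-to-magnitude relation with Gaussian scatter."""
    mag = -2.5 * (np.log10(halo_mass) - 12.0) - 20.0
    return mag + rng.normal(0.0, scatter, size=len(halo_mass))

halos["abs_mag_r"] = absolute_magnitude(halos["halo_mass"])
print(halos.describe())
```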

In collaboration with the production company “cacaocinema”, we presented an outreach video during the internal Euclid annual meeting in Copenhagen. The video was later made public through the European Space Agency’s website and YouTube in August, garnering over 34K views.


Materials Science

InCAEM

The goal of this project is to design, install, commission and define the exploitation strategy of an infrastructure for the correlative analysis of advanced materials for energy applications. The project is the Catalan branch of the Advanced Materials coordinated project within the Planes Complementarios, funded by the European Union – NextGenerationEU in the context of the Recovery, Transformation and Resilience Plan (PRTR) and by the Regional Government of Catalonia. During 2023, PIC continued working in collaboration with ALBA, ICN2 and ICMAB, collecting requirements and prototyping the offline analysis and data preservation computing facility for the future research infrastructure, which will be located at the ALBA synchrotron facility.