PIC’s data center is located in UAB’s Building D and has two different vaults in the same building with different characteristics and energy efficiency profiles:
PIC’s air-cooled room has a raised floor hosting 34 racks with 1,400 rack units for IT equipment, plus a StorageTek SL8500 tape library. PIC’s liquid-cooled room sits in a fireproof enclosure and is the only liquid-immersion data center in Spain.
PIC supports a variety of scientific projects deploying computing and storage resources for intensive processing and analysis of extremely large amounts of data.
PIC offers its users more than 7,000 computing slots on two different computing clusters, HTCondor and PBS. These resources are available through the Worldwide LHC Computing Grid (WLCG) as well as to local users.
PIC uses open-source software to manage more than 24 PB of data stored on magnetic tape, using different tape technologies: LTO4, LTO5, T10KC and T10KD. During 2017, LTO4 data were migrated to T10KD, a newer technology with 10 times the capacity per cartridge (8 TB).
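The practical effect of such a migration is a large reduction in cartridge count. A back-of-the-envelope sketch, assuming the native (uncompressed) capacities of 0.8 TB for LTO4 and 8 TB for T10KD and an illustrative data volume (the report does not state how much data was moved):

```python
# Hedged arithmetic sketch: cartridge counts before and after an
# LTO4 -> T10KD migration, using native (uncompressed) capacities.
import math

LTO4_TB = 0.8
T10KD_TB = 8.0

def cartridges_needed(data_tb, cartridge_tb):
    """Minimum number of cartridges needed to hold data_tb terabytes."""
    return math.ceil(data_tb / cartridge_tb)

# Illustrative volume only, not a figure from the report.
data_tb = 1000  # 1 PB
print(cartridges_needed(data_tb, LTO4_TB))   # cartridges on LTO4
print(cartridges_needed(data_tb, T10KD_TB))  # cartridges on T10KD
```

With these capacities, a petabyte that once occupied 1,250 LTO4 cartridges fits on 125 T10KD cartridges.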
The Worldwide LHC Computing Grid (WLCG) is a grid-based distributed computing infrastructure comprising resources from more than 170 centres in 34 countries. WLCG is used to analyse the unprecedented volumes of data, hundreds of petabytes (PB) per year, generated by the LHC.
The Spanish Tier-1 centre, located at PIC, provides services to three of the LHC experiments, accounting for ~5% of the total Tier-1 capacity for ATLAS, CMS, and LHCb. The Tier-1 project at PIC is embedded in a large R&D project aimed at federating the Tier-1 with the ATLAS and CMS Tier-2 resources from IFAE and CIEMAT, in line with the WLCG strategies towards the HL-LHC. This multi-site federation of high-performance data services presents many R&D challenges to be faced over the next three years. The project is led by Josep Flix as project coordinator (CIEMAT PI) and Andreu Pacheco Pages (IFAE PI), with José Hernández and Carles Acosta as CIEMAT and IFAE co-PIs.
At the end of 2017, the resources deployed by PIC for the Tier-1 amounted to about 68 kHS06 (corresponding to about 5,250 cores), around 6 PB of disk storage and about 19 PB of tape storage. The resources deployed for the Tier-1 typically represent ~85% of the total resources deployed at PIC.
2017 was a year with many R&D and operations activities at the PIC Tier-1: IPv6 addressing was enabled in production for many services (including disk pools and compute nodes); disk pool and tape recall configurations were tuned for better performance, with the help of dedicated CMS and ATLAS stress tests; around 50% of the Tier-1 CPU power now runs under HTCondor at PIC, accessible through two HTCondor-CEs, as HTCondor is being deployed as a replacement for the current batch system (Torque/Maui); tests were carried out for scheduling and running CMS jobs originally assigned to the PIC Tier-1 at the CIEMAT Tier-2 in Madrid, transparently for users and experiments; the new OS (CentOS 7) was tested and validated; CRIC development activities (the new Information System in WLCG) continued; and new CPU benchmarking R&D activities started.
As a complement to the Tier-1 facility, the PIC LHC group is also providing resources to the ATLAS Spanish Tier-2 infrastructure, specialized in Real and Simulated Data Analysis and Monte Carlo Simulation Production. Members of the group are Aresh Vedaee, Alex Guino and Andrés Pacheco Pagés.
In numbers, the contribution of PIC to the Spanish Tier-2 in 2017 was 9,238 HEP-SPEC06 (HS06) of CPU power and 795 TB of storage. The CPU contribution corresponds to 81 million HS06-normalized hours. The measured availability and reliability of the resources were 98.6% and 98.9%, respectively; the requirement for a Tier-2 is to exceed 90% in both values, which has been achieved.
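The two CPU figures above are consistent with each other: under the assumption that the installed capacity was available essentially the whole year, the quoted HS06-hours follow directly from the installed HS06 power.

```python
# Cross-check of the quoted CPU contribution: 9,238 HS06 of installed
# power running for a full year gives roughly 81 million HS06 hours.
installed_hs06 = 9_238
hours_per_year = 24 * 365  # 8,760 hours
hs06_hours = installed_hs06 * hours_per_year
print(f"{hs06_hours / 1e6:.1f} million HS06 hours")  # ~80.9 million
```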
Services offered include the main Data Center for the Major Atmospheric Gamma Imaging Cherenkov Telescopes (MAGIC), the development of simulations for the Euclid space mission, the integral treatment of data from the Physics of the Accelerating Universe (PAU), data quality analysis of the Dark Energy Survey (DES) project and support for the exploitation and distribution of the simulations of the Marenostrum Institut de Ciències de l’Espai (MICE) project.
PIC provides data transfer from the Roque de los Muchachos Observatory in La Palma, as well as computing, data management and analysis services. The Observations Database was upgraded in 2017.
PIC presented a MAGIC use case for the Helix Nebula Science Cloud project, hosted by CERN, in order to “cloudify” some analysis and reprocessing tools using defined templates. Preliminary tests with the project’s cloud providers were performed to evaluate the solutions and their adaptability to the MAGIC use case.
PIC provides grid services to CTA in order to test and implement DIRAC as middleware for the CTA Collaboration. This task is developed in collaboration with the French institute IN2P3. Following user requests, new measures to increase the redundancy and robustness of the DIRAC platform, in terms of database management, were implemented.
In order to expand the collaboration with IN2P3 and to provide new services for all CTA grid groups, a CTA use case was presented to the Helix Nebula Science Cloud. The use case defines some minor adaptations of DIRAC, using the DIRAC-VM plugin to “cloudify” the computing services.
In 2017, PIC was fully operative as the PAU Survey data center. During the 2017A and 2017B observation periods, data were automatically transferred from the WHT in La Palma to be stored at PIC. Analysis pipelines developed at PIC in collaboration with ICE were run several times for optimization. As part of the optimization process, we carried out tests to integrate the PAU pipelines with the Hadoop platform available at PIC. External projects using PAUCam started accessing PIC storage to retrieve their data. The total volume of PAU data (raw and reduced) stored on tape at PIC reached 34.7 TB at the end of the year.
During 2017, PIC continued the activities derived from its role as the Spanish Science Data Center (SDC-ES) and from its membership in the Organizational Unit for Simulations (OU-SIM), which is responsible for developing the Simulation Processing Function (SIM-PF).
One of the milestones of 2017 was the completion of Science Challenge 3 (SC3), which had very ambitious goals on both the SDC-ES and SIM-PF sides.
It was the first data production involving all the components of the SGS infrastructure: continuous deployment, the Euclid Archive System (EAS), the Infrastructure Abstraction Layer (IAL) and the HTC cluster. For this challenge, ~40k jobs were run at PIC, using ~67k hours of computation time and producing ~9 TB of data.
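The SC3 figures imply modest per-job averages. A simple sketch, assuming uniform jobs (an approximation; real job sizes vary):

```python
# Per-job averages implied by the SC3 figures quoted above:
# ~40k jobs, ~67k CPU hours, ~9 TB of output.
jobs = 40_000
cpu_hours = 67_000
output_tb = 9.0

hours_per_job = cpu_hours / jobs     # ~1.7 CPU hours per job
mb_per_job = output_tb * 1e6 / jobs  # ~225 MB of output per job
print(hours_per_job, mb_per_job)
```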
In June 2017 a major milestone was achieved within the Euclid project: the first release of the Euclid Flagship mock galaxy catalog, which is currently the largest simulated galaxy catalog ever created.
The catalog has been created entirely at PIC using a software pipeline written in Python, which runs exclusively on top of the PIC Big Data platform. The pipeline populates the dark matter halo catalogs provided by ETH-Zurich with galaxies using the Halo Occupation Distribution model (HOD).
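The HOD step can be illustrated with a minimal sketch. This assumes the widely used Zheng et al. (2005) parametrisation with made-up parameter values, not the actual Flagship pipeline or its parameters: centrals follow a smoothed step function in halo mass, and satellites a Poisson-sampled power law.

```python
# Illustrative HOD sketch (assumed parametrisation and parameter values,
# not the actual Flagship pipeline).
import math
import random

def mean_centrals(m_halo, log_m_min=12.0, sigma_logm=0.25):
    """Mean number of central galaxies in a halo of mass m_halo (Msun/h)."""
    return 0.5 * (1.0 + math.erf((math.log10(m_halo) - log_m_min) / sigma_logm))

def mean_satellites(m_halo, m_1=10**13.3, alpha=1.0):
    """Mean number of satellite galaxies (simple power law)."""
    return (m_halo / m_1) ** alpha

def sample_poisson(lam, rng):
    """Knuth's Poisson sampler (adequate for the small means used here)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= threshold:
            return k - 1

def populate(m_halo, rng):
    """Draw (centrals, satellites) for one halo: Bernoulli + Poisson."""
    n_cen = 1 if rng.random() < mean_centrals(m_halo) else 0
    # Satellites are typically only placed in haloes hosting a central.
    n_sat = sample_poisson(mean_satellites(m_halo), rng) if n_cen else 0
    return n_cen, n_sat

rng = random.Random(42)
for m in (1e11, 1e12, 1e13, 1e14):
    print(m, populate(m, rng))
```

Applied halo by halo over the input catalog, this kind of sampling turns a dark matter halo catalog into a mock galaxy catalog.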
The catalog is available for the entire Euclid collaboration through CosmoHub.
This web portal, dedicated to the validation and distribution of massive datasets, has taken advantage of the PIC Big Data platform upgrade. In the last year, nearly 200 new users have registered on the platform; most come from the Euclid collaboration, but some are simply members of the worldwide scientific community who want to access the public datasets.
During 2017, CosmoHub made available the first Gaia data release, which contains more than one billion entries, and also ingested the gold galaxy catalog of the first three years of observations of the Dark Energy Survey, which contains about 500 million objects.
IFAE participates in this project, co-funded by the European Commission Horizon 2020 Work Programme, whose goal is to develop solutions for setting up a Hybrid Cloud Platform.
During 2017, a prototype of Analysis as a Service was developed for MAGIC, which will undergo scaling tests in 2018: we plan to deploy it on the HNSciCloud Hybrid Cloud Platform and scale the tests up to hundreds of cores.