Large (Hadron Collider) and Big (Data Science)

Wednesday, Oct 12
Session Chair: Valerio Pascucci (U. Utah)

Federica Legger
National Institute for Nuclear Physics
Since the start of data taking at the Large Hadron Collider (LHC) at CERN in 2009, the four LHC experiments (ALICE, ATLAS, CMS and LHCb) have collected more than an exabyte of physics data. Storing and processing such a large amount of data requires a distributed computing infrastructure, the Worldwide LHC Computing Grid (WLCG), made up of almost 150 computing facilities spread across 42 countries. The current computing infrastructures are expected to grow by an order of magnitude in size and complexity for the HL-LHC (the high-luminosity upgrade of the LHC) era, from 2030 onward. In this talk, I will review the challenges of designing, deploying and operating a distributed and heterogeneous computing infrastructure composed of on-premises data centers, public and private clouds, and HPC centers. We will discover how machine learning and artificial intelligence techniques can be exploited to address such complex challenges, from data taking to data processing to data analysis in WLCG.
Bio. Dr. Federica Legger is an associate researcher at INFN (National Institute for Nuclear Physics). She studied Physics at the University of Turin in Italy, and graduated from EPFL (École Polytechnique Fédérale de Lausanne) in Switzerland with a thesis on the data acquisition electronics of the LHCb experiment at CERN. She currently participates in distributed computing activities for the CMS experiment at the LHC (Large Hadron Collider) and for the Virgo experiment at EGO (European Gravitational Observatory). She leads the Operational Intelligence initiative for WLCG (Worldwide LHC Computing Grid), a cross-experiment effort from the HEP (High Energy Physics) community that aims to reduce the operational cost of large scientific computing infrastructures through AI-powered automation. At the University of Turin, she lectures the graduate course Big Data and Machine Learning. Within CMS, she coordinates the Monitoring and Analytics working group, which is responsible for the management of the monitoring infrastructure, the integration of new data sources, and the coordination of analytics tasks. Previously, she held the same role for the ATLAS experiment. In ATLAS, she also held coordination roles in both distributed computing (Distributed Analysis coordinator) and the physics groups searching for supersymmetry.

Presentation slides

The Red Queen's Race: Software Development Down the Rabbit Hole

Wednesday, Oct 12
Session Chair: Michela Taufer (U. Tennessee Knoxville)

Tim Mattson
Lewis Carroll's books about Alice in Wonderland are a rich source of metaphors for thinking about the future. The Red Queen's race is one of my favorites. In this scenario, people run as fast as they can just to stay in place. Doesn't the life of a programmer sometimes feel like we're in such a race?

In this talk, we'll discuss the fast-moving world of computing and Intel's work to create abstractions that help programmers keep up. This is the essence of the collection of open standards we call oneAPI. The race, however, is far from over. We'll discuss the abstractions we'll need to keep in the race as it moves to distributed parallel computing over heterogeneous nodes. We'll close with a description of our early work on programming systems that will automate much of what a programmer does. This will help programmers focus on solving scientific problems instead of struggling to keep up with the Red Queen and her race.
Bio. Tim Mattson is a parallel programmer obsessed with every variety of science (Ph.D. Chemistry, UCSC, 1985). He is a senior principal engineer at Intel, where he’s worked since 1993 with brilliant people on great projects including: (1) the first TFLOP computer (ASCI Red), (2) MPI, OpenMP and OpenCL, (3) two different research processors (Intel's TFLOP chip and the 48-core SCC), (4) data management systems (Polystore systems and array-based storage engines), and (5) the GraphBLAS API for expressing graph algorithms as sparse linear algebra. Tim has over 150 publications including five books on different aspects of parallel computing, the latest (published November 2019) titled “The OpenMP Common Core: Making OpenMP Simple Again”.

Presentation slides

Translational Computer Science as a Paradigm Underpinning eScience

Thursday, Oct 13
Session Chair: Manish Parashar (U. Utah)

David Abramson
University of Queensland
Given the increasingly pervasive role and growing importance of computing and data in all aspects of science and society, fundamental advances in computer science and their translation to the real world have become essential. Consequently, there may be benefits to formalizing Translational Computer Science (TCS) to complement the traditional foundational and applied modes of computer science research, as has been done for translational medicine. TCS has the potential to accelerate the impact of computer science research overall. In this talk, I discuss the attributes of TCS and formally define it. I enumerate a number of roadblocks that have limited its adoption to date and sketch a path forward. Finally, I will provide some specific examples of translational research underpinning eScience projects and illustrate the advantages to both computer science and the application domains.
Bio. David is a Professor of Computer Science, and currently heads the University of Queensland Research Computing Centre. He has been involved in computer architecture and high performance computing research since 1979. He has held appointments at Griffith University, CSIRO, RMIT and Monash University. Prior to joining UQ, he was the Director of the Monash e-Education Centre, Science Director of the Monash e-Research Centre, and a Professor of Computer Science in the Faculty of Information Technology at Monash. From 2007 to 2011 he was an Australian Research Council Professorial Fellow. David has expertise in high performance computing, distributed and parallel computing, computer architecture and software engineering. He has produced in excess of 230 research publications, and some of his work has also been integrated in commercial products. One of these, Nimrod, has been used widely in research and academia globally, and is also available as a commercial product, called EnFuzion, from Axceleon. His world-leading work in parallel debugging is sold and marketed by Cray Inc, one of the world's leading supercomputing vendors, as a product called ccdb. David is a Fellow of the Association for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers (IEEE), the Australian Academy of Technology and Engineering (ATSE), and the Australian Computer Society (ACS).

An Equitable Future for Computational Research

Friday, Oct 14
Session Chair: Pania Newell (U. Utah)

SherAaron (Sher!) Hurt
The Carpentries
IEEE's core purpose is to foster technological innovation and excellence for the benefit of humanity. This purpose cannot be realized without diverse persons driving IEEE's mission and vision. Honoring IEEE's values of trust, growth and nurturing, global community building, partnership, service to humanity, and integrity in action, community members can work collaboratively to dismantle the broken power structures and resource distribution that negatively impact marginalized communities worldwide. Interdisciplinary research communities, developers, and users of eScience applications and enabling IT technologies have a responsibility to the public to create accessible tools and infrastructure, empowering diverse groups of people to innovate in all aspects of eScience and its associated technologies, applications, algorithms and tools. Democratizing science is the answer for an equitable future in data- and compute-intensive research. Dr. Hurt's keynote address will admonish attendees to apply their expertise in ways that advance equity and inclusion in science, technology, and computational research.
Bio. SherAaron (Sher!) Hurt currently serves as the Director of Workshops for The Carpentries, a non-profit project that teaches foundational coding and data science skills to researchers, technologists, and librarians globally. As Director, she provides oversight and management, planning, vision and leadership for the Workshop Administration Team. She oversees the logistics for over 600 international workshops hosted annually by The Carpentries and Community Members. She served as a member of the taskforce to develop online workshops during the global pandemic. In this capacity, she successfully transitioned the traditional two-day in-person workshops to an online format. Her passion lies in efficient workflows and streamlining the work for The Carpentries. She has led multiple projects that focused on automating workflows to reduce the amount of time spent completing daily tasks. Sher! earned her B.S. in Business Management at Michigan Technological University and her M.A. in Hospitality Management at Florida International University, and is currently completing a Doctorate of Business Administration with a concentration in Leadership and Logistics at Walden University. Sher! enjoys participating in all types of fitness activities! She is a cultural foodie and has made a hobby of traveling the globe.