Grad. Seminar (2/2): Henri Casanova, “Simulating HPC Systems and Applications”

Thursday, February 2nd, 4:30-5:30pm in POST 126

Speaker: Henri Casanova

Title: “Simulating HPC Systems and Applications”

Abstract: There is a well-identified problem of (lack of) reproducibility in experimental Computer Science research, and in particular in High Performance Computing (HPC) research. Running experiments in simulation is one way to lower barriers to reproducibility, and it is used effectively in several areas of Computer Science. Its use in HPC research has gained some traction, but comparatively it is still in its infancy. Part of the reason is that it is challenging to develop simulation models that are sufficiently accurate to produce meaningful results and also sufficiently scalable to handle the scale of HPC systems. In this presentation we will discuss several advances in the development of such simulation models. These models are implemented as part of the open-source SimGrid simulation framework. We will give an overview of SimGrid and describe its most salient features and capabilities. We will conclude by discussing ways in which SimGrid can be used not only as a research tool, but also as a tool for debugging, for teaching, and for enabling online decision making.

Prof. Casanova receives NSF grant: “WRENCH: A Simulation Workbench for Scientific Workflow for Users, Developers, and Researchers”

Professor Henri Casanova was awarded a National Science Foundation grant for the project “WRENCH: A Simulation Workbench for Scientific Workflow for Users, Developers, and Researchers”.  This project received $499,000.00 in funding.

In partnership with Dr. Rafael Ferreira da Silva at the Information Sciences Institute at the University of Southern California, this project will develop a framework for the study of scientific workflow applications. See the abstract below for more details.

Scientific workflows have become mainstream for conducting large-scale scientific research. As a result, many workflow applications and Workflow Management Systems (WMSs) have been developed as part of the cyberinfrastructure to allow scientists to execute their applications seamlessly on a range of distributed platforms. In spite of many success stories, building large-scale workflows and orchestrating their executions efficiently (in terms of performance, reliability, and cost) remains a challenge given the complexity of the workflows themselves and the complexity of the underlying execution platforms. A fundamental and necessary next step is the establishment of a solid “experimental science” approach for future workflow technology development. Such an approach is useful for scientists who need to design workflows and pick execution platforms, for WMS developers who need to compare alternate design and implementation options, and for researchers who need to develop novel decision-making algorithms to be implemented as part of WMSs. The broad objective of this work is to provide foundational software, the Workflow Simulation Workbench (WRENCH), upon which to develop the above experimental science approach. Capitalizing on recent advances in distributed application and platform simulation technology, WRENCH makes it possible to (i) quickly prototype workflows, WMS implementations, and decision-making algorithms; and (ii) evaluate and compare alternative options scalably and accurately for arbitrary, and often hypothetical, experimental scenarios. This project will define a generic and foundational software architecture that is informed by current state-of-the-art WMS designs and planned future designs. The implementations of the components in this architecture, taken together, form a generic “scientific instrument” that can be used by workflow users, developers, and researchers.
This scientific instrument will be instantiated for several real-world WMSs and used for a range of real-world workflow applications. In a particular case study, it will be used with a popular WMS (Pegasus) to revisit published results and scheduling algorithms in the area of workflow planning optimizations. The objective is to demonstrate the benefit of using an experimental science approach for WMS research. Another impact of this project is that it makes it possible to include scientific workflow content pervasively in undergraduate and graduate computer science curricula, even for students without any access to computing infrastructure, by defining meaningful pedagogic activities that only require a computer and the WRENCH software stack. This educational impact will be demonstrated in the classroom in both undergraduate and graduate courses at our institutions.