Bridging the last mile - A platform for data management and analytics in campuses and research labs
Abstract
Over the past years, Compute Canada and its regional consortia have successfully interconnected HPC sites through the fast networks provided by Canarie and the regional RANs. While this connectivity is useful for moving and synchronizing data between sites, it does not help when data is generated away from those islands of fast connectivity. Researchers who generate large quantities of such data must also set up workflows and infrastructure to stage, prepare and transfer the data to a Compute Canada site for further processing.
This talk will present the Data Portal developed at Université Laval and deployed in large research labs and on campus-wide installations. The portal facilitates access to central resources (both storage and computing) and enables local analysis and ingestion as close as possible to the data source. Our Data Portal is an appliance running on dedicated hardware and managed remotely by the local advanced research computing team; from a topology point of view, it is fully integrated into the HPC infrastructure. It aims to be a generic scientific portal for interacting with data stored both locally and at remote sites, and for launching analyses that use local resources or scale out on Compute Canada’s systems.
We will highlight how local and remote data are made available, describe the tools deployed on the portal for data management, and end the talk with a live demonstration of a Spark Big Data analysis launched interactively through the “Data Portal” web interface.
Authors
Frédérick Lefebvre (Calcul Québec - Université Laval)
Topic Areas
Advanced Research Computing (ARC): Research data management: Challenges, opportunities and
Advanced Research Computing (ARC): Innovations in platform / portal tools & software devel
Advanced Research Computing (ARC): Innovations in computational research (i.e. software, s
Session
HPC2.1.2 » Research Data Management II (11:15 - Tuesday, 21st June, CCIS 1-160, room sponsored by Obsidian)
Presentation Files
The presenter has not uploaded any presentation files.