GoingArm

An HPC-oriented workshop for sharing experiences and knowledge

Next GoingArm workshop to be held in conjunction with Arm Research Summit, stay tuned!!

Get connected via Arm HPC users group

GoingArm Schedule @ Arm Research Summit - 2017

Tuesday, 12 September
(Plenary Room, Crausaz Wordsworth Building)

09:00-09:05 Organiser’s introduction
09:05-09:20 Arm-supported HPC tools

Presented by: Geraint North (Arm Ltd.)
Abstract: The Arm HPC ecosystem is growing quickly. HPC centres around the world are moving from small-scale proof-of-concept boxes to much larger systems. One of the strengths of Arm is in having multiple partners involved. This means there are micro-architectures from different companies targeted at potentially different workloads. In such an environment, it is important that end-users have a consistent software story as they potentially test different systems. To ensure that high quality tools are being developed, Arm has a suite of software to help users get the best performance out of their chosen systems, both today and as they prepare for future Arm SVE architectures. In this presentation, we will explain these tools, including compilers and libraries and how the recent acquisition of Allinea has expanded our own offerings.

About the Speaker: Geraint North is Arm's Distinguished Engineer for Server and HPC Tools. He joined Arm in 2014 as a founding member of Arm's Manchester Design Centre, which performs HPC-specific activity on compilers, libraries and tools, including support for Arm's recently announced HPC-focused Scalable Vector Extension (SVE).
09:20-09:55 Sandia’s Arm-centric Co-design Strategy

Presented by: Jim Ang (Sandia Labs)
About the Speaker: Dr. James Ang is the Technical Manager of the Exascale Computing Program department at Sandia National Laboratories in Albuquerque, New Mexico. Dr. Ang supports the Department of Energy's Exascale Computing Initiative with a focus on the research and development of hardware technologies for future exascale component, node and system architectures. These exascale hardware technologies are targeted for leading edge high performance scientific as well as data analytic applications. Areas of active research include: HPC system architectures, System-on-Chip processor designs, advanced memory subsystems, interconnection networks, large-scale system resilience, power monitoring and control, application performance analysis, the development and use of HPC architectural system simulators, proxy applications and advanced architecture test beds for HPC co-design. Dr. Ang helped found two strategic Sandia collaborations in advanced computer architectures; the DOE/SC Computer Architecture Lab (CAL) for design space exploration, co-led by SNL and LBNL, and the NNSA/ASC Alliance for Computing at Extreme Scale (ACES), a partnership between SNL and LANL to collaborate on the Cielo (2011), Trinity (2015) and Crossroads (~2020) Advanced Technology Systems. Dr. Ang has also supported the NNSA/ASC program office through the development of the ASC Computing Strategy document, ASC Co-design Strategy document, and efforts to establish R&D collaborations between DOE and Micron Technology.
09:55-10:15 Arm Server Standardization

Presented by: Jon Masters (Red Hat)
About the Speaker: Jon Masters is Chief Arm Architect at Red Hat, where he is also Technical Lead for Red Hat Enterprise Linux Server for Arm. Jon is a Computer Architect who has led the computing industry in a variety of initiatives connected with the 64-bit Armv8 Architecture (AArch64), including architecture and platform standardization for enterprise and cloud-focused uses. He has worked for several well-known Enterprise and Embedded Linux software companies, co-founded industry collaborations (such as the Linaro Enterprise Group), and has authored several well-known programming titles on Linux technology. Jon has presented at numerous conferences on Arm, as well as on the merits of emerging Hyperscale Computing Technologies and their disruptive impact upon the traditional datacenter.
10:15-10:40 An Early Access Program for ARM in HPC

Presented by: Andy Warner (HPE)
Abstract: This talk will introduce the HPE early access program, which provides very early access to Cavium ThunderX2 systems for key technology partners and selected customers.

About the Speaker: Andy Warner is a Distinguished Technologist in the Advanced Technology Group within HPE. Prior to SGI's acquisition by HPE, he was Chief Engineer responsible for software and networking at SGI. He has 30 years’ experience in HPC and related technical fields.
10:45-11:15 Break
11:15-11:30 Allinea - Arm tools

Presented by: Olly Perks (Arm Ltd.)
Abstract: An introduction to the Allinea tools, detailing their usage and capabilities. This will cover some of the new extensibility features, including custom metrics and JSON export. The talk will also cover the use of the tools within the Arm ecosystem, discussing the challenges of porting applications to Arm and the use of Allinea tools to assist in the validation and performance optimisation process.

About the Speaker: Oliver is a software engineer at Arm within the HPC tools group specialising in application profiling with the Allinea MAP tool, primarily working on European Horizon2020 projects. He obtained his PhD in analysing memory usage of HPC applications at scale, from the University of Warwick, and has since worked in industry, optimising large scale scientific codes, before joining Arm.
11:30-12:05 Barcelona Supercomputing Center (BSC) - Butterfly effects of porting scientific applications to Arm-based platforms

Presented by: Filippo Mantovani (BSC)
Abstract: Since 2011, the EU Mont-Blanc project has pushed the development of Arm-based compute platforms, following the vision of leveraging the fast-growing mobile technology market for scientific computation. The process, which started almost 5 years ago with the development of prototypes based on Android dev-kits, is now evolving beyond the research project towards commercial computational platforms based not only on mobile SoCs, but also on server and HPC technology. In this talk we will introduce the experience gained porting system software, tools, and scientific applications to prototypes based on Arm technology within the Mont-Blanc project. Using several application examples, we will present behaviours observed on small prototype platforms. We will describe how current limitations can represent fundamental problems that programmers and architects will face when developing and programming future HPC systems. Techniques to overcome these limitations will also be presented. The goal of the talk is to give a panoramic view of Arm-based scientific computing from the Mont-Blanc perspective, supported by experience, lessons learned and test results.

About the Speaker: Filippo Mantovani is a postdoctoral research associate of the Mobile and embedded-based HPC group at the Barcelona Supercomputing Center (BSC). He graduated in mathematics and holds a PhD in Computer Science from the University of Ferrara, Italy. He has been a scientific associate at the DESY laboratory in Zeuthen, Germany, and at the University of Regensburg, Germany. He spent most of his scientific career in computational physics, computer architecture and high-performance computing, contributing to the Janus, QPACE and QPACE2 projects. He joined BSC’s Mont-Blanc project in 2013, becoming in 2014 principal investigator of the project.
12:05-12:20 Toward Building up ARM HPC Ecosystem

Presented by: Shinji Sumimoto (Fujitsu)
Abstract: This talk presents Fujitsu's activities for building up the ARM HPC ecosystem. Fujitsu has been developing HPC-related systems for over 40 years, from hardware to system software and applications. The talk gives an overview of these activities and of how the ARM HPC ecosystem is being built up.

About the Speaker: Shinji Sumimoto is a Senior Architect in the Software Development division at Fujitsu. He is in charge of technical development for the Post-K computer and is an HPC system software specialist, focusing on research and development of ultra-large-scale high-performance communication libraries and cluster file systems. He is also working on ARM HPC ecosystem development. He has been developing HPC-related system software for over 20 years on UNIX and Linux cluster systems, including distributed parallel systems, distributed file systems, and kernel- and user-level high-performance communication software.
12:20-12:55 Accelerating Genomic Sequence Alignment Workloads with ARM’s Scalable Vector Architecture

Presented by: Trevor Mudge (University of Michigan)
Abstract: Recent trends in high-performance computing have resulted in the continued mapping of big-data applications with an increasingly wide range of data-access and control regularity to parallel architectures. This variety in workloads motivates the flexibility of target platforms. In this workshop we will discuss our efforts to port several applications to Arm’s SVE, including genomics workloads (which focus on dense vector computations) and graph analytic kernels (which focus on sparse matrix computations), and contrast the computational needs. We explore the advantages and shortcomings over other platforms and suggest future developments to microarchitectures employing these extensions.

About the Speaker: Trevor Mudge received the Ph.D. degree in Computer Science from the University of Illinois, Urbana in 1977. Since then he has been on the faculty of the University of Michigan, Ann Arbor. In 2003 he was named the first Bredt Family Professor of Electrical Engineering and Computer Science. Previously he served a ten-year term as the Director of the Advanced Computer Architecture Laboratory, which is a group of eight faculty and about 60 graduate students. He is author of numerous papers on computer architecture, programming languages, VLSI design, and computer vision. He has also chaired 49 theses in these areas. His research interests include computer architecture, computer-aided design, and compilers.
13:00-14:15 Lunch
14:15-14:30 Simplify migration to ARMv8 and other environments through I/O profiling

Presented by: Rosemary Francis (Ellexus Ltd.)
Abstract: Bad I/O patterns can harm shared storage and limit application performance. On top of this, moving to a new platform can expose new I/O problems that could bring the system to a standstill. Dr Rosemary Francis will give an overview of the tools available to profile and benchmark I/O, including case studies and a focus on building or porting to an ARM cluster.

About the Speaker: Dr. Rosemary Francis is the founder and CEO of Ellexus, the I/O profiling company. Following her PhD and career in the chip design industry, Rosemary founded Ellexus to help people take control of the way they access data. Today Ellexus provides unique profiling and monitoring tools to help high performance computing organisations around the world to manage their compute clusters. Not only can Ellexus' storage-agnostic tools be run on a live compute cluster to protect you from rogue jobs and noisy neighbours, they make it easy to migrate, optimise and benchmark applications in new environments.
14:30-15:05 FLAGSHIP 2020 Project: Development of “Post-K” and Arm SVE

Presented by: Mitsuhisa Sato (RIKEN)
Abstract: We are carrying out the FLAGSHIP 2020 project to develop and deploy the “post‐K” supercomputer as the successor to Japan's petascale facility, the K computer. Armv8 with SVE (Scalable Vector Extension) has been adopted as the ISA of the manycore processor for the “post‐K” system. In this talk, we will give an overview of the project and present our R&D for Arm SVE.

About the Speaker: Mitsuhisa Sato received the M.S. degree and the Ph.D. degree in information science from the University of Tokyo in 1984 and 1990, respectively. He was a senior researcher at the Electrotechnical Laboratory from 1991 to 1996. From 2001, he was a professor in the Graduate School of Systems and Information Engineering, University of Tsukuba. He served as director of the Center for Computational Sciences, University of Tsukuba, from 2007 to 2013. Since October 2010, he has been the leader of the programming environment research team at the Advanced Institute for Computational Science (AICS), RIKEN. He is a Professor in the Cooperative Graduate School Program and Professor Emeritus of the University of Tsukuba.
15:05-15:20 Cavium ThunderX and ThunderX2

Presented by: Giri Chukkapalli (Cavium Inc.)
Abstract: Cavium has been leading the Arm server market with many firsts, including the first production deployment of dual-socket ARMv8 silicon with ThunderX. Cavium has also collaborated with leading software vendors to accelerate the HPC software ecosystem, and its systems are installed today in many of the elite HPC labs and research institutions worldwide. Cavium will provide product and roadmap updates on ThunderX as well as additional ecosystem developments in HPC.

About the Speaker: Giri Chukkapalli is a Distinguished Engineer in the Data Center Group (DCG) of Cavium Inc. Giri is currently working on future processor architectures for HPC and data center applications. Previously, Giri was a Technical Director in the PWI division of ING at Broadcom Corporation. Prior to that, Giri was a Principal Engineer in the CTO Office of Cray Inc., where he explored future technologies in the HPC and Big Data space. Prior to working at Cray, he was CTO of Appro International, Inc. and worked on system architecture, dense packaging, efficient power delivery, and direct liquid cooling. Before that, he worked as a Principal Systems Architect in the Systems Engineering division at Sun Microsystems, responsible for winning and delivering several large-scale HPC systems. He has been working in the HPC industry for over 15 years.
15:20-15:55 Isambard - The World’s First Large-Scale Production 64-Bit Arm Supercomputer

Presented by: Simon McIntosh-Smith (University of Bristol)
Abstract: 2017 will see something of a minor revolution with the launch of a number of 64-bit Arm server chips that have been optimised for HPC workloads. These Armv8 CPUs will, for the first time, offer performance comparable to mainstream x86 and POWER processors. In this talk we will describe Isambard, the world’s first large-scale production supercomputer based on these new Armv8 processors. We will describe the architecture of the 10,000-core machine, explain Isambard’s mission, and disclose some early results from our software porting efforts.

About the Speaker: Simon McIntosh-Smith is a full Professor of High Performance Computing at the University of Bristol in the UK. He began his career as a microprocessor architect at Inmos and STMicroelectronics in the early 1990s, before co-designing the world's first fully programmable graphics processor (GPU) at Pixelfusion in 1999. In 2002 he co-founded ClearSpeed Technology where, as Director of Architecture and Applications, he co-developed the first modern many-core HPC accelerators. He now leads the High Performance Research Group at the University of Bristol, where his research focuses on performance portability and application-based fault tolerance. He plays a key role in designing and procuring HPC services at the local, regional and national level, including the UK’s national HPC service, Archer. In 2016 he led the successful bid by the GW4 consortium, along with the UK’s Met Office and Cray, to design and build ‘Isambard’, the world’s first large-scale production Armv8-based supercomputer.
16:00-16:30 Break - Main Session Close
16:35-18:10 Panel Discussion

Featuring: Scott Hara (Qualcomm), Eric Van Hensbergen (Arm), Jon Masters (Red Hat), Kevin Pedretti (Sandia National Lab), Shinji Sumimoto (Fujitsu) and Mitsuhisa Sato (RIKEN)
18:10-18:15 Closing Remarks

GoingArm Schedule @ ISC - 2017

Thursday, 22 June
(Room: Flint, 2nd Floor)

09:00-09:05 Organiser’s introduction
09:05-09:20 Arm-supported HPC tools

Presented by: Chris Goodyer
Abstract: The Arm HPC ecosystem is growing quickly. HPC centres around the world are moving from small-scale proof-of-concept boxes to much larger systems. One of the strengths of Arm is in having multiple partners involved. This means there are micro-architectures from different companies targeted at potentially different workloads. In such an environment, it is important that end-users have a consistent software story as they potentially test different systems. To ensure that high quality tools are being developed, Arm has a suite of software to help users get the best performance out of their chosen systems, both today and as they prepare for future Arm SVE architectures. In this presentation, we will explain these tools, including compilers and libraries and how the recent acquisition of Allinea has expanded our own offerings.

About the Speaker: Based at Arm's Manchester Design Centre, Chris leads the Arm Performance Libraries development team. They are responsible for optimizing the Arm vendor maths library, which provides BLAS, LAPACK and FFT functionality. He is also heavily involved in developing the Arm HPC software ecosystem. He holds a PhD from the University of Leeds on efficient adaptive methods for the numerical solution of PDEs and he subsequently worked at the university for twelve years on research on a variety of HPC and numerical modelling projects. Before joining Arm he was part of the HPC team at NAG working on supporting national HPC services such as HECToR, the UK's supercomputer, and the EU exascale project EXA2CT.
09:20-09:35 Allinea - Arm tools

Presented by: Olly Perks
Abstract: An introduction to the Allinea tools, detailing their usage and capabilities. This will cover some of the new extensibility features, including custom metrics and JSON export. The talk will also cover the use of the tools within the Arm ecosystem, discussing the challenges of porting applications to Arm and the use of Allinea tools to assist in the validation and performance optimisation process.

About the Speaker: Oliver is a software engineer at Arm within the HPC tools group specialising in application profiling with the Allinea MAP tool, primarily working on European Horizon2020 projects. He obtained his PhD in analysing memory usage of HPC applications at scale, from the University of Warwick, and has since worked in industry, optimising large scale scientific codes, before joining Arm.
09:35-10:10 Barcelona Supercomputing Center (BSC) - Butterfly effects of porting scientific applications to Arm-based platforms

Presented by: Filippo Mantovani
Abstract: Since 2011, the EU Mont-Blanc project has pushed the development of Arm-based compute platforms, following the vision of leveraging the fast-growing mobile technology market for scientific computation. The process, which started almost 5 years ago with the development of prototypes based on Android dev-kits, is now evolving beyond the research project towards commercial computational platforms based not only on mobile SoCs, but also on server and HPC technology. In this talk we will introduce the experience gained porting system software, tools, and scientific applications to prototypes based on Arm technology within the Mont-Blanc project. Using several application examples, we will present behaviours observed on small prototype platforms. We will describe how current limitations can represent fundamental problems that programmers and architects will face when developing and programming future HPC systems. Techniques to overcome these limitations will also be presented. The goal of the talk is to give a panoramic view of Arm-based scientific computing from the Mont-Blanc perspective, supported by experience, lessons learned and test results.

About the Speaker: Filippo Mantovani is a postdoctoral research associate of the Mobile and embedded-based HPC group at the Barcelona Supercomputing Center (BSC). He graduated in mathematics and holds a PhD in Computer Science from the University of Ferrara, Italy. He has been a scientific associate at the DESY laboratory in Zeuthen, Germany, and at the University of Regensburg, Germany. He spent most of his scientific career in computational physics, computer architecture and high-performance computing, contributing to the Janus, QPACE and QPACE2 projects. He joined BSC’s Mont-Blanc project in 2013, becoming in 2014 principal investigator of the project.
10:10-10:45 University of Michigan: Porting and Adapting Dense and Sparse Matrix Application to Arm’s SVE

Presented by: Jonathan Beaumont
Abstract: Recent trends in high-performance computing have resulted in the continued mapping of big-data applications with an increasingly wide range of data-access and control regularity to parallel architectures. This variety in workloads motivates the flexibility of target platforms. In this workshop we will discuss our efforts to port several applications to Arm’s SVE, including genomics workloads (which focus on dense vector computations) and graph analytic kernels (which focus on sparse matrix computations), and contrast the computational needs. We explore the advantages and shortcomings over other platforms and suggest future developments to microarchitectures employing these extensions.

About the Speaker: Jonathan Beaumont is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of Michigan. His research involves the design of high-throughput microarchitectures and making them more accessible to programmers.
10:45-11:00 FLAGSHIP 2020 Project: Development of “Post-K” and Arm SVE

Presented by: Mitsuhisa Sato
Abstract: We are carrying out the FLAGSHIP 2020 project to develop and deploy the “post‐K” supercomputer as the successor to Japan's petascale facility, the K computer. Armv8 with SVE (Scalable Vector Extension) has been adopted as the ISA of the manycore processor for the “post‐K” system. In this talk, we will give an overview of the project and present our R&D for Arm SVE.

About the Speaker: Mitsuhisa Sato received the M.S. degree and the Ph.D. degree in information science from the University of Tokyo in 1984 and 1990, respectively. He was a senior researcher at the Electrotechnical Laboratory from 1991 to 1996. From 2001, he was a professor in the Graduate School of Systems and Information Engineering, University of Tsukuba. He served as director of the Center for Computational Sciences, University of Tsukuba, from 2007 to 2013. Since October 2010, he has been the leader of the programming environment research team at the Advanced Institute for Computational Science (AICS), RIKEN. He is a Professor in the Cooperative Graduate School Program and Professor Emeritus of the University of Tsukuba.
11:00-11:30 Break
11:30-12:05 AVL – An Arm Porting Story - Optimizing an RBF Interpolation Solver for Energy on Heterogeneous Systems

Presented by: Patrick Schiffmann
Abstract: Coming Soon

About the Speaker: Patrick Schiffmann is an HPC software engineer at AVL, the world's largest independent company for development, simulation and testing technology of powertrains. He is working on a PhD in numerical simulations on energy-efficient heterogeneous hardware. Previously he graduated from the University of Edinburgh with a degree in HPC and data science and worked as a quantitative analyst.
12:05-12:20 Cavium - ThunderX2 presentation

Presented by: Giri Chukkapalli
Abstract: Cavium has been leading the Arm server market with many firsts, including the first production deployment of dual-socket Armv8 silicon with ThunderX. Cavium has also collaborated with leading software vendors to accelerate the HPC software ecosystem, and its systems are installed today in many of the elite HPC labs and research institutions worldwide. Cavium will provide product and roadmap updates on ThunderX as well as additional ecosystem developments in HPC.

About the Speaker: Giri Chukkapalli is a Distinguished Engineer in the Data Center Group (DCG) of Cavium Inc. Giri is currently working on future processor architectures for HPC and data center applications. Previously, Giri was a Technical Director in the PWI division of ING at Broadcom Corporation. Prior to that, Giri was a Principal Engineer in the CTO Office of Cray Inc., where he explored future technologies in the HPC and Big Data space. Prior to working at Cray, he was CTO of Appro International, Inc. and worked on system architecture, dense packaging, efficient power delivery, and direct liquid cooling. Before that, he worked as a Principal Systems Architect in the Systems Engineering division at Sun Microsystems, responsible for winning and delivering several large-scale HPC systems. He has been working in the HPC industry for over 15 years.
12:20-12:55 Isambard - The World’s First Large-Scale Production 64-Bit Arm Supercomputer

Presented by: Simon McIntosh-Smith
Abstract: 2017 will see something of a minor revolution with the launch of a number of 64-bit Arm server chips that have been optimised for HPC workloads. These Armv8 CPUs will, for the first time, offer performance comparable to mainstream x86 and POWER processors. In this talk we will describe Isambard, the world’s first large-scale production supercomputer based on these new Armv8 processors. We will describe the architecture of the 10,000-core machine, explain Isambard’s mission, and disclose some early results from our software porting efforts.

About the Speaker: Simon McIntosh-Smith is a full Professor of High Performance Computing at the University of Bristol in the UK. He began his career as a microprocessor architect at Inmos and STMicroelectronics in the early 1990s, before co-designing the world's first fully programmable graphics processor (GPU) at Pixelfusion in 1999. In 2002 he co-founded ClearSpeed Technology where, as Director of Architecture and Applications, he co-developed the first modern many-core HPC accelerators. He now leads the High Performance Research Group at the University of Bristol, where his research focuses on performance portability and application-based fault tolerance. He plays a key role in designing and procuring HPC services at the local, regional and national level, including the UK’s national HPC service, Archer. In 2016 he led the successful bid by the GW4 consortium, along with the UK’s Met Office and Cray, to design and build ‘Isambard’, the world’s first large-scale production Armv8-based supercomputer.
12:55-13:00 Closing remarks


About the Workshop

GoingArm will enable attendees to take in technical presentations by fellow applications programmers and tool authors who are currently using the Arm platform.

GoingArm is all about sharing experiences and knowledge. Attendees will gain from the first-hand knowledge of experienced scientific application programmers writing for the Arm platform, including topics such as: optimizing for 64-bit Arm, memory systems, scalability and vectorization.

Content specifically focuses on HPC applications and cross-over/emerging application areas such as machine learning, deep learning, bioinformatics, and analytics.

Organizers

Jonathan Beard

Jonathan is a Staff Research Engineer focusing on scalable big data systems in the Memory and Systems group at Arm Inc. Jonathan also serves as a technical advisor to start-ups, and has given talks ranging from C++ parallel runtimes to debating exascale memory architectures at SC. Jonathan Beard received a BS (Biology) and BA (International Studies) in 2005 from Louisiana State University, an MS (Bioinformatics) in 2010 from The Johns Hopkins University, and a PhD in Computer Science from Washington University in St. Louis in 2015. Jonathan served as a U.S. Army Officer from 2005 through July 2010, in roles ranging from medical administrator, to Aide-de-Camp, to acting director of the medical informatics department for the U.S. Army in Europe. Jonathan's research interests also include online modeling of stream/data-flow parallel systems, streaming architectures, compute near data, and massively parallel processing.

Roxana Rusitoru

Roxana Rusitoru is a Senior Research Engineer in Arm’s Research division, working in Software and Large Scale Systems. She joined Arm in 2012 after obtaining an MEng degree in Computing (Software Engineering) from Imperial College London in optimising unstructured mesh CFD applications on multicores via machine learning and code transformation. At Arm, among other things, she has worked on Linux kernel optimizations aimed at HPC and on sensitivity studies aimed at showcasing Arm AArch64 microprocessor characteristics suitable for HPC. Most recently, she has been working on power-aware scheduling at OS level for heterogeneous cores and on methodologies to identify representative sub-sections of multi-threaded applications. Her research interests include software performance optimization and next-gen heterogeneous architectures. Roxana has been a part of the Mont-Blanc 1 and 2 projects, and is now leading the Software ecosystem in Mont-Blanc 3, in addition to making technical contributions.

Workshop Contacts