POSTER: Asynchrony versus Bulk-Synchrony for a Generalized N-body Problem from Genomics (PPoPP 2021 - Main Conference)

Who

Marquita Ellis, Aydın Buluç, Katherine Yelick

Track

PPoPP 2021 Main Conference

Time Zone

The program is currently displayed in (GMT-05:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 2 Mar 2021 13:12 - 13:18 - Session 6. Posters 1 Chair(s): Adam Morrison

Abstract

This work examines a data-intensive irregular application from genomics, a long read alignment problem, which represents a kind of Generalized N-Body problem, one of the “seven giants” of the NRC Big Data motifs. In this problem, computations (genome alignments) are performed on sparse and data-dependent pairs of inputs, with variable cost computation and variable datum sizes. We lay out the challenges of load balancing, communication and synchronization, guaranteeing progress, and memory footprint for this application. In particular, there is no inherent locality in the pairwise interactions, unlike simulation-based N-Body problems, and the interaction sparsity depends on particular parameters of the input, which can also affect the quality of the output. We then examine both a pre-existing bulk-synchronous implementation, using collective communication in MPI, and a new asynchronous one, using cross-node RPCs in UPC++. We show that the asynchronous version effectively hides communication costs, with a memory footprint that is typically much lower than the bulk-synchronous version. Our application, while simple enough to be a kind of proxy for genomics or data analytics applications more broadly, is also part of a real application pipeline. It shows good scaling on real input problems, and at the same time, reveals some of the programming and architectural challenges for scaling this type of data-intensive irregular application.
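The memory-footprint contrast described in the abstract can be illustrated with a toy accounting model. This is only a sketch under stated assumptions, not the authors' implementation: it uses neither MPI nor UPC++, and the round-robin ownership rule, the read-length range, the `window` parameter, and all function names are hypothetical. It assumes a bulk-synchronous rank buffers every distinct remote read it needs before computing, while an asynchronous rank holds only the reads for a bounded window of outstanding requests.

```python
import random

def make_workload(num_reads, num_pairs, seed=0):
    """Hypothetical workload: variable read lengths and sparse, irregular pairs."""
    rng = random.Random(seed)
    lengths = [rng.randint(1_000, 20_000) for _ in range(num_reads)]
    pairs = set()
    while len(pairs) < num_pairs:
        a, b = rng.sample(range(num_reads), 2)  # two distinct read ids
        pairs.add((a, b))
    return lengths, sorted(pairs)

def remote_requests(pairs, rank, num_ranks):
    """Remote reads needed by `rank`, one entry per task, in task order.

    Assumption: read i lives on rank i % num_ranks, and task (a, b)
    runs on the owner of a, so b must be fetched when it lives elsewhere.
    """
    return [b for a, b in pairs
            if a % num_ranks == rank and b % num_ranks != rank]

def bulk_synchronous_peak(requests, lengths):
    """All distinct remote reads are buffered at once after one exchange."""
    return sum(lengths[r] for r in set(requests))

def asynchronous_peak(requests, lengths, window):
    """Only the distinct reads of `window` consecutive requests are buffered."""
    peak = 0
    for start in range(len(requests)):
        in_flight = set(requests[start:start + window])
        peak = max(peak, sum(lengths[r] for r in in_flight))
    return peak

lengths, pairs = make_workload(num_reads=200, num_pairs=600, seed=1)
reqs = remote_requests(pairs, rank=0, num_ranks=4)
bulk = bulk_synchronous_peak(reqs, lengths)
asyn = asynchronous_peak(reqs, lengths, window=8)
print(f"bulk-synchronous peak: {bulk} bytes, asynchronous peak: {asyn} bytes")
```

In this model the asynchronous peak can never exceed the bulk-synchronous one, because each window holds a subset of the reads the bulk exchange buffers. The model ignores communication latency, load imbalance, and progress guarantees, which the abstract names as separate challenges.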

Link to Publication

https://dl.acm.org/doi/10.1145/3437801.3441580

Marquita Ellis

University of California at Berkeley & Lawrence Berkeley National Lab

Aydın Buluç

University of California at Berkeley & Lawrence Berkeley National Lab

Katherine Yelick

University of California at Berkeley & Lawrence Berkeley National Lab

Session Program

Tue 2 Mar
Displayed time zone: Eastern Time (US & Canada)

12:30 - 13:18	Session 6. Posters 1 (Main Conference) Chair(s): Adam Morrison, Tel Aviv University

12:30 6m Talk		POSTER: On Group Mutual Exclusion for Dynamic Systems Main Conference Shreyas Gokhale The University of Texas at Dallas, Sahil Dhoked The University of Texas at Dallas, Neeraj Mittal The University of Texas at Dallas Link to publication
12:36 6m Talk		POSTER: Bundled References: An Abstraction for Highly-Concurrent Linearizable Range Queries Main Conference Jacob Nelson Lehigh University, Ahmed Hassan Lehigh University, Roberto Palmieri Lehigh University Link to publication
12:42 6m Talk		POSTER: Verifying C11-Style Weak Memory Libraries Main Conference Sadegh Dalvandi University of Surrey, Brijesh Dongol University of Surrey Link to publication
12:48 6m Talk		POSTER: A Lock-free Relaxed Concurrent Queue for Fast Work Distribution Main Conference Giorgos Kappes University of Ioannina, Stergios V. Anastasiadis University of Ioannina Link to publication
12:54 6m Talk		POSTER: A more Pragmatic Implementation of the Lock-free, Ordered, Linked List Main Conference Jesper Träff TU Wien, Austria, Manuel Pöter TU Wien, Austria Link to publication
13:00 6m Talk		POSTER: Extending MapReduce Framework with Locality Keys Main Conference Yifeng Cheng Peking University, China, Bei Wang Peking University, China, Xiaolin Wang Peking University, China Link to publication
13:06 6m Talk		POSTER: On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal LU Factorization Main Conference Grzegorz Kwasniewski ETH Zurich, Tal Ben-Nun Department of Computer Science, ETH Zurich, Alexandros Nikolaos Ziogas ETH Zurich, Timo Schneider ETH Zurich, Maciej Besta ETH Zurich, Torsten Hoefler ETH Zurich Link to publication
13:12 6m Talk		POSTER: Asynchrony versus Bulk-Synchrony for a Generalized N-body Problem from Genomics Main Conference Marquita Ellis University of California at Berkeley & Lawrence Berkeley National Lab, Aydın Buluç University of California at Berkeley & Lawrence Berkeley National Lab, Katherine Yelick University of California at Berkeley & Lawrence Berkeley National Lab Link to publication