Recent works from both hardware and software domains offer various optimizations that try to take advantage of near data computing (NDC) opportunities. While the results from these works indicate performance improvements of various magnitudes, the existing literature lacks a detailed quantification of the potential of NDC and analysis of compiler optimizations on tapping into that potential. This paper first presents an analysis of the NDC potential when executing multithreaded applications on manycore platforms. It then presents two compiler schemes designed to take advantage of NDC. The first of these schemes try to increase the amount of computation that can be performed in a hardware component, whereas the second compiler strategy strikes a balance between optimizing NDC and exploiting data reuse, by being more selective on when to perform NDC (even if the opportunity presents itself) and how. The collected experimental results on a 5$\times$5 manycore system reveal that our first and second compiler schemes improve the overall performance of our multithreaded applications by, respectively, 22.5% and 25.2%, on average. Furthermore, these two compiler schemes are only 6.8% and 4.1% worse than an oracle scheme that makes the best near data computing decisions for each and every computation.
Mon 1 MarDisplayed time zone: Eastern Time (US & Canada) change
11:10 - 12:10 | |||
11:10 15mTalk | Synthesizing Optimal Collective Algorithms Main Conference Zixian Cai Australian National University, Zhengyang Liu University of Utah, Saeed Maleki Microsoft Research, Madan Musuvathi Microsoft Research, Todd Mytkowicz Microsoft Research, Jacob Nelson Microsoft Research, Olli Saarikivi Microsoft Research, Redmond Link to publication | ||
11:25 15mTalk | Parallel Binary Code Analysis Main Conference Xiaozhu Meng Rice University, Jonathon Anderson Rice University, John Mellor-Crummey Rice University, Mark W. Krentel Rice University, Barton P. Miller University of Wisconsin - Madison, Srđan Milaković Rice University Link to publication | ||
11:40 15mTalk | Compiler Support for Near Data Computing Main Conference Mahmut Taylan Kandemir Penn State University, USA, Jihyun Ryoo Penn State University, USA, Xulong Tang University of Pittsburgh, USA, Mustafa Karakoy TUBITAK-BILGEM, Turkey Link to publication | ||
11:55 15mTalk | Scaling Implicit Parallelism via Dynamic Control Replication Main Conference Michael Bauer NVIDIA, Wonchan Lee NVIDIA, Elliott Slaughter SLAC National Accelerator Laboratory, Zhihao Jia Carnegie Mellon University, Mario Di Renzo Sapienza University of Rome, Manolis Papadakis NVIDIA, Galen Shipman Los Alamos National Laboratory, Patrick McCormick Los Alamos National Laboratory, Michael Garland NVIDIA, Alex Aiken Stanford Univeristy Link to publication |