PPoPP 2021
Sat 27 February - Wed 3 March 2021
Tue 2 Mar 2021 14:18 - 14:24 - Session 7. Posters 2 Chair(s): Todd Mytkowicz

The Fast Fourier Transform (FFT), a reduced-complexity formulation of the Discrete Fourier Transform (DFT), is an important tool in many areas of science and engineering. FFTW is a well-known package that follows this approach and is currently one of the fastest available implementations of the FFT. NVIDIA introduced cuFFT, its counterpart to FFTW, which achieves high performance on GPUs. In this work we present a novel way to map the FFT algorithm onto the newly introduced Tensor Cores by adapting the Cooley-Tukey recursive FFT algorithm. We present four major types of optimizations that enhance the performance of our approach for varying FFT sizes, and show that the approach consistently outperforms cuFFT, with average speedups ranging from about 15% to 250%.
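For readers unfamiliar with the algorithm the abstract builds on, the following is a minimal sketch of the textbook radix-2 Cooley-Tukey recursion, checked against a direct evaluation of the DFT definition. It is not the authors' Tensor Core mapping, which the abstract does not describe in detail; the function names `dft` and `fft` are illustrative, and the sketch assumes the input length is a power of two.

```python
import cmath


def dft(x):
    """Direct O(n^2) evaluation of the DFT definition (reference only)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT, O(n log n).

    Assumption of this sketch: len(x) is a power of two.
    """
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into even- and odd-indexed halves and transform each recursively.
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor combines the two half-size transforms (butterfly).
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 0, -1, -2, -3]
    fast = fft(signal)
    slow = dft(signal)
    # The two results should agree up to floating-point rounding.
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

The recursion halves the problem at each level, which is where the reduction from O(n^2) to O(n log n) operations comes from.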

Tue 2 Mar

Displayed time zone: Eastern Time (US & Canada)

13:30 - 14:30
Session 7. Posters 2
Main Conference
Chair(s): Todd Mytkowicz Microsoft Research
13:30
6m
Talk
POSTER: In-situ Workflow Auto-tuning through Combining Component Models
Main Conference
Tong Shu Southern Illinois University Carbondale, Yanfei Guo Argonne National Laboratory, Justin Wozniak Argonne National Laboratory, Xiaoning Ding New Jersey Institute of Technology, Ian Foster Argonne Nat Lab and U.Chicago, Tahsin Kurc Stony Brook University
Link to publication
13:36
6m
Talk
POSTER: Simplifying Low-Level GPU Programming with GAS
Main Conference
Da Yan Hong Kong University of Science and Technology, Wei Wang Hong Kong University of Science and Technology, Xiaowen Chu Hong Kong Baptist University
Link to publication
13:42
6m
Talk
POSTER: Corder: Cache-Aware Reordering for Optimizing Graph Analytics
Main Conference
YuAng Chen The Chinese University of Hong Kong, Shenzhen, Yeh-Ching Chung The Chinese University of Hong Kong, Shenzhen
Link to publication
13:48
6m
Talk
POSTER: DFOGraph: An I/O- and Communication-Efficient System for Distributed Fully-out-of-Core Graph Processing
Main Conference
Jiping Yu Tsinghua University, Wei Qin Tsinghua University, Xiaowei Zhu Tsinghua University, Zhenbo Sun Tsinghua University, Jianqiang Huang Tsinghua University, Xiaohan Li Tsinghua University, Wenguang Chen Tsinghua University
Link to publication
13:54
6m
Talk
POSTER: An Efficient Uncertain Graph Processing Framework for Heterogeneous Architectures
Main Conference
Heng Zhang Institute of Software, Chinese Academy of Sciences; University of Sydney, Lingda Li Brookhaven National Laboratory, Donglin Zhuang University of Sydney, Rui Liu University of Chicago, Shuang Song Facebook Inc., Dingwen Tao Washington State University, Yanjun Wu Institute of Software, Chinese Academy of Sciences, Shuaiwen Leon Song University of Sydney
Link to publication
14:00
6m
Talk
POSTER: Dynamic Scaling for Low-Precision Learning
Main Conference
Ruobing Han Peking University, Min Si Argonne National Laboratory, James W. Demmel UC Berkeley, Yang You UC Berkeley
Link to publication
14:06
6m
Talk
POSTER: Exploring Deep Reuse in Winograd CNN Inference
Main Conference
Ruofan Wu Renmin University of China, Feng Zhang Renmin University of China, Zhen Zheng Alibaba Group, Xiaoyong Du Renmin University of China, Xipeng Shen North Carolina State University
Link to publication
14:12
6m
Talk
POSTER: A Novel Memory-Efficient Deep Learning Training Framework via Error-Bounded Lossy Compression
Main Conference
Sian Jin Washington State University, Guanpeng Li University of Iowa, Shuaiwen Leon Song University of Sydney, Dingwen Tao Washington State University
Link to publication
14:18
6m
Talk
POSTER: FFT Blitz: The Tensor Cores Strike Back
Main Conference
Sultan Durrani University of Illinois at Urbana-Champaign, Muhammad Saad Chughtai Georgia Institute of Technology, Abdul Dakkak University of Illinois at Urbana-Champaign, Wen-mei Hwu University of Illinois at Urbana-Champaign, Lawrence Rauchwerger UIUC
Link to publication
14:24
6m
Break
Break
Main Conference