Zhiqiang (Walkie) Que
Huxley Building, Imperial College London, London, UK, SW7 2BZ
Email: z.que [ at ] imperial.ac.uk
I am a research associate in the Custom Computing Research Group in the Department of Computing at Imperial College London.
Before joining Imperial, I worked on microarchitecture design and verification of ARM CPUs at Marvell Semiconductor (2011-2015) and on low-latency FPGA systems at CFFEX (2015-2018).
I obtained my PhD from Imperial College London under the supervision of Prof. Wayne Luk. I received my B.S. in Microelectronics and M.S. in Computer Science from Shanghai Jiao Tong University (SJTU) in 2008 and 2011 respectively.
I have served as a peer reviewer for many conferences and journals, including FPGA, DATE, FCCM, FPL, FPT, TRETS, TODAES, TECS, JSA, TCAS-I, JPDC, FGCS, TETC and IEICE.
Our research has received best paper awards at SmartCloud'18 and CSE'10, and best paper nominations at FCCM'20, ASAP'19, FPT'19 and FPT'18.
In addition, I have served on Technical Program Committees for top-tier conferences such as DAC, DATE and FPT.
My research interests include computer architecture, embedded systems, high-performance computing and computer-aided design (CAD) tools for hardware design optimization.
Some News
- Nov. 2024 - Invited to serve on the DAC 2025 TPC, welcome to submit!
- Sept. 2024 - Invited to serve on the DATE 2025 TPC, welcome to submit!
- Jul. 2024 - Invited to serve on the FPT 2024 TPC, welcome to submit!
- Jul. 2024 - Accepted by Machine Learning: Science and Technology (MLST): Ultrafast jet classification at the HL-LHC. [DOI] [PDF]
- Apr. 2024 - Accepted by IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS): Low Latency Variational Autoencoder on FPGAs. [Link] [PDF]
- Jan. 2024 - Accepted by ACM TECS: LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics. [PDF] [GitHub] [PDF-arXiv] [Indico] [Slides] [VIDEO]
- Dec. 2023 - FPT 2023: Efficiently Removing Sparsity for High-Throughput Stream Processing. [PDF] [Src]
- Jun. 2023 - Successfully defended my PhD.
- May 2023 - FPL 2023: MetaML: Automating Customizable Cross-Stage Design-Flow for Deep Learning Acceleration, 33rd International Conference on Field-Programmable Logic and Applications (FPL). Our first work on MetaML, which codifies optimizations for DL accelerators. [PDF]
- Apr. 2023 - Invited to serve on the FPT 2023 TPC, welcome to submit!
- Dec. 2022 - FPT 2022: Accelerating Transformer Neural Networks on FPGAs for High Energy Physics Experiments, 2022 International Conference on Field-Programmable Technologies. Our first work on low-latency transformer networks. [Link] [PDF]
- Oct. 2022 - ARC 2022: Hardware-Aware Optimizations for Deep Learning Inference on Edge Devices. [Link] [PDF]
- Oct. 2022 - A talk at the Compute Accelerator Forum based on our latest draft: LL-GNN: Low Latency Graph Neural Networks on FPGAs for Particle Detectors. [PDF-ArXiv] [Indico] [Slides] [VIDEO]
- August 2022 - FPL 2022: Optimizing Graph Neural Networks for Jet Tagging in Particle Physics on FPGAs. [PDF] [VIDEO]
- April 2022 - AICAS 2022: Reconfigurable Acceleration of Graph Neural Networks for Jet Identification in Particle Physics. [PDF]
- April 2022 - ACM TRETS: Remarn: A Reconfigurable Multi-threaded Multi-core Accelerator for Recurrent Neural Networks. An extension of our FPT'20 paper. [Link]
- March 2022 - IEEE TCAD: FPGA-based Acceleration for Bayesian Convolutional Neural Networks. Co-author. [Link] [PDF]
- March 2022 - IEEE TPDS: Accelerating Bayesian Neural Networks via Algorithmic and Hardware Optimizations. Co-author. [Link] [PDF]
- January 2022 - IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Recurrent Neural Networks With Column-Wise Matrix-Vector Multiplication on FPGAs. An extension of our FCCM'20 paper. [Link] [PDF]
- October 2021 - To appear at FPT'21: Optimizing Bayesian Recurrent Neural Networks on an FPGA-based Accelerator. Co-first author. This paper is about accelerating Bayesian LSTMs via a co-design framework. [PDF]
- September 2021 - A talk for the FastML group on initiation interval (II) balancing for multi-layer LSTM acceleration on FPGAs. [Slides]
- June 2021 - ASAP'21: Accelerating Recurrent Neural Networks for Gravitational Wave Experiments. This paper presents novel reconfigurable architectures with balanced IIs to reduce the latency of a multi-layer LSTM-based autoencoder used for detecting gravitational waves; a minimal sketch of the balancing idea follows below. [PDF] [Github]
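As an illustration of why balanced IIs matter: the slowest layer sets the effective initiation interval of the whole pipeline, so parallelism should be spread in proportion to each layer's workload. The sketch below is not the paper's algorithm; the workload numbers, multiplier budget and function name are illustrative assumptions.

```cpp
// Illustrative sketch only: allocate multipliers to LSTM layers in proportion
// to their per-timestep workload, so each layer's initiation interval
// (II ~ ops / multipliers) is roughly equal and no single layer stalls the pipeline.
#include <algorithm>
#include <cstddef>
#include <vector>

std::vector<std::size_t> balance_parallelism(const std::vector<double>& ops_per_layer,
                                             std::size_t multiplier_budget) {
    double total_ops = 0.0;
    for (double ops : ops_per_layer) total_ops += ops;

    std::vector<std::size_t> alloc;
    alloc.reserve(ops_per_layer.size());
    for (double ops : ops_per_layer) {
        // Proportional share of the budget, at least one multiplier per layer.
        auto share = static_cast<std::size_t>(multiplier_budget * ops / total_ops);
        alloc.push_back(std::max<std::size_t>(1, share));
    }
    return alloc;   // resulting per-layer IIs are approximately ops[i] / alloc[i]
}
```

With roughly equal IIs, consecutive timesteps stream through all layers back to back instead of waiting on the most under-provisioned layer.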
- May 2021 - Journal of Systems Architecture (JSA): In-Circuit Tuning of Deep Learning Designs. An extension of our ICCAD'19 paper on in-circuit tuning. [PDF]
- March 2021 - FCCM'21: Instrumentation for Live On-Chip Debug of Machine Learning Training on FPGAs. Co-author. [PDF]
- November 2020 - FPT'20: A Reconfigurable Multithreaded Accelerator for Recurrent Neural Networks, 2020 International Conference on Field-Programmable Technologies. Acceptance rate: 24.7%. [VIDEO] [PDF]
- October 2020 - ICCD'20, short paper: Optimizing FPGA-based CNN Accelerator using Differentiable Neural Architecture Search, the 38th IEEE International Conference on Computer Design. Co-author.
- July 2020 - Journal of Signal Processing Systems (JSPS) paper: Mapping Large LSTMs to FPGAs with Weight Reuse. An extension of our ASAP'19 paper on weight reuse for LSTMs with a blocking & batching strategy. [Link] [PDF]
- May 2020 - FCCM'20 paper: Optimizing Reconfigurable Recurrent Neural Networks. Conventional matrix-vector multiplication (MVM) for RNNs is row-wise, which stalls the system because of the data dependency between timesteps. This paper proposes column-wise MVM to eliminate that dependency; see the sketch below. [Link] [PDF]
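To make the row-wise vs. column-wise contrast concrete, here is a minimal software sketch (sizes and names are illustrative assumptions, not the paper's implementation): row-wise MVM needs the entire previous hidden state before any output element is ready, while column-wise accumulation can consume each element of h_{t-1} as soon as it is produced.

```cpp
// Illustrative sketch: row-wise vs. column-wise matrix-vector multiplication
// for the recurrent weights of an RNN/LSTM. Sizes and names are assumptions.
#include <array>
#include <cstddef>

constexpr std::size_t N = 128;                     // hidden size (assumed)
using Vec = std::array<float, N>;
using Mat = std::array<std::array<float, N>, N>;

// Row-wise: every output y[i] needs the complete vector h, so the next
// timestep cannot start until h_{t-1} has been fully produced.
Vec mvm_row_wise(const Mat& W, const Vec& h) {
    Vec y{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            y[i] += W[i][j] * h[j];
    return y;
}

// Column-wise: as soon as one element h[j] becomes available, its column's
// contribution is accumulated into all partial sums, so the MVM overlaps
// with the element-wise stage that produces h_{t-1}.
void mvm_column_step(const Mat& W, float h_j, std::size_t j, Vec& acc) {
    for (std::size_t i = 0; i < N; ++i)
        acc[i] += W[i][j] * h_j;
}
```

In hardware, each column step maps to a parallel update of the N accumulators, so the recurrent dependency no longer serializes consecutive timesteps.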
- May 2020 - Best paper nomination at FCCM'20: High-Throughput Convolutional Neural Network on an FPGA by Customized JPEG Compression. This paper proposes a customized JPEG+CNN design to address the data-transfer bandwidth bottleneck of cloud-based FPGAs. [Link] [PDF]
- December 2019 - FPT'19 paper: Real-time Anomaly Detection for Flight Testing using AutoEncoder and LSTM. This work proposes a novel timestep (TS) buffer that avoids redundant LSTM gate calculations to reduce system latency; see the sketch below. [Link] [PDF]
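As a rough illustration only (this is an assumption about how such a buffer can work, not the paper's design): when consecutive sliding windows of the autoencoder overlap, the input-side gate pre-activations for a timestep can be cached and reused instead of recomputed.

```cpp
// Hypothetical sketch of a timestep (TS) buffer: cache per-timestep gate
// pre-activations so overlapping windows reuse them rather than recompute.
// All names here are illustrative, not taken from the paper.
#include <cstddef>
#include <unordered_map>
#include <vector>

using GateVec = std::vector<float>;

// Dummy stand-in for the input-side gate computation (conceptually W_x * x_t + b).
GateVec input_gate_products(const std::vector<float>& x_t) {
    return GateVec(x_t.size(), 0.0f);
}

class TimestepBuffer {
public:
    const GateVec& get(std::size_t t, const std::vector<float>& x_t) {
        auto it = cache_.find(t);
        if (it == cache_.end())                           // first time this timestep is seen
            it = cache_.emplace(t, input_gate_products(x_t)).first;
        return it->second;                                // reused by later overlapping windows
    }
private:
    std::unordered_map<std::size_t, GateVec> cache_;
};
```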
- November 2019 - ICCAD'19 paper: Towards In-Circuit Tuning of Deep Learning Designs. [Link] [PDF]
- July 2019 - ASAP'19 paper: Efficient Weight Reuse for Large LSTMs. In this paper, we propose a blocking & batching strategy to reuse LSTM weights; see the sketch below. [Link] [PDF]
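A minimal sketch of the general blocking & batching idea (loop structure only; the tile type and helper functions are placeholders, not the paper's code): each weight block is fetched from off-chip memory once and reused across a batch of independent inputs before the next block is loaded, amortizing the memory traffic that large LSTM weights would otherwise incur per sample.

```cpp
// Illustrative sketch of blocking & batching weight reuse for a large LSTM:
// load one weight block, reuse it across the whole batch, then move on.
// Types and helpers are placeholders, not the paper's implementation.
#include <cstddef>
#include <vector>

struct WeightBlock { std::vector<float> tile; };   // one tile of the LSTM weight matrix

// Placeholder for an off-chip (e.g. DDR) transfer of one weight block.
WeightBlock load_block(std::size_t block_id) { (void)block_id; return WeightBlock{}; }

// Placeholder for the partial gate computation using this block for one sample.
void compute_partial(const WeightBlock& w, std::size_t sample_id) { (void)w; (void)sample_id; }

void lstm_layer_blocked(std::size_t num_blocks, std::size_t batch_size) {
    for (std::size_t b = 0; b < num_blocks; ++b) {
        WeightBlock w = load_block(b);             // fetched once per block
        for (std::size_t s = 0; s < batch_size; ++s)
            compute_partial(w, s);                 // reused across every sample in the batch
    }
}
```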