Zhiqiang (Walkie) Que

Huxley Building, Imperial College London, London, UK, SW7 2BZ
Email: z.que [ at ] imperial.ac.uk

I am a research associate in the Custom Computing Research Group in the Department of Computing at Imperial College London. Before joining Imperial, I worked on microarchitecture design and verification of ARM CPUs at Marvell Semiconductor (2011-2015) and on low-latency FPGA systems at CFFEX (2015-2018). I obtained my PhD from Imperial College London under the supervision of Prof. Wayne Luk. I received my B.S. in Microelectronics and M.S. in Computer Science from Shanghai Jiao Tong University (SJTU) in 2008 and 2011, respectively.
I have served as a peer reviewer for many conferences and journals, including FPGA, DATE, FCCM, FPL, FPT, TRETS, TODAES, TECS, JSA, TCAS-I, JPDC, FGCS, TETC and IEICE. Our research has received best paper awards at SmartCloud'18 and CSE'10, and best paper nominations at FCCM'20, ASAP'19, FPT'19 and FPT'18. In addition, I have served on Technical Program Committees for top-tier conferences such as DAC, DATE and FPT.
My research interests include computer architecture, embedded systems, high-performance computing and computer-aided design (CAD) tools for hardware design optimization.

Some News

  • Nov. 2024 - Invited to serve on the DAC 2025 TPC. Welcome to submit!

  • Sept. 2024 - Invited to serve on the DATE 2025 TPC. Welcome to submit!

  • Jul. 2024 - Invited to serve on the FPT 2024 TPC. Welcome to submit!

  • Jul. 2024 - Accepted by Machine Learning: Science and Technology (MLST): Ultrafast jet classification at the HL-LHC
    [DOI] [PDF]

  • Apr. 2024 - Accepted by IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS): Low Latency Variational Autoencoder on FPGAs
    [Link] [PDF]

  • Jan. 2024 - Accepted by ACM TECS: LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics
    [PDF] [GitHub] [PDF-arXiv] [Indico] [Slides] [VIDEO]

  • Dec. 2023 - FPT 2023: Efficiently Removing Sparsity for High-Throughput Stream Processing
    [PDF] [Src]

  • Jun. 2023 - Successfully defended my PhD.

  • May 2023 - FPL 2023: MetaML: Automating Customizable Cross-Stage Design-Flow for Deep Learning Acceleration, 33rd International Conference on Field-Programmable Logic and Applications (FPL). Our first work on MetaML, codifying optimizations for DL accelerators.
    [PDF]

  • Apr. 2023 - Invited to serve on the FPT 2023 TPC. Welcome to submit!

  • Dec. 2022 - FPT 2022: Accelerating Transformer Neural Networks on FPGAs for High Energy Physics Experiments, 2022 International Conference on Field-Programmable Technology. Our first work on low-latency Transformer networks.
    [Link] [PDF]

  • Oct. 2022 - ARC 2022: Hardware-Aware Optimizations for Deep Learning Inference on Edge Devices
    [Link] [PDF]

  • Oct. 2022 - A talk at the Compute Accelerator Forum based on our latest draft: LL-GNN: Low Latency Graph Neural Networks on FPGAs for Particle Detectors
    [PDF-ArXiv] [Indico] [Slides] [VIDEO]

  • August 2022 - FPL 2022: Optimizing Graph Neural Networks for Jet Tagging in Particle Physics on FPGAs
    [PDF] [VIDEO]

  • April 2022 - AICAS 2022: Reconfigurable Acceleration of Graph Neural Networks for Jet Identification in Particle Physics
    [PDF]

  • April 2022 - ACM TRETS: Remarn: A Reconfigurable Multi-threaded Multi-core Accelerator for Recurrent Neural Networks. An extension of our FPT'20 paper.
    [Link]

  • March 2022 - IEEE TCAD: FPGA-based Acceleration for Bayesian Convolutional Neural Networks (co-author)
    [Link] [PDF]

  • March 2022 - IEEE TPDS: Accelerating Bayesian Neural Networks via Algorithmic and Hardware Optimizations (co-author)
    [Link] [PDF]

  • January 2022 - IEEE Transactions on Very Large Scale Integration (VLSI) Systems: Recurrent Neural Networks With Column-Wise Matrix-Vector Multiplication on FPGAs. An extension of our FCCM'20 paper.
    [Link] [PDF]

  • October 2021 - To appear at FPT'21: Optimizing Bayesian Recurrent Neural Networks on an FPGA-based Accelerator (co-first author). This paper accelerates Bayesian LSTMs via a co-design framework.
    [PDF]

  • September 2021 - A talk for the FastML group about initiation interval (II) balancing for multi-layer LSTM acceleration on FPGAs.
    [Slides]

  • June 2021 - ASAP'21: Accelerating Recurrent Neural Networks for Gravitational Wave Experiments. This paper presents novel reconfigurable architectures with balanced initiation intervals (IIs) that reduce the latency of a multi-layer LSTM-based autoencoder used for detecting gravitational waves (see the sketch below).
    [PDF] [Github]
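    A minimal sketch of why II balancing matters, assuming a toy model in which each layer's II is its multiply workload divided by its multiplier allocation; the workloads, budget and formula are illustrative assumptions, not the paper's exact model:

    ```python
    # Toy model of initiation-interval (II) balancing across pipelined
    # LSTM layers: the slowest stage sets the rate of the whole pipeline,
    # so multipliers should be allocated in proportion to workload.

    def layer_ii(mults_needed: int, mults_allocated: int) -> float:
        """Cycles between successive inputs a layer can accept."""
        return mults_needed / mults_allocated

    work = [4096, 2048, 1024]   # multiplies per timestep, per layer (assumed)
    budget = 112                # total multipliers on the device (assumed)

    even = [budget // len(work)] * len(work)                  # naive split
    balanced = [round(budget * w / sum(work)) for w in work]  # match IIs

    for name, alloc in [("even", even), ("balanced", balanced)]:
        iis = [layer_ii(w, a) for w, a in zip(work, alloc)]
        print(f"{name:8s} alloc={alloc} pipeline II={max(iis):.1f}")
    ```

    With the even split the pipeline II is about 110.7 cycles; the balanced split brings it down to 64.0, since all three layers now accept inputs at the same rate.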

  • May 2021 - Journal of Systems Architecture (JSA): In-Circuit Tuning of Deep Learning Designs. An extension of our ICCAD'19 paper on in-circuit tuning.
    [PDF]

  • March 2021 - FCCM'21: Instrumentation for Live On-Chip Debug of Machine Learning Training on FPGAs (co-author)
    [PDF]

  • November 2020 - FPT'20: A Reconfigurable Multithreaded Accelerator for Recurrent Neural Networks, 2020 International Conference on Field-Programmable Technology. Acceptance rate: 24.7%.
    [VIDEO] [PDF]

  • October 2020 - ICCD'20 short paper: Optimizing FPGA-based CNN Accelerator using Differentiable Neural Architecture Search, the 38th IEEE International Conference on Computer Design (co-author)

  • July 2020 - Journal of Signal Processing Systems (JSPS) paper: Mapping Large LSTMs to FPGAs with Weight Reuse. An extension of our ASAP'19 paper on weight reuse for LSTMs with a blocking-and-batching strategy.
    [Link] [PDF]

  • May 2020 - FCCM'20 paper: Optimizing Reconfigurable Recurrent Neural Networks. Conventional matrix-vector multiplication (MVM) for RNNs is row-wise, which stalls the system because of the recurrent data dependency. To eliminate this dependency, we propose column-wise MVM for RNNs in this paper (see the sketch below).
    [Link] [PDF]
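    A minimal sketch of the dependency argument in NumPy; the "arrival" commentary models the hardware schedule informally and is illustrative, not the paper's implementation:

    ```python
    import numpy as np

    # y = W @ h, where h is the previous hidden state of an RNN whose
    # elements are produced one at a time by the recurrent datapath.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((4, 4))
    h = rng.standard_normal(4)

    # Row-wise: each output y[i] is a dot product over ALL of h, so no
    # output can even start until the last element of h exists -- the
    # pipeline stalls on the recurrent dependency.
    y_row = np.array([W[i, :] @ h for i in range(4)])

    # Column-wise: as soon as h[j] "arrives" (step j), its column's
    # contribution is accumulated into every partial output, overlapping
    # the MVM with the production of later h elements.
    y_col = np.zeros(4)
    for j in range(4):
        y_col += W[:, j] * h[j]

    assert np.allclose(y_row, y_col)  # same result, different schedule
    ```

    Both orders compute the same result; the win is that the column-wise order can start as soon as the first element of the previous hidden state is available.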

  • May 2020 - Best paper nomination at FCCM'20: High-Throughput Convolutional Neural Network on an FPGA by Customized JPEG Compression. In this paper, we propose a customized JPEG+CNN pipeline to address the data transfer bandwidth problem for cloud-based FPGAs (see the arithmetic below).
    [Link] [PDF]
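    Back-of-the-envelope arithmetic for the bandwidth argument; the image size, compression ratio and link budget are illustrative assumptions, not figures from the paper:

    ```python
    # Shipping JPEG bitstreams to the FPGA and decoding on-chip shrinks
    # the host-to-FPGA transfer, so more images fit through the link.
    w, h, c = 224, 224, 3              # typical CNN input (assumed)
    raw_bytes = w * h * c              # uncompressed RGB: 150,528 B
    jpeg_bytes = raw_bytes // 10       # assumed 10:1 JPEG ratio

    link_bytes_per_s = 8e9 / 8         # assumed 8 Gb/s link budget
    print(f"raw:  {link_bytes_per_s / raw_bytes:,.0f} images/s")
    print(f"jpeg: {link_bytes_per_s / jpeg_bytes:,.0f} images/s")
    ```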

  • December 2019 - FPT'19 paper: Real-time Anomaly Detection for Flight Testing using AutoEncoder and LSTM. In this work, a novel timestep (TS) buffer is proposed to avoid redundant LSTM gate calculations and reduce system latency (see the sketch below).
    [Link] [PDF]
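    A sketch of one way such a buffer can work, assuming it caches per-timestep input-gate pre-activations for reuse across overlapping sliding windows; this reading is inferred from the summary above rather than taken from the paper:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    T, D, G = 16, 8, 32                 # timesteps, input dim, gate width
    x = rng.standard_normal((T, D))     # input stream
    W_x = rng.standard_normal((D, G))   # input-to-gate weights

    ts_buffer = {}                      # timestep -> cached pre-activation

    def gate_preact(t: int) -> np.ndarray:
        """Return x[t] @ W_x, computing it only on first use."""
        if t not in ts_buffer:
            ts_buffer[t] = x[t] @ W_x
        return ts_buffer[t]

    win = 8
    for start in range(T - win + 1):    # overlapping windows, stride 1
        for t in range(start, start + win):
            _ = gate_preact(t)          # shared timesteps are reused

    print(f"{len(ts_buffer)} gate MVMs instead of {(T - win + 1) * win}")
    ```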

  • November 2019 - ICCAD'19 paper: Towards In-Circuit Tuning of Deep Learning Designs.
    [Link] [PDF]

  • July 2019 - ASAP'19 paper: Efficient Weight Reuse for Large LSTMs. In this paper, we propose a blocking-and-batching strategy to reuse LSTM weights (see the sketch below).
    [Link] [PDF]
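    A minimal sketch of the blocking-and-batching idea, assuming weights too large for on-chip memory are streamed in blocks and each block is reused across a whole batch before the next is fetched; block and batch sizes are arbitrary illustrative choices:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    H, D, B = 512, 256, 8                  # outputs, inputs, batch size
    W = rng.standard_normal((H, D))        # "off-chip" weight matrix
    X = rng.standard_normal((B, D))        # a batch of input vectors
    Y = np.zeros((B, H))

    ROWS = 64                              # rows per weight block (assumed)
    fetches = 0
    for r in range(0, H, ROWS):
        block = W[r:r + ROWS, :]           # fetch one block from DRAM...
        fetches += 1
        for b in range(B):                 # ...then reuse it for all B inputs
            Y[b, r:r + ROWS] = block @ X[b]

    assert np.allclose(Y, X @ W.T)         # matches the unblocked result
    print(f"{fetches} block fetches amortized over a batch of {B}")
    ```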