Towards NUMA-Aware Distributed Query Engines

Abstract

Efficient execution of distributed queries in the cloud is of paramount importance, both for cloud vendors and for their customers. In this paper, we investigate the performance impact of different policies for deploying distributed query engines on multicore machines. We corroborate prior observations that traditional data systems have limited scalability on modern machines. Since a complete redesign to make them hardware-conscious can be prohibitively expensive, we explore whether treating the underlying machine as a distributed system can bring performance advantages while remaining transparent to the query engine itself. Our key observation is that, for a range of popular distributed query engines (SparkSQL, Presto, Greenplum, SingleStore), a deployment policy that maps an engine’s worker instances to the compute and memory resources of a NUMA node can bring around a 2x performance improvement for the TPC-H workload (SF 100) over a standard deployment.