If you would like to get a more detailed overview of my research interests, please check my research statement.

In short, I work at the intersection of systems, data management and processing, and computer architecture. I believe that a holistic approach is needed to address the systems challenges posed both by emerging hardware and computing platforms and by the ever-growing need to efficiently process, analyze, and make sense of data collected at unprecedented rates. To that end, here are a few projects that I currently manage.

If you are a student interested in the projects described below, please check the following info.

Systems support for emerging hardware
The rise of hardware heterogeneity (CPUs, GPUs, FPGAs, and various specialized ASICs) and the potential to offload compute closer to data (e.g., to storage and memory) or to push operations down to where data moves (e.g., onto the network, or via acceleration within the chip) open both exciting opportunities and significant challenges for systems software that wants to make efficient use of future hardware, especially in the context of the cloud. Addressing these challenges requires an effort beyond what can be done within a single layer of the system stack. In this project, we argue for a holistic approach that opens up the interfaces and customizes the system stack. In particular, we explore the control/compute/data plane OS model and design and build customized lightweight OS kernels suitable for heterogeneous computing components.

Ideal candidates for the project should be comfortable working on large-scale system development and be interested in hacking the OS kernel or in driver development. Prior experience working with FPGAs or GPUs is desirable but not required. For the various sub-topics, please contact me directly.
HW/SW co-design for data-processing workloads
Today's trend in boosting workload performance is hardware specialization -- e.g., Google's TPU for machine learning, NVIDIA's Tesla for deep neural networks, Oracle's SPARC M7 for advanced relational analytics, etc. In this project, in addition to exploring how we can use today's hardware to build better tools and implement suitable algorithms, we will also work on hardware/software co-design. The premise is that now is the perfect time to directly influence the design of upcoming hardware features. We will focus on data-processing workloads. In particular, we will work with FPGAs to prototype various accelerators, but also investigate aspects of building a rack-scale computer (high-bandwidth, low-latency network fabrics connecting hundreds of cores, and the effects of modern memory and storage technologies).

Ideal candidates for the project should have a strong background in computer architecture and experience in developing algorithms that perform well on modern multicore machines. Prior experience working with GPUs, FPGAs, or RDMA is a huge plus. For the various sub-topics, please contact me directly.
System for Hybrid Transactional and Analytical Processing
Many businesses today rely on data-driven decisions, ideally by running analytics on fresh data. Consequently, it is important that modern data-processing systems support such hybrid workloads. In our prior work (see the BatchDB paper), we proposed an HTAP architecture and built a system that achieves good performance, provides a high level of data freshness, and minimizes the load interaction between the transactional and analytical engines. This project follows up on that work and builds support for advanced types of (near-real-time) analytics on fresh data (e.g., graph analytics or ML). For instance, one ongoing effort is an extension for graph processing that performs fraud detection in near real time while supporting large volumes of financial transactions.
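To give a feel for the kind of separation an HTAP design aims for, here is a toy Python sketch of batched update propagation from a transactional store to an analytical replica. It is only an illustration of the general idea (analytical queries read a slightly stale snapshot while writes proceed unhindered, and freshness is bounded by the propagation interval); all names are invented, and this is not BatchDB's actual implementation.

```python
class HtapToy:
    """Toy sketch: a transactional store whose updates are applied to
    an analytical replica in batches, so analytical scans never touch
    the write path. Illustrative only, not BatchDB's design."""

    def __init__(self):
        self.primary = {}   # transactional state (key -> value)
        self.pending = []   # updates not yet propagated to the replica
        self.replica = {}   # analytical snapshot

    def write(self, key, value):
        # OLTP path: apply immediately, log for later propagation.
        self.primary[key] = value
        self.pending.append((key, value))

    def propagate_batch(self):
        # Apply all queued updates to the replica at once; data
        # freshness is bounded by how often this runs.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def analytics_sum(self):
        # OLAP path: reads only the (possibly stale) replica.
        return sum(self.replica.values())

db = HtapToy()
db.write("a", 10)
db.write("b", 5)
print(db.analytics_sum())   # 0: replica not yet refreshed
db.propagate_batch()
print(db.analytics_sum())   # 15: after batch propagation
```

The design choice the sketch highlights is the trade-off the project targets: batching decouples the two engines (minimal load interaction) at the cost of bounded staleness, which the propagation frequency controls.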

Ideal candidates for this project should be comfortable working on large-scale system development, have a strong background in databases, and be interested in working on other types of data analytics.
Data analytics building blocks for heterogeneous workloads
Modern data analytics workloads (e.g., graph processing, ML, data mining) find the functionality offered by traditional database engines of limited use. This results in a plethora of specialized systems and data-processing frameworks that must re-address challenges databases have successfully solved with technologies developed over the course of decades. Part of the problem is the coarse granularity of the principal database components -- the relational operators. This project explores how to make databases a more flexible platform, with extended support for new data types and operations, by lowering the level of abstraction. Such a change should enable building more expressive dataflows for a variety of data-science workloads, easier portability across different platforms, and influencing the design of future hardware features.
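As a rough illustration of what "lowering the level of abstraction" can mean, the Python sketch below breaks a monolithic hash join into finer-grained building blocks (build and probe) and then reuses one of them for a non-relational task. The decomposition and all names are purely illustrative assumptions, not the project's actual interface.

```python
# Toy "sub-operator" building blocks: instead of a single opaque join
# operator, expose its internal steps so they can be recombined.

def build(rows, key):
    """Build a hash table mapping key values to the rows that hold them."""
    table = {}
    for row in rows:
        table.setdefault(row[key], []).append(row)
    return table

def probe(table, rows, key):
    """Probe the hash table with each row; yield matching row pairs."""
    for row in rows:
        for match in table.get(row[key], []):
            yield (match, row)

# Composed in the classic way, build + probe form a hash join:
R = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
S = [{"id": 1, "city": "zrh"}, {"id": 1, "city": "bsl"}]
joined = list(probe(build(R, "id"), S, "id"))

# Reused for a graph task, build alone gives adjacency lookup over an
# edge list -- no full relational operator required:
edges = [{"src": "a", "dst": "b"}, {"src": "a", "dst": "c"}]
adj = build(edges, "src")
neighbors = [e["dst"] for e in adj.get("a", [])]
print(neighbors)  # ['b', 'c']
```

The point of the sketch is the granularity argument: once the pieces inside an operator are exposed, the same blocks serve relational, graph, and other workloads, and each block is a natural unit to map onto specialized hardware.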

Ideal candidates for this project should have a solid background in databases and/or new data-processing models and frameworks, be comfortable working on large-scale projects and codebases, and have experience in developing algorithms or data structures that perform well on today's hardware.

There are other smaller-scale (individual) projects available -- please contact me for more details.

There is also the opportunity to define new projects: for example, using Big Data and ML to address problems related to the performance, scheduling, or security of applications running on modern hardware or in cloud environments.