StagedDB/CMP

This long-term project introduces a staged design for high-performance, evolvable DBMS that are easy to tune and maintain. We break the database system into modules and encapsulate them into self-contained stages connected to each other through queues.

With the advent of highly parallel chip multiprocessors, database system designers are called to revisit their designs. We study the performance of commercial database systems on evolving computer architectures, and our results indicate that conventional database designs are inherently limited in such environments. The different approach taken by staged database designs, by contrast, makes them better suited to high performance in the new computing landscape.

Staged Database Systems

[Figure: Staged Database System architecture (system_arch.jpg)]

Our group proposed the use of staging for database systems. According to the Staged Database System design, the previously monolithic database system is decomposed into a set of stages. Each stage has its own queue and thread support. A new query queues up at the first stage, is encapsulated into a “packet”, and passes through the five stages shown at the top of the figure above. A packet carries the query’s “backpack”: its state and private data. Inside the execution engine, a query can issue multiple packets to increase parallelism.
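To make the flow concrete, below is a minimal C++ sketch of the idea, not the actual StagedDB code: each stage owns a queue and a worker thread, and a packet carries its backpack from stage to stage. The Stage and Packet types and the three-stage wiring are illustrative assumptions (the figure shows five stages).

#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// The query's "backpack": its state and private data travel with the packet.
struct Packet {
    int query_id;
    std::string state;
};

// A self-contained stage: its own queue, its own thread, connected to the
// next stage through that queue. This mirrors the staged design in miniature.
class Stage {
public:
    explicit Stage(std::function<void(Packet&)> work)
        : work_(std::move(work)), worker_(&Stage::run, this) {}

    ~Stage() {
        { std::lock_guard<std::mutex> g(m_); done_ = true; }
        cv_.notify_all();
        worker_.join();
    }

    void set_next(Stage* next) { next_ = next; }

    void enqueue(Packet p) {
        { std::lock_guard<std::mutex> g(m_); q_.push(std::move(p)); }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return done_ || !q_.empty(); });
            if (q_.empty()) return;            // shut down once drained
            Packet p = std::move(q_.front());
            q_.pop();
            lk.unlock();
            work_(p);                                 // stage-local processing
            if (next_) next_->enqueue(std::move(p));  // hand off downstream
        }
    }

    std::function<void(Packet&)> work_;
    Stage* next_ = nullptr;
    std::queue<Packet> q_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
    std::thread worker_;  // declared last so it starts after the fields above
};

int main() {
    // Three of the five stages, wired back to front so that destruction
    // (in reverse construction order) drains upstream stages before
    // downstream ones.
    Stage execute([](Packet& p) {
        std::cout << "query " << p.query_id << ":" << p.state << " executed\n";
    });
    Stage optimize([](Packet& p) { p.state += " optimized"; });
    optimize.set_next(&execute);
    Stage parse([](Packet& p) { p.state += " parsed"; });
    parse.set_next(&optimize);

    for (int i = 0; i < 3; ++i)
        parse.enqueue(Packet{i, ""});
}

In the actual system a stage can run multiple worker threads and a scheduler decides which stage's queue to serve next; the sketch keeps one thread per stage for clarity.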

There are multiple research problems associated with this new database system architecture, ranging from optimizing hardware resource usage, to job queueing and scheduling under multiple constraints, to multi-query processing and optimization.

Systems

Shore-MT
Scalable Storage Engine (see https://diaswww.epfl.ch/shore-mt/)

Cordoba
Efficient database query processing on emerging computer architectures using staging

QPipe
Relational execution engines typically treat concurrent queries as independent tasks, evaluating each plan in isolation. Reusing in-memory data pages across different queries is left to the buffer pool manager, which can only set replacement policy and cannot actively participate in query evaluation. Reusing common computation across concurrent queries traditionally comes at the cost of materialized views and assumes prior workload knowledge. The challenge is to exploit all opportunities for reusing both data and computation across concurrent queries transparently, without introducing additional costs or requiring prior knowledge.
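As a deliberately simplified, single-threaded C++ sketch of the kind of reuse QPipe targets, the following shares one physical table scan among all queries attached to it, so each page is fetched once. The SharedScan and Consumer names are ours, not QPipe's API, and the logic that lets late arrivals join a scan already in progress is omitted.

#include <functional>
#include <iostream>
#include <utility>
#include <vector>

using Page = int;  // stand-in for a disk page

// A query attached to the shared scan receives every page via a callback.
struct Consumer {
    int query_id;
    std::function<void(Page)> deliver;
};

class SharedScan {
public:
    explicit SharedScan(std::vector<Page> table) : table_(std::move(table)) {}

    void attach(Consumer c) { consumers_.push_back(std::move(c)); }

    // One pass over the table: every attached query sees every page,
    // but each page is fetched from "disk" exactly once.
    void run() {
        for (Page p : table_)
            for (auto& c : consumers_)
                c.deliver(p);
    }

private:
    std::vector<Page> table_;
    std::vector<Consumer> consumers_;
};

int main() {
    SharedScan scan({10, 20, 30});
    scan.attach({1, [](Page p) { std::cout << "Q1 got page " << p << "\n"; }});
    scan.attach({2, [](Page p) { std::cout << "Q2 got page " << p << "\n"; }});
    scan.run();  // three page fetches serve both queries
}

In QPipe, each relational operator is a stage that detects overlap among in-flight query packets at runtime, which is what makes the sharing transparent to the queries themselves.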

STEPS
When running OLTP workloads, instruction-related delays in the memory subsystem account for 25% to 40% of total execution time. In contrast to data misses, instruction misses cannot be overlapped with out-of-order execution, and instruction caches cannot grow, because their slower access time would directly affect processor speed. The challenge is to alleviate instruction-related delays without increasing the cache size.
We propose STEPS, a technique that minimizes instruction cache misses in OLTP workloads by multiplexing concurrent transactions and exploiting common code paths. One transaction paves the cache with instructions, while close followers enjoy a nearly miss-free execution. STEPS yields up to a 96.7% reduction in instruction cache misses for each additional concurrent transaction and, at the same time, eliminates up to 64% of mispredicted branches by loading a repeating execution pattern into the CPU.
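The scheduling idea can be sketched as follows (simplified C++; the segment granularity, the Txn type, and the segment functions are illustrative assumptions): the common code path is split into cache-sized segments, and execution switches to the next transaction in the team at each segment boundary, so the first transaction loads a segment's instructions and its followers reuse them.

#include <array>
#include <iostream>
#include <vector>

struct Txn { int id; };

// The common transaction code path, split into segments that each fit
// in the L1 instruction cache (the granularity here is illustrative).
void lookup_index(Txn& t) { std::cout << "txn " << t.id << ": index lookup\n"; }
void fetch_record(Txn& t) { std::cout << "txn " << t.id << ": fetch record\n"; }
void apply_update(Txn& t) { std::cout << "txn " << t.id << ": apply update\n"; }

int main() {
    std::vector<Txn> team{{1}, {2}, {3}};
    std::array<void (*)(Txn&), 3> segments{lookup_index, fetch_record,
                                           apply_update};

    // STEPS-style multiplexing: run one segment for every team member
    // before advancing to the next segment, instead of running each
    // transaction to completion. The first call loads the segment's
    // instructions; the remaining calls hit a warm instruction cache.
    for (auto segment : segments)
        for (auto& t : team)
            segment(t);
}

The real system performs these context switches inside the storage manager, where concurrent transactions execute nearly identical code.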

Current Focus

Our work focuses on four directions:

  • Study the performance of database systems running OLTP and DSS workloads on emerging hardware, such as highly parallel chip multiprocessors.

  • Build a staged relational query engine that can effectively manage the available disk bandwidth, RAM, and CPU cycles across multiple concurrent queries, providing a significant performance boost over conventional query engines.

  • Apply the staged database design, coupled with smart scheduling, to Online Transaction Processing (OLTP) engines in order to optimize both instruction and data (processor) cache performance, as well as to improve intra-transaction parallelism in those workloads.

  • Optimize chip multiprocessors for commercial workloads, especially database applications such as online transaction processing (OLTP) and decision support systems (DSS).