Adaptive Lower-level Driven Compaction to Optimize LSM-Tree Key-Value Stores [TKDE '20]

Authors: Yunpeng Chai, Yanfeng Chai, Xin Wang, Haocheng Wei, Yangyang Wang

Publication Date: 2020/8/25

Abstract:

Log-structured merge (LSM) tree key-value (KV) stores have been widely deployed in many NoSQL and SQL systems, serving online big data applications such as social networking, graph processing, machine learning, etc. The batch processing of sorted data merging (i.e., compaction) in LSM-tree key-value stores improves the write efficiency, and some lazy compaction methods have been proposed to accumulate more data within a batch. However, these batched writing methods lead to significant tail latency, which is unacceptable for online processing. Aiming to optimize both latency and throughput, we propose a novel Lower-level Driven Compaction (LDC) method which breaks the limitations of the traditional upper-level driven compaction manner and triggers practical compaction actions bottom-up, with the benefits of both decreasing the compaction granularity for smaller latency and reducing write amplification for higher throughput. Furthermore, we extend LDC to Adaptive LDC (ALDC) by adding an adaptive policy to adjust the key compaction threshold to fit the changes of workloads' features. The experimental results indicate that ALDC reduces the tail latency significantly and meanwhile achieves a much higher and stable throughput compared with existing approaches.

Published in: IEEE Transactions on Knowledge and Data Engineering ( Early Access )

DOI: 10.1109/TKDE.2020.3019264