BitMatcher: Bit-level Counter Adjustment for Sketches (IEEE ICDE, 2024)

  • Authors: Qilong Shi, Chengjun Jia, Wenjun Li*, Zaoxing Liu, Tong Yang, Jianan Ji, Gaogang Xie, Weizhe Zhang, and Minlan Yu
  • Qilong Shi and Chengjun Jia contributed equally to this paper, and they conducted this work under the guidance of Wenjun Li (Corresponding author) and Tong Yang.
  • Wenjun Li completed this work when he conducted his postdoctoral research at Harvard University, working with Minlan Yu and Zaoxing Liu.

Abstract

Sketch has been widely used in the field of large-scale data stream processing. However, common fixed-counter algorithms such as Count-Min Sketch have to allocate larger counters, which wastes a lot of memory due to the high skewness of real-world data streams. To reduce memory usage, we propose to dynamically adjust the counter size that matches the distribution of the data stream. We introduce BitMatcher, a fast global-adjusting algorithm that automatically adjusts the counter to the optimal size to match the data stream. During stream processing, BitMatcher identifies items hashed into a bucket based on isolated fingerprints. If it overflows, BitMatcher changes the flag bits in the bucket and dynamically increases or shrinks the size of some counters in a fine-grained manner. BitMatcher can also relocate a cold item in the bucket with the idea of cuckoo hashing to preserve the potential hot item while achieving global load balancing. Through the above way of dealing with overflow caused by skewed data, BitMatcher precisely manipulates allocated bits and maximizes the use of memory. The experiments show that BitMatcher has high throughput and can outperform SOTA by up to 4 orders of magnitude in terms of accuracy. We also deployed BitMatcher on several platforms, showing its good software and hardware scalability.

Downloads

News

2024.6.1: The conference slides of BitMatcher has been uploaded.

2023.12.1: BitMatcher has been accepted by IEEE ICDE 2024.

2023.6.1: The source code of BitMatcher has also been uploaded at GitHub (https://github.com/wenjunpaper/BitMatcher).

2023.6.1: The website of BitMatcher is under construction and will be finished soon.

Contact

E-mail: wenjunli@pku.org.cn, wenjun0816@qq.com.