Over the past ten years, I have worked on OLAP databases and made significant contributions to the open-source communities of Apache Kylin, Apache Doris, and StarRocks.
OLAP Database Performance Tuning Guide
Fundamentals of Performance Optimization
Database Principles
Performance Optimization Methodology
Performance Optimization Tools
Performance Optimization Case Studies
StarRocks
My team and I built an extremely fast StarRocks query engine, including:
StarRocks Vectorized Execution Engine
StarRocks Cost-based Query Optimizer
StarRocks Pipeline Execution Engine
Global Low Cardinality Dictionary Optimization
Apache Doris
Built Meituan's Doris OLAP analysis platform from scratch.
Implemented Colocate Join, improving performance by 3 to 6 times.
Completed multi-instance parallel optimization, improving query performance by 3 to 5 times.
Developed precise count distinct and user behavior analysis features based on Bitmap technology, improving the performance of precise count distinct queries in Doris by orders of magnitude.
Contributed over 100 commits to Apache Doris.
Apache Kylin
Implemented a distributed, highly available Kylin Job Server.
Improved the performance of Kylin's ultra-high-cardinality column calculations in MapReduce jobs by 10 times.
Achieved a 100-fold improvement in the performance of Kylin's precise count distinct queries without rollup.
Optimized memory usage for building the Kylin global dictionary, increasing the concurrency of the Kylin Job Server by two orders of magnitude.
Completed the Kylin on Druid project, which improved Kylin's query performance several times over.