THESIS
2019
Abstract
Select-Project-Join (SPJ) queries are essential building blocks of general queries.
Efficiently estimating their output sizes critically affects how well Cost-Based Optimizers (CBOs) can generate optimal query plans. Despite a rich literature on selection and join size estimation techniques, estimating the result size of a distinct projection remains an open problem when arbitrary filter and join conditions are present.
In this thesis, we provide an efficient online aggregation algorithm for accurately estimating the result size of SPJ queries, or equivalently, the distinct count. By continuously sampling paths from the join, our algorithm quickly converges to the exact value. Comprehensive experiments show that the new algorithm outperforms existing ones by orders of magnitude.
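To make the estimated quantity concrete, the following brute-force sketch computes the exact distinct count of a filtered, projected join on two toy tables. The tables, column names, and the function `spj_distinct_count` are hypothetical and are not taken from the thesis. This is the exhaustive baseline that an estimator would approximate, not the sampling algorithm proposed here.

```python
from itertools import product

# Hypothetical toy tables (illustrative only, not from the thesis).
customers = [
    {"cid": 1, "city": "Oslo"},
    {"cid": 2, "city": "Oslo"},
    {"cid": 3, "city": "Lima"},
]
orders = [
    {"oid": 10, "cid": 1, "amount": 50},
    {"oid": 11, "cid": 1, "amount": 200},
    {"oid": 12, "cid": 2, "amount": 300},
    {"oid": 13, "cid": 3, "amount": 20},
]


def spj_distinct_count(left, right, join_key, predicate, project):
    """Exact result size of a distinct projection over a filtered equi-join.

    Enumerates every pair of rows, so it is only usable on tiny inputs.
    """
    seen = set()
    for lrow, rrow in product(left, right):
        if lrow[join_key] == rrow[join_key] and predicate(lrow, rrow):
            seen.add(project(lrow, rrow))
    return len(seen)


# SELECT DISTINCT city FROM customers JOIN orders USING (cid) WHERE amount >= 100
filtered_count = spj_distinct_count(
    customers, orders, "cid",
    predicate=lambda c, o: o["amount"] >= 100,
    project=lambda c, o: c["city"],
)

# Same query without the filter.
unfiltered_count = spj_distinct_count(
    customers, orders, "cid",
    predicate=lambda c, o: True,
    project=lambda c, o: c["city"],
)
```

In this toy data the filter keeps two join rows, but both project to the same city, so the distinct count is smaller than the join size. That gap is why join size estimates alone do not give the size of a distinct projection.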