Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Version: 1.38.0
Description
Adds support for the basic DPhyp join reorder algorithm.
For example:
SELECT i_item_id
FROM store_sales, customer_demographics, date_dim, item, promotion
WHERE ss_sold_date_sk = d_date_sk
  AND ss_item_sk = i_item_sk
  AND ss_cdemo_sk = cd_demo_sk
  AND ss_promo_sk = p_promo_sk
The plan tree after pushing down the filter:
LogicalProject(i_item_id=[$61])
  LogicalJoin(condition=[=($7, $82)], joinType=[inner])
    LogicalJoin(condition=[=($1, $60)], joinType=[inner])
      LogicalJoin(condition=[=($22, $32)], joinType=[inner])
        LogicalJoin(condition=[=($3, $23)], joinType=[inner])
          LogicalTableScan(table=[[tpcds, store_sales]])
          LogicalTableScan(table=[[tpcds, customer_demographics]])
        LogicalTableScan(table=[[tpcds, date_dim]])
      LogicalTableScan(table=[[tpcds, item]])
    LogicalTableScan(table=[[tpcds, promotion]])
The joins are then converted into a single HyperGraph:
LogicalProject(i_item_id=[$61])
  HyperGraph(edges=[{0}——INNER——{1},{0}——INNER——{2},{0}——INNER——{3},{0}——INNER——{4}])
    LogicalTableScan(table=[[tpcds, store_sales]])
    LogicalTableScan(table=[[tpcds, customer_demographics]])
    LogicalTableScan(table=[[tpcds, date_dim]])
    LogicalTableScan(table=[[tpcds, item]])
    LogicalTableScan(table=[[tpcds, promotion]])
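As a rough illustration of how the four join conditions collapse into the four edges shown in the HyperGraph node, here is a minimal Python sketch. It is not Calcite's implementation: the names `OFFSETS`, `CONDITIONS`, `input_of`, and `build_edges` are hypothetical, and the column offsets are simply read off the condition indices in the filter-pushdown plan (`$23`, `$32`, `$60`, and `$82` being the first column of each dimension table).

```python
from bisect import bisect_right

# Assumed start offset of each input's columns in the joined row, taken from the plan:
# 0 store_sales, 1 customer_demographics, 2 date_dim, 3 item, 4 promotion.
OFFSETS = [0, 23, 32, 60, 82]

# The equi-join conditions of the plan as (left column, right column) pairs.
CONDITIONS = [(3, 23), (22, 32), (1, 60), (7, 82)]


def input_of(column):
    """Return the index of the input that owns a global column index."""
    return bisect_right(OFFSETS, column) - 1


def build_edges(conditions):
    """Turn each condition into an edge between the sets of inputs it references."""
    edges = []
    for left_col, right_col in conditions:
        left = frozenset({input_of(left_col)})
        right = frozenset({input_of(right_col)})
        edges.append((left, right))
    return edges


EDGES = build_edges(CONDITIONS)

if __name__ == "__main__":
    for left, right in EDGES:
        print(sorted(left), "INNER", sorted(right))
```

Every condition references one column of store_sales (input 0) and one column of a dimension table, so each edge connects {0} to a single other input, which gives the star shape seen in the plan. A condition that referenced columns from more than two inputs would produce an endpoint set with more than one member, which is what makes the structure a hypergraph rather than a plain graph.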
After DPhyp join reorder (with field trimming and Project pushdown), the plan is:
LogicalProject(i_item_id=[$1])
  LogicalJoin(condition=[=($0, $2)], joinType=[inner])
    LogicalProject(ss_cdemo_sk=[$0], i_item_id=[$2])
      LogicalJoin(condition=[=($1, $3)], joinType=[inner])
        LogicalProject(ss_cdemo_sk=[$1], ss_sold_date_sk=[$2], i_item_id=[$4])
          LogicalJoin(condition=[=($0, $3)], joinType=[inner])
            LogicalProject(ss_item_sk=[$0], ss_cdemo_sk=[$1], ss_sold_date_sk=[$3])
              LogicalJoin(condition=[=($2, $4)], joinType=[inner])
                LogicalProject(ss_item_sk=[$1], ss_cdemo_sk=[$3], ss_promo_sk=[$7], ss_sold_date_sk=[$22])
                  LogicalTableScan(table=[[tpcds, store_sales]])
                LogicalProject(p_promo_sk=[$0])
                  LogicalTableScan(table=[[tpcds, promotion]])
            LogicalProject(i_item_sk=[$0], i_item_id=[$1])
              LogicalTableScan(table=[[tpcds, item]])
        LogicalProject(d_date_sk=[$0])
          LogicalTableScan(table=[[tpcds, date_dim]])
    LogicalProject(cd_demo_sk=[$0])
      LogicalTableScan(table=[[tpcds, customer_demographics]])
The main enumeration process of DPhyp will be implemented in the PR. However, for now it can only process inner joins, and hypergraph simplification has not yet been implemented.
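To give a feel for what the enumeration has to produce, the sketch below lists every pair of connected, disjoint input sets that are linked by an edge, for the star-shaped example in this issue. It is only an illustration under stated assumptions: it walks all subsets by brute force instead of using the neighborhood-driven enumeration that DPhyp itself performs, it handles inner-join edges only, and the names `EDGES`, `joined_by_edge`, and `enumerate_pairs` are hypothetical rather than taken from the PR.

```python
# Edges of the star-shaped example as bitmask pairs: input i is bit 1 << i,
# so the edge {0}-{1} becomes (0b00001, 0b00010).
N = 5
EDGES = [(1 << 0, 1 << i) for i in range(1, N)]


def joined_by_edge(left, right, edges=EDGES):
    """True if some edge has one endpoint set inside `left` and the other inside `right`."""
    return any(
        (u & left == u and v & right == v) or (u & right == u and v & left == v)
        for u, v in edges
    )


def enumerate_pairs(n=N, edges=EDGES):
    """Yield (left, right) bitmask pairs, each unordered pair exactly once.

    A set counts as connected if it is a single input or was produced by
    joining two connected sets across an edge. Masks are visited in numeric
    order, so both halves of a pair have always been classified earlier.
    """
    connected = {1 << i for i in range(n)}
    for mask in range(1, 1 << n):
        sub = (mask - 1) & mask  # largest proper subset of mask
        while sub:
            other = mask ^ sub
            if (
                sub < other
                and sub in connected
                and other in connected
                and joined_by_edge(sub, other, edges)
            ):
                connected.add(mask)
                yield sub, other
            sub = (sub - 1) & mask  # next smaller subset of mask


if __name__ == "__main__":
    pairs = list(enumerate_pairs())
    print(len(pairs), "pairs")
    for left, right in pairs[:4]:
        print(format(left, "05b"), format(right, "05b"))
```

For the five-input star this yields 32 pairs, and in each of them one side is a single dimension table, because two dimension tables are never directly connected. A dynamic-programming join orderer would cost each pair as it is emitted and keep the cheapest plan per input set; the point of DPhyp is to reach exactly these pairs without testing every subset the way this sketch does.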
Issue Links
Blocked
- CALCITE-7029 Support DPhyp to handle various join types (Resolved)
is related to
- CALCITE-7092 DPhyp implementation assertion failure (Resolved)
- CALCITE-7093 DPhyp algorithm should accept cost model as parameter (Open)