Skip to content

[CALCITE-3627] New Row Null Policy #1971

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

talatuyarer
Copy link

As describe under CALCITE-3627 ticket. Current row policy is wrong. When ROW has a null column. Calcite returns null for ROW value. I am working on Apache Beam SQL Row support. Because of ANY policy RexToLixTranslator generate wrong code. This is my first pr for calcite project. Thanks for your comments in advance.

hsyuan and others added 3 commits May 10, 2020 16:54
1. Top-down trait request
2. Bottom-up trait derivation
3. Trait enforcement without AbstractConverter

How to use?

1. Enable top-down optimization by setting {VolcanoPlanner#setTopDownOpt(boolean)}
or add 'calcite.planner.topdown.opt=true' to saffron.properties config file.

2. Let your convention's rel interface extends {PhysicalNode}, see
{EnumerableRel} as an example.

3. Each physical operator overrides any one of the two methods:
{PhysicalNode#passThrough(RelTraitSet)} or
{PhysicalNode#passThroughTraits(RelTraitSet)} depending on your needs.

4. Choose derive mode for each physical operator by overriding
{PhysicalNode#getDeriveMode()}.

5. If the derive mode is {DeriveMode#OMAKASE}, override method
{PhysicalNode#derive(List)} in the physical operator, otherwise, override
{PhysicalNode#derive(RelTraitSet, int)} or
{PhysicalNode#deriveTraits(RelTraitSet, int)}.

6. Mark your enforcer operator by overriding {RelNode#isEnforcer()}, see
{Sort#isEnforcer()} as an example. This is important, because it can help
{VolcanoPlanner} avoid unnecessary trait propagation and derivation, therefore
improve optimization efficiency.

7. Implement {Convention#enforce(RelNode, RelTraitSet)} in your convention,
which generates appropriate physical enforcer. See
{EnumerableConvention#enforce(RelNode, RelTraitSet)} as example. Simply return
null if you don't want physical trait enforcement.

How does it work?

Let S# denote the seed physical operator in a RelSet after logical and physical
rules transformation, P# denote the physical operator generated by passing down
parent trait requirements, D# denote the physical operator generated by
deriving from child delivered traitSets.

The initial rel list state in a RelSet is as follows:
    cursor
      |
      V
     S1, S2

When we create a task for RelSubset1, the task will immediately pass the
subset's traitSet to seed operators, S1 and S2, now we have:
    cursor
      |
      V
     S1, S2, P1, P2

The subset task will create a optimization task for the relnode pointed by
cursor, and move cursor to next available physical operator S2. In the task for
S1, it will continue optimize its child nodes, which are RelSubsets. After
child inputs optimization is finished, S1 will derive new relnodes from
delivered subsets in input RelSet. Once task for S1 is completed, we have:
        cursor
          |
          V
     S1, S2, P1, P2, D1

The subset task continues scheduling task for S2, P1... until there is no more
relnode created for the RelSet, then we have:
                                cursor
                                  |
                                  V
     S1, S2, P1, P2, D1, D2, D3, null

When a task for another RelSubset2 is created, the task will try to pass down
the subset's traitSet to seed operator S1 and S2, now the RelSet looks like:
                                cursor
                                  |
                                  V
     S1, S2, P1, P2, D1, D2, D3, P3, P4

The process continues till there is no more subsets or relnodes created for the
RelSet.

See https://lists.apache.org/thread.html/r38ea71968c069f465921e7197488329c15413b46831c90ad4d48f87e%40%3Cdev.calcite.apache.org%3E

Close apache#1953
@DonnyZone
Copy link
Contributor

Thanks for pushing this work forward. I ever tried (but failed) to find some similar functions in other DB products. This makes me confused about whether we should treat [null, null, .., null] as null.
That is, to introduce NullPolicy.ALL or use NullPolicy.NONE? At least, current NullPolicy.ANY is wrong.
Personally, I'm inclined to introduce NullPolicy.ALL, as it follows SQL standard.
Also cc @rubenada @zabetak

@@ -25,6 +25,10 @@
* ANY whenever possible.</p>
*/
public enum NullPolicy {

/** Returns null if and only if all of the arguments are null;
* If all of the arguments are false return false otherwise true. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all of the arguments are false return false otherwise true.

Do we support this? I think the policy is just for null value.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we also support that. in # [CALCITE-3482] Equality of nested ROWs returns false for identical literal value test cases we are testing that. In those test cases we check ROW comparison.

@@ -1146,7 +1147,10 @@ private static Expression implementNullSemantics(
list.add(
translator.translate(
operand.e, NullAs.IS_NULL));
translator = translator.setNullable(operand.e, false);
//If column is not ROW
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whitspace after //

@@ -83,4 +83,81 @@ where t.struct = ROW(2, ROW(3,4));

!ok

# [CALCITE-3627] Null check if all fields of ROW are null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we combine these tests into one?

select ROW(...),
         ROW(...), ... , 
         ROW(...)

// Condition for regular cases: v0 == null || v1 == null
Expression condition = Expressions.foldOr(list);
if (nullPolicy == NullPolicy.ALL) {
// if some operands are not nullable ROW can not be null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To unify the comment style, capitalize if as If.

@DonnyZone
Copy link
Contributor

Hi @talatuyarer, could you please update the commit message as the format described in document?

Copy link
Author

@talatuyarer talatuyarer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @DonnyZone to review my pr. I am working on to unify testcases i will update my pr soon.

@@ -25,6 +25,10 @@
* ANY whenever possible.</p>
*/
public enum NullPolicy {

/** Returns null if and only if all of the arguments are null;
* If all of the arguments are false return false otherwise true. */
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we also support that. in # [CALCITE-3482] Equality of nested ROWs returns false for identical literal value test cases we are testing that. In those test cases we check ROW comparison.

ILuffZhe and others added 5 commits December 27, 2021 00:58
Use fixed docker image to avoid spurious changes during the website
generaion.

Close apache#2663
- Introduce new ProxyingMetadataHandlerProvider that avoids code compilation
- Update RelMetadataTest to factor out metadata handling from tests
- Introduce tests for both Janino and the proxying/simplistic path.
- A simple microbenchmark for metadata
- Minor version bump to JMH
- Enable clearing of static cache within JaninoRelMetadataProvider
- Update RelMetadataTest to be parameterized
- Add test to confirm cyclic exception
…tuyarer/calcite into CALCITE-3627-row-nullpolicy

� Conflicts:
�	core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableConvention.java
�	core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableMergeJoin.java
�	core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableRel.java
�	core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableSortedAggregate.java
�	core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableSortedAggregateRule.java
�	core/src/main/java/org/apache/calcite/adapter/enumerable/NullPolicy.java
�	core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java
�	core/src/main/java/org/apache/calcite/config/CalciteSystemProperty.java
�	core/src/main/java/org/apache/calcite/plan/Convention.java
�	core/src/main/java/org/apache/calcite/plan/RelTraitSet.java
�	core/src/main/java/org/apache/calcite/plan/volcano/RelSet.java
�	core/src/main/java/org/apache/calcite/plan/volcano/RelSubset.java
�	core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java
�	core/src/main/java/org/apache/calcite/rel/PhysicalNode.java
�	core/src/main/java/org/apache/calcite/rel/RelCollations.java
�	core/src/test/java/org/apache/calcite/test/RelOptTestBase.java
�	core/src/test/java/org/apache/calcite/test/TopDownOptTest.java
�	core/src/test/resources/org/apache/calcite/test/TopDownOptTest.xml
�	core/src/test/resources/saffron.properties
�	core/src/test/resources/sql/struct.iq
�	plus/src/test/java/org/apache/calcite/adapter/tpcds/TpcdsTest.java
@talatuyarer
Copy link
Author

I updated my pr but looks like it does not take updated version. I will create new pr for this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants