-
Notifications
You must be signed in to change notification settings - Fork 2.4k
[CALCITE-6435] SqlToRel conversion of IN expressions may lead to incorrect simplifications #3835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b1d979c
to
84069b2
Compare
|
||
if (prevLiteral != null | ||
&& literal.getType().equals(prevLiteral.getType()) | ||
&& !literal.equals(prevLiteral)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so two literals can be equal but their types can be different?
one could argue that this is a bug in SqlLiteral::equals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the situation was caused by two literals which had the same value; but different types (VARCHAR
; CHAR(n)
)
actually the predicate based simplification also does this (beyond other things); but there are situations in which its disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think VARCHAR
and CHAR
should be considered equivalent if RexLiteral.value
is the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CHAR(N) indicates that the literal has to be padded with spaces, whereas VARCHAR indicates that trailing spaces are to be ignored. So two literals with identical strings but different types can have different values.
@@ -1030,7 +1030,7 @@ private void checkGroupBySingleSortLimit(boolean approx) { | |||
+ "intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], " | |||
+ "filter=[AND(" | |||
+ "=($3, 'High Top Dried Mushrooms'), " | |||
+ "SEARCH($87, Sarg['Q2', 'Q3']:CHAR(2)), " | |||
+ "SEARCH($87, Sarg['Q2':VARCHAR, 'Q3':VARCHAR]:VARCHAR), " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this change surprising.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was surprised too ; however this is the same kind of expressions SqlToRel
would produce if the input would have been a set of comparisison ; so the IN
rewrite now matches that
core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java
Outdated
Show resolved
Hide resolved
84069b2
to
f08ddbf
Compare
Let's wait a bit to see whether anyone else has comments. |
…rrect simplifications Conversion path for comparisions generated from IN expressions was handling types differently. This may have lead to some over-simplification in some cases. Altered the conversion to do the full SqlToRex conversion steps for these generated nodes as well. Added an extra safeguard check to RexSimplify to prevent the bug from being triggered.
f08ddbf
to
0a4a974
Compare
|
I think it can be simpler to modify the judgment of
where |
I don't see any such method in I would be kinda against of the introduction of such method because it would place custom type comparision logic inside a Rex implementation... I might have not mentioned this here...but |
@@ -1753,7 +1756,7 @@ private <C extends Comparable<C>> RexNode simplifyAnd2ForUnknownAsFalse( | |||
if (literal2 == null) { | |||
continue; | |||
} | |||
if (!literal1.equals(literal2)) { | |||
if (literal1.getType().equals(literal2.getType()) && !literal1.equals(literal2)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm puzzled about the necessity of adding extra type equals checks. What specific issues might arise without them in SQL simplification?
// TODO: remove requireNonNull when checkerframework issue resolved | ||
ensureSqlType(requireNonNull(pair.left, "pair.left").getType(), | ||
bb.convertExpression(pair.right))))); | ||
RexUtil.composeConjunction( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the focus solely on the SQL IN case, or might there be a need for adjustments elsewhere too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really; this conjunction
is there for complex expressions; which will be there for NOT IN
as well:
(a,b) NOT IN ((1,2), (3,4))
NOT ((a=1 && b=2) || (a=3 && b=4) )
Currently, there is no It seems two distinct approaches. This PR handles There is no problem based on the approach of this PR. Only two small points that need to be confirmed:
|
the issue arised from the fact that
not mandatory for the current bug ; as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Conversion path for comparisions generated from IN expressions was handling types differently. This may have lead to some over-simplification in some cases.
Altered the conversion to do the full SqlToRex conversion steps for these generated nodes as well. Added an extra safeguard check to RexSimplify to prevent the bug from being triggered.