-
Notifications
You must be signed in to change notification settings - Fork 2.4k
[CALCITE-5997] OFFSET operator is incorrectly unparsed #3423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ce38d81
to
a910c17
Compare
@tanclary I think you have added the "offset" field, so perhaps you want to review this PR. Thank you. |
@@ -78,7 +78,13 @@ public SqlItemOperator(String name, | |||
SqlWriter writer, SqlCall call, int leftPrec, int rightPrec) { | |||
call.operand(0).unparse(writer, leftPrec, 0); | |||
final SqlWriter.Frame frame = writer.startList("[", "]"); | |||
call.operand(1).unparse(writer, 0, 0); | |||
if (offset == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to add something similar for the other operators
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
safe_offset, safe_ordinal etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, Calcite currently treats a simple array index as a SAFE_ORDINAL index, as the comment on SqlOperator ITEM says:
/** The SQL standard calls the ARRAY variant a
* <array element reference>. Index is 1-based. The standard says
* to raise "data exception - array element error" but we currently return
* null. */
public static final SqlOperator ITEM =
new SqlItemOperator("ITEM", OperandTypes.ARRAY_OR_MAP, 1, true);
The last true
argument is for the safe
parameter.
If we take this approach, a raw index is unparsed as a SAFE_ORDINAL, reflecting the implementation semantics.
If you want to preserve the original syntax we need to carry additional information in the SqlItemOperator.
What is the right solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, should the following test pass?
@Test void testOffset() {
sql("SELECT ARRAY[2,4,6][2]")
.ok("SELECT (ARRAY[2, 4, 6])[SAFE_ORDINAL(2)]");
sql("SELECT ARRAY[2,4,6][OFFSET(2)]")
.ok("SELECT (ARRAY[2, 4, 6])[OFFSET(2)]");
sql("SELECT ARRAY[2,4,6][SAFE_OFFSET(2)]")
.ok("SELECT (ARRAY[2, 4, 6])[SAFE_OFFSET(2)]");
sql("SELECT ARRAY[2,4,6][SAFE_ORDINAL(2)]")
.ok("SELECT (ARRAY[2, 4, 6])[SAFE_ORDINAL(2)]");
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would imagine you do not want to unparse with SAFE_ORDINAL unless you are using BigQuery as it is dialect-specific syntax (to my knowledge). The comment you mentioned- it seems like a good time to address that as well if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BigQuerySqlDialect already handles this correctly.
The bug I am fixing was caused by SqlOperatorTest, where tests look like this:
final SqlOperatorFixture f0 = fixture();
f0.setFor(SqlLibraryOperators.SAFE_ORDINAL);
final SqlOperatorFixture f = f0.withLibrary(SqlLibrary.BIG_QUERY);
f.checkScalar("ARRAY[2,4,6][SAFE_ORDINAL(3)]", "6", "INTEGER");
Notice that this test only sets up the library, but not the dialect. In a new test fixture that I am writing, which unparses and reparses the query before executing, this test fails because the query is not unparsed correctly. In this case the unparsing does not go through the dialect.
I don't see a way to set the dialect in the SqlOperatorFixture. Do you have a suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, the SQL parser will happily parse the SAFE_* functions without the big query dialect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you can create the fixture with a specified parser config (see withParserConfig()
) which I believe certain dialects like BigQuery have default parser configs. Perhaps you could pull on this thread a bit, the functionality may already be there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but I still need to unparse such a query when the dialect is NOT specified.
The parser accepts SAFE_ORDINAL, so the unparser should too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see what you're saying. Yeah I believe when I originally added this cluster of functions that I essentially pair the OFFSET operator with ITEM because at the time they seemed identical. Admittedly I did not give enough thought to the unparsing so maybe treating them more independently would help also, given that that approach seems to work for most other BQ-specific operators.
I hope I have found an adequate solution. |
|
||
public SqlItemOperator(String name, | ||
SqlSingleOperandTypeChecker operandTypeChecker, | ||
int offset, boolean safe) { | ||
int offset, boolean safe, boolean simple) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think String funcId
is more valuable than boolean simple
.
// Technically these functions are extensions to standard SQL, | ||
// however, the standard SQL parser accepts them, | ||
// so there must be a way to unparse them. | ||
String function = safe ? "SAFE_OFFSET" : "OFFSET"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have String funcId
, we can write the code
if (funcId != null) {
final SqlWriter.Frame offsetFrame = writer.startFunCall(funcId);
call.operand(1).unparse(writer, 0, 0);
writer.endFunCall(offsetFrame);
}
else {
call.operand(1).unparse(writer, 0, 0);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much better, thank you.
11cae49
to
d582fec
Compare
@@ -725,6 +725,37 @@ protected static SortedSet<String> keywords(@Nullable String dialect) { | |||
+ ".*"); | |||
} | |||
|
|||
/** Test case for <a href="https://issues.apache.org/jira/browse/CALCITE-5997">[CALCITE-5997]</a> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d582fec
to
e7a7231
Compare
I think the current PR looks good. @tanclary, Do you have any suggestions? |
@JiajunBernoulli @mihaibudiu No, I think this looks good. I appreciate you opening the bug/fix. If you squash down to one commit I would be happy to merge. |
Signed-off-by: Mihai Budiu <[email protected]>
e7a7231
to
d9e9ecb
Compare
done |
Kudos, SonarCloud Quality Gate passed! |
No description provided.