[CALCITE-6283] Function ARRAY_APPEND with a NULL array argument crashes with NullPointerException #3706
Conversation
f.checkNull("array_append(null, 2)");
f.checkNull("array_prepend(null, 2)");
f.checkNull("array_remove(null, 2)");
f.checkNull("array_contains(null, 2)");
Do we have tests for appending null elements
@tanclary Yes, similar to:
f.checkNull("array_append(cast(null as integer array), 1)");
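The distinction the thread is drawing can be illustrated outside Calcite. Below is a minimal, hypothetical Java sketch (not Calcite code; the name `arrayAppend` is illustrative) of the two separate null cases: a NULL *array* argument makes the whole result NULL, while a NULL *element* is simply appended.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArrayAppendSketch {
    // Returns null if the array itself is null; otherwise appends the
    // element (which may itself be null) and returns a new list.
    static List<Integer> arrayAppend(List<Integer> array, Integer element) {
        if (array == null) {
            return null;                 // NULL array -> NULL result
        }
        List<Integer> result = new ArrayList<>(array);
        result.add(element);             // null elements are allowed
        return result;
    }

    public static void main(String[] args) {
        System.out.println(arrayAppend(null, 1));                    // null
        System.out.println(arrayAppend(Arrays.asList(1, 2), null));  // [1, 2, null]
    }
}
```

Both tests above exist because these are genuinely different code paths: the first case is about the function's null policy, the second about element nullability.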
I meant a test where the element is null, not the array. But I found that we do, so that's good.
I tested this in Spark 3.4. It seems Spark reports an error when the first parameter of array_append is null. Why do we need to change it like this?
I don't understand your question
What I mean is that Spark 3.4 does not support array_append(null, 1); it reports an error. Currently, array_append and array_prepend are adapted for Spark in Calcite. If Spark does not support array_append(null, 1), why do we need to return null? I am not sure whether higher versions of Spark support it.
I don't know if my opinion is useful; if it is problematic, please ignore it. Thank you.
Ah. Yes I agree with you. What do you think @mihaibudiu ?
I have changed the type checker to reject NULL literals for arrays.
However, it seems that Spark is inconsistent: it allows NULL values for arrays at runtime, but not at compile time. This is probably because Spark uses a different type checking/inference algorithm.
Spark SQL, after generating the unresolved logical plan, runs a series of analyzer rules to turn it into the resolved logical plan; some type conversion issues are solved there, and the type handling of array_append(null, 2) seems to happen in that phase.
core/src/main/java/org/apache/calcite/sql/type/ArrayElementOperandTypeChecker.java
Please also see https://issues.apache.org/jira/browse/CALCITE-5893, which adds support for array_except/array_intersect. Those functions should also reject NULL literals; currently Calcite's Spark library just returns null for them.
FYI: Some background is that many of Calcite's Spark-related array functions were contributed by one contributor. The basic strategy was to turn on NullPolicy, which means that NULL literals, both bare NULL and cast(null as xxx), make the call return null directly. But in fact Spark throws an exception for the former. We need to address the former case as this PR does.
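The NullPolicy strategy described above can be sketched outside Calcite. This is a minimal illustration of the assumed semantics (the names `strictNulls` and `NullPolicySketch` are hypothetical, not Calcite's actual NullPolicy machinery): a "strict" policy wraps a function so that any NULL operand short-circuits to a null result instead of evaluating the function.

```java
import java.util.function.BinaryOperator;

public class NullPolicySketch {
    // Wraps a binary operator so that a null in either operand yields null,
    // without ever invoking the wrapped operator.
    static <T> BinaryOperator<T> strictNulls(BinaryOperator<T> op) {
        return (a, b) -> (a == null || b == null) ? null : op.apply(a, b);
    }

    public static void main(String[] args) {
        BinaryOperator<Integer> plus = strictNulls(Integer::sum);
        System.out.println(plus.apply(1, 2));    // 3
        System.out.println(plus.apply(null, 2)); // null
    }
}
```

The point of the thread is that this runtime policy is too coarse for a bare NULL literal: Spark rejects that case at analysis time, so Calcite's type checker has to reject it too rather than letting the null policy return null.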
The situation may be even more complicated if one considers cases where the compiler needs to insert implicit casts. I don't have a Spark engine handy to test what Spark does, but if the array element type and the type of the added element differ, Spark may insert implicit casts differently than Calcite does. Another fact I don't have a good handle on is the nullability of array elements: https://issues.apache.org/jira/browse/CALCITE-6275
ARRAY_APPEND needs to be capitalized in the commit name/JIRA name/PR name.
Makes sense; I just read it wrong. Your PR only handles the situation where the array argument is NULL.
There are already tests for this case, e.g.:
That's great, we should merge that PR if it solves the problem. I will try to review it.
I will update these when I squash the commits.
If the first (array) argument is NULL, then it is an error for all these array functions; there is no doubt about this. But if the second argument is NULL, the semantics are different. Yes, I have fixed it.
Let's merge [CALCITE-5893] first and I will rebase this one.
f.checkFails("^array_append(null, 2)^", expected, false);
f.checkFails("^array_prepend(null, 2)^", expected, false);
f.checkFails("^array_remove(null, 2)^", expected, false);
f.checkFails("^array_contains(null, 2)^", expected, false);
f.checkFails("^array_position(null, 2)^", expected, false);
It seems there is a similar issue for array_min/array_max. Should we fix it within this issue or create a separate one?
Since I got started, I will add these too.
In my last commit I have fixed array_min and array_max as well.
public ArrayElementOperandTypeChecker(boolean arrayMayBeNull, boolean elementMayBeNull) {
  this.arrayMayBeNull = arrayMayBeNull;
  this.elementMayBeNull = elementMayBeNull;
}

//~ Methods ----------------------------------------------------------------

@Override public boolean checkOperandTypes(
    SqlCallBinding callBinding,
    boolean throwOnFailure) {
What if the array is not the first argument, e.g. array_xxx(1, array, 2)? It seems that this type checker only works when the array comes first.
But I think if that happens, we can add another type checker or refactor this one, e.g. by passing in an index indicating the position of the array parameter.
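The refactoring suggested above can be sketched as follows. This is a hypothetical illustration (the names `ArrayOperandChecker`, `check`, and the boolean-array stand-ins for `SqlCallBinding` queries are all invented for the sketch, not Calcite API): the checker is told which operand position holds the array instead of assuming position 0.

```java
public class ArrayOperandChecker {
    private final int arrayIndex;          // position of the array operand
    private final boolean arrayMayBeNull;  // whether a NULL array is accepted

    ArrayOperandChecker(int arrayIndex, boolean arrayMayBeNull) {
        this.arrayIndex = arrayIndex;
        this.arrayMayBeNull = arrayMayBeNull;
    }

    // Simplified stand-in for checkOperandTypes: the two arrays abstract over
    // the per-operand queries the real checker would make on SqlCallBinding.
    boolean check(boolean[] operandIsArray, boolean[] operandIsNull) {
        if (operandIsNull[arrayIndex]) {
            return arrayMayBeNull;         // reject a NULL array unless allowed
        }
        return operandIsArray[arrayIndex]; // operand at arrayIndex must be an array
    }

    public static void main(String[] args) {
        // array_xxx(1, array, 2): the array operand is at index 1.
        ArrayOperandChecker checker = new ArrayOperandChecker(1, false);
        System.out.println(checker.check(
            new boolean[]{false, true, false},
            new boolean[]{false, false, false}));   // true: array where expected
        System.out.println(checker.check(
            new boolean[]{false, false, false},
            new boolean[]{false, true, false}));    // false: NULL array rejected
    }
}
```

Parameterizing the index keeps one checker class reusable across functions whose array operand sits in different positions, at the cost of one extra constructor argument.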
Overall looks good to me. +1
…es with NullPointerException
Signed-off-by: Mihai Budiu <[email protected]>
Interpreting the +1 as a "go ahead and merge".