-
Notifications
You must be signed in to change notification settings - Fork 2.4k
[CALCITE-5695] Add MAP_KEYS, MAP_VALUES function (enabled in Spark li… #3194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
hi @JiajunBernoulli could you also help review it? |
@@ -3797,6 +3798,16 @@ public static List distinct(List list) { | |||
return new ArrayList<>(result); | |||
} | |||
|
|||
/** Support the MAP_KEYS function. */ | |||
public static List mapKeys(Map map) { | |||
return (List) map.keySet().stream().collect(Collectors.toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the expected result if the input map is null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return null @zoudan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we should handle null here? It seems a NPE Exception will be thrown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zoudan not needs. the runtime will does it like other engine, the NullPolicy does that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and why i doesn't add unit test map_keys(cast(null as map[keyType, valueType])) is because the calcite doesn't supported, which you can see here #3189
java.lang.RuntimeException: Error while parsing query: select cast(map['washington', 1, null, 44] as map[CHAR(10), INTEGER NOT NULL>] from (values (1))
at org.apache.calcite.sql.test.AbstractSqlTester.parseAndValidate(AbstractSqlTester.java:157)
at org.apache.calcite.sql.test.AbstractSqlTester.validateAndThen(AbstractSqlTester.java:248)
at org.apache.calcite.sql.test.AbstractSqlTester.checkColumnType(AbstractSqlTester.java:177)
at org.apache.calcite.test.SqlValidatorFixture.columnType(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use new ArrayList<>(map.keySets())
?
79ab686
to
f1fad81
Compare
/** Tests {@code MAP_KEYS} function from Spark. */ | ||
@Test void testMapKeysFunc() { | ||
final SqlOperatorFixture f = fixture() | ||
.setFor(SqlLibraryOperators.MAP_KEYS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could be good to write a test that fails if library is not set to Spark (you only want it to work for Spark).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added @tanclary
site/_docs/reference.md
Outdated
@@ -2723,6 +2723,8 @@ BigQuery's type system uses confusingly different names for types and functions: | |||
| m | TO_BASE64(string) | Converts the *string* to base-64 encoded form and returns a encoded string | |||
| b m | FROM_BASE64(string) | Returns the decoded result of a base-64 *string* as a string | |||
| b o | LTRIM(string) | Returns *string* with all blanks removed from the start | |||
| s | MAP_KEYS(map) | Returns an unordered array containing the keys of the map. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove '.'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed @julianhyde
hi @julianhyde @tanclary @zoudan fix all your reviews, please have a look again,thanks. and i will supports all the spark collection functions latter. |
1110d8d
to
48def8d
Compare
hi @JiajunBernoulli do you have time to review it ? |
I usually only have time on weekends. So I will review it tomorrow. |
@liuyongvs Can you respond to my comment in Jira? What is an 'unordered array'? |
site/_docs/reference.md
Outdated
@@ -2723,6 +2723,8 @@ BigQuery's type system uses confusingly different names for types and functions: | |||
| m | TO_BASE64(string) | Converts the *string* to base-64 encoded form and returns a encoded string | |||
| b m | FROM_BASE64(string) | Returns the decoded result of a base-64 *string* as a string | |||
| b o | LTRIM(string) | Returns *string* with all blanks removed from the start | |||
| s | MAP_KEYS(map) | Returns an unordered array containing the keys of the map | |||
| s | MAP_VALUES(map) | Returns an unordered array containing the values of the map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
map should be *map*
.
* <p>For example, given {@code (INTEGER, DATE) MAP}, returns | ||
* {@code DATE ARRAY}. | ||
*/ | ||
public static final SqlReturnTypeInference TO_MAP_VALUES_NULLABLE = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have two types: not null and nullable.
public static final SqlReturnTypeInference TO_MAP_KEYS =
ARG0.andThen(SqlTypeTransforms.TO_MAP_KEYS);
public static final SqlReturnTypeInference TO_MAP_KEYS_NULLABLE =
TO_MAP_KEYS.andThen(SqlTypeTransforms.TO_NULLABLE);
|
||
/** Support the MAP_VALUES function. */ | ||
public static List mapValues(Map map) { | ||
return (List) map.values().stream().collect(Collectors.toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return new ArrayList<>(map.values());
I think it's clearer and there won't be NPE.
I left some comments for the PR. And we should discuss 'unordered array' in JIRA. |
fix all your review @JiajunBernoulli |
@@ -3797,6 +3798,16 @@ public static List distinct(List list) { | |||
return new ArrayList<>(result); | |||
} | |||
|
|||
/** Support the MAP_KEYS function. */ | |||
public static List mapKeys(Map map) { | |||
return (List) map.keySet().stream().collect(Collectors.toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use new ArrayList<>(map.keySet())?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed @JiajunBernoulli
fixed , thanks for your details review @JiajunBernoulli. do you also have any comments. |
ecd2ba1
to
822509a
Compare
have squashed the commits, could anyone merge it thanks @JiajunBernoulli @julianhyde @zoudan |
2d5c4b0
to
42c02f2
Compare
Conflicting files because #3198 Please resolve them. |
42c02f2
to
a212a06
Compare
hi @JiajunBernoulli fixed the conflict and ci passed |
Please delete '.' in commit message. |
a212a06
to
281d094
Compare
@JiajunBernoulli fixed all the current prs message ending with '.' |
Kudos, SonarCloud Quality Gate passed! |
MAP_KEYS - Returns an unordered array containing the keys of the map.
MAP_VALUES - Returns an unordered array containing the values of the map.
For more details
https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html