Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/05/04 14:59:45 UTC

[GitHub] [flink] snuyanzin opened a new pull request, #19637: [FLINK-25567][table] Add support casting of multisets to multisets

snuyanzin opened a new pull request, #19637:
URL: https://github.com/apache/flink/pull/19637

   ## What is the purpose of the change
   
   This PR adds support for casting multisets to multisets. As decided in https://github.com/apache/flink/pull/18287, this is done in a separate PR.
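
As a rough illustration of what a multiset-to-multiset cast involves, the sketch below models a multiset as a `java.util.Map` from element to multiplicity and casts each element while carrying its count over. The class `MultisetCastSketch` and the method `castMultiset` are hypothetical names used only for this example; they are not part of Flink or of this PR. Summing the counts when two source elements cast to the same target element is an assumption of the sketch, not confirmed behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not Flink code.
public class MultisetCastSketch {

    // Casts every element with elementCast and carries its multiplicity over.
    // Assumption: if two source elements cast to the same target element,
    // their multiplicities are summed.
    public static <S, T> Map<T, Integer> castMultiset(
            Map<S, Integer> source, Function<S, T> elementCast) {
        Map<T, Integer> target = new LinkedHashMap<>();
        for (Map.Entry<S, Integer> entry : source.entrySet()) {
            target.merge(elementCast.apply(entry.getKey()), entry.getValue(), Integer::sum);
        }
        return target;
    }

    public static void main(String[] args) {
        // A multiset of integers: the element 1 occurs twice, the element 5 once.
        Map<Integer, Integer> ints = new LinkedHashMap<>();
        ints.put(1, 2);
        ints.put(5, 1);

        // Cast the elements to strings; the multiplicities stay the same.
        Map<String, Integer> strings = castMultiset(ints, i -> Integer.toString(i));
        System.out.println(strings); // {1=2, 5=1}
    }
}
```

The element cast is passed in as a function so that the sketch stays independent of any concrete cast rule.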
   
   
   ## Brief change log
   
     - Added `MULTISET` as a `BuiltInFunctionDefinition`, along with `MultisetTypeStrategy` and `MultisetConverter`
   
   
   ## Verifying this change
   This change added tests and can be verified by running the tests added in `CastFunctionITCase.java`.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes)
     - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] snuyanzin commented on pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by GitBox <gi...@apache.org>.
snuyanzin commented on PR #19637:
URL: https://github.com/apache/flink/pull/19637#issuecomment-1343067192

   
   @flinkbot run azure
   
   




[GitHub] [flink] flinkbot commented on pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by GitBox <gi...@apache.org>.
flinkbot commented on PR #19637:
URL: https://github.com/apache/flink/pull/19637#issuecomment-1117436436

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7e6bd2db1c98c19946135413bfa7cedaad723c25",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7e6bd2db1c98c19946135413bfa7cedaad723c25",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7e6bd2db1c98c19946135413bfa7cedaad723c25 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>




[GitHub] [flink] twalthr commented on a diff in pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by "twalthr (via GitHub)" <gi...@apache.org>.
twalthr commented on code in PR #19637:
URL: https://github.com/apache/flink/pull/19637#discussion_r1144847060


##########
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/ValuesOperationFactory.java:
##########
@@ -197,10 +199,18 @@ private Optional<ResolvedExpression> convertToExpectedType(
                     && targetLogicalType.is(ARRAY)) {
                 return convertArrayToExpectedType(
                         sourceExpression, (CollectionDataType) targetDataType, postResolverFactory);
-            } else if (functionDefinition == BuiltInFunctionDefinitions.MAP
-                    && targetLogicalType.is(MAP)) {
-                return convertMapToExpectedType(
-                        sourceExpression, (KeyValueDataType) targetDataType, postResolverFactory);
+            } else if (functionDefinition == BuiltInFunctionDefinitions.MAP) {
+                if (targetLogicalType.is(MAP)) {

Review Comment:
   This piece of code confuses me. Why is the function `MAP` but the target type `MULTISET`? At the API level, those two data types should be treated completely differently.



##########
flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/BuiltInFunctionDefinitions.java:
##########
@@ -1693,6 +1693,14 @@ ANY, and(logical(LogicalTypeRoot.BOOLEAN), LITERAL)
                     .outputTypeStrategy(SpecificTypeStrategies.MAP)
                     .build();
 
+    public static final BuiltInFunctionDefinition MULTISET =
+            BuiltInFunctionDefinition.newBuilder()
+                    .name("multiset")

Review Comment:
   nit: let's use upper case for all new functions



##########
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/calls/ScalarOperatorGens.scala:
##########
@@ -1433,7 +1435,10 @@ object ScalarOperatorGens {
       .map(_._2.last)
       .values
       .toSeq
-    val valueType = mapType.getValueType
+    val valueType = resultType match {
+      case mapType1: MapType => mapType1.getValueType
+      case _ => DataTypes.INT().getLogicalType

Review Comment:
   nit: it's better to call `new IntType()` at the code gen level. I guess it should be NOT NULL?



##########
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/ExprCodeGenerator.scala:
##########
@@ -715,9 +715,9 @@ class ExprCodeGenerator(ctx: CodeGeneratorContext, nullableInput: Boolean)
       case ARRAY_VALUE_CONSTRUCTOR =>
         generateArray(ctx, resultType, operands)
 
-      // maps
-      case MAP_VALUE_CONSTRUCTOR =>
-        generateMap(ctx, resultType, operands)
+      // maps and multisets
+      case MAP_VALUE_CONSTRUCTOR | MULTISET_VALUE =>

Review Comment:
   Let's also call it `MULTISET_VALUE_CONSTRUCTOR` to be in sync with map and array





[GitHub] [flink] snuyanzin commented on pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by GitBox <gi...@apache.org>.
snuyanzin commented on PR #19637:
URL: https://github.com/apache/flink/pull/19637#issuecomment-1117691150

   @flinkbot run azure




[GitHub] [flink] snuyanzin commented on a diff in pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by "snuyanzin (via GitHub)" <gi...@apache.org>.
snuyanzin commented on code in PR #19637:
URL: https://github.com/apache/flink/pull/19637#discussion_r1145523634


##########
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/calls/ScalarOperatorGens.scala:
##########
@@ -1433,7 +1435,10 @@ object ScalarOperatorGens {
       .map(_._2.last)
       .values
       .toSeq
-    val valueType = mapType.getValueType
+    val valueType = resultType match {
+      case mapType1: MapType => mapType1.getValueType
+      case _ => DataTypes.INT().getLogicalType

Review Comment:
   yep, you're right





[GitHub] [flink] snuyanzin commented on pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by GitBox <gi...@apache.org>.
snuyanzin commented on PR #19637:
URL: https://github.com/apache/flink/pull/19637#issuecomment-1155427059

   @flinkbot run azure




[GitHub] [flink] snuyanzin commented on pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by GitBox <gi...@apache.org>.
snuyanzin commented on PR #19637:
URL: https://github.com/apache/flink/pull/19637#issuecomment-1165492850

   Hi @twalthr,
   sorry for the poke.
   A while ago there was a suggestion at https://github.com/apache/flink/pull/18287 to have this done.
   Could you please have a look once you have time?




[GitHub] [flink] snuyanzin commented on a diff in pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by "snuyanzin (via GitHub)" <gi...@apache.org>.
snuyanzin commented on code in PR #19637:
URL: https://github.com/apache/flink/pull/19637#discussion_r1145512436


##########
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/ValuesOperationFactory.java:
##########
@@ -197,10 +199,18 @@ private Optional<ResolvedExpression> convertToExpectedType(
                     && targetLogicalType.is(ARRAY)) {
                 return convertArrayToExpectedType(
                         sourceExpression, (CollectionDataType) targetDataType, postResolverFactory);
-            } else if (functionDefinition == BuiltInFunctionDefinitions.MAP
-                    && targetLogicalType.is(MAP)) {
-                return convertMapToExpectedType(
-                        sourceExpression, (KeyValueDataType) targetDataType, postResolverFactory);
+            } else if (functionDefinition == BuiltInFunctionDefinitions.MAP) {
+                if (targetLogicalType.is(MAP)) {

Review Comment:
   It seems the issue is that, at the API level, `org.apache.flink.table.expressions.ApiExpressionUtils#objectToExpression` [1] translates maps to `MAP`, and both `MAP` and `MULTISET` are represented as maps there...  
   [1] https://github.com/apache/flink/blob/e91eb5ec2fea9a2f28f3f55a06bc140e4ce4b5f5/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/expressions/ApiExpressionUtils.java#L85-L104





[GitHub] [flink] snuyanzin commented on a diff in pull request #19637: [FLINK-25567][table] Add support casting of multisets to multisets

Posted by "snuyanzin (via GitHub)" <gi...@apache.org>.
snuyanzin commented on code in PR #19637:
URL: https://github.com/apache/flink/pull/19637#discussion_r1145512436


##########
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/ValuesOperationFactory.java:
##########
@@ -197,10 +199,18 @@ private Optional<ResolvedExpression> convertToExpectedType(
                     && targetLogicalType.is(ARRAY)) {
                 return convertArrayToExpectedType(
                         sourceExpression, (CollectionDataType) targetDataType, postResolverFactory);
-            } else if (functionDefinition == BuiltInFunctionDefinitions.MAP
-                    && targetLogicalType.is(MAP)) {
-                return convertMapToExpectedType(
-                        sourceExpression, (KeyValueDataType) targetDataType, postResolverFactory);
+            } else if (functionDefinition == BuiltInFunctionDefinitions.MAP) {
+                if (targetLogicalType.is(MAP)) {

Review Comment:
   It seems the issue is that, at the API level, `org.apache.flink.table.expressions.ApiExpressionUtils#objectToExpression` [1][2] translates maps to `MAP`, and both `MAP` and `MULTISET` are represented as maps there...  
   [1] https://github.com/apache/flink/blob/e91eb5ec2fea9a2f28f3f55a06bc140e4ce4b5f5/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/expressions/ApiExpressionUtils.java#L85-L104
   [2] https://github.com/apache/flink/blob/e91eb5ec2fea9a2f28f3f55a06bc140e4ce4b5f5/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/expressions/ApiExpressionUtils.java#L126-L137
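
To make the ambiguity described in this comment concrete, here is a minimal plain-Java sketch. It assumes a multiset is modeled as a `java.util.Map` from element to multiplicity, so the same object can be read either as a map value or as a multiset value. The class `MapOrMultiset` and the method `expandAsMultiset` are hypothetical names for this example and do not exist in Flink.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Flink code.
public class MapOrMultiset {

    // Reads the map as a multiset: each key is an element and each value is
    // the number of times that element occurs.
    public static <T> List<T> expandAsMultiset(Map<T, Integer> counts) {
        List<T> elements = new ArrayList<>();
        for (Map.Entry<T, Integer> entry : counts.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                elements.add(entry.getKey());
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        Map<String, Integer> value = new LinkedHashMap<>();
        value.put("a", 2);
        value.put("b", 1);

        // Read as a map from strings to integers: two key/value pairs.
        System.out.println(value); // {a=2, b=1}

        // Read as a multiset of strings: three elements.
        System.out.println(expandAsMultiset(value)); // [a, a, b]
    }
}
```

Nothing in the Java object itself says which reading is intended, so code that receives only the object needs the expected target type to tell the two apart.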
   


