Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/10/08 11:56:31 UTC

[GitHub] [flink] lsyldliu opened a new pull request, #20988: [FLINK-29547][table] Fix the bug of Select a[1] which is array type for parquet complex type throw ClassCastException

lsyldliu opened a new pull request, #20988:
URL: https://github.com/apache/flink/pull/20988

   ## What is the purpose of the change
   Fix the `ClassCastException` thrown when selecting an element such as `a[1]` from an array-typed column of a Parquet table with complex types.
   
   
   ## Brief change log
   
     - *Fix the `ClassCastException` thrown when selecting an element such as `a[1]` from an array-typed column of a Parquet table with complex types*
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
     - *Added an integration test (`testReadParquetArrayDataType`) to `HiveTableSourceITCase`*
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (no)
     - If yes, how is the feature documented? (not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] flinkbot commented on pull request #20988: [FLINK-29547][table] Fix the bug of Select a[1] which is array type for parquet complex type throw ClassCastException

Posted by GitBox <gi...@apache.org>.
flinkbot commented on PR #20988:
URL: https://github.com/apache/flink/pull/20988#issuecomment-1272302953

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "23ab26a1a3e51733fbf2b7ccf11a73db756bfd38",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "23ab26a1a3e51733fbf2b7ccf11a73db756bfd38",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 23ab26a1a3e51733fbf2b7ccf11a73db756bfd38 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>




[GitHub] [flink] luoyuxia commented on a diff in pull request #20988: [FLINK-29547][table] Fix the bug of Select a[1] which is array type for parquet complex type throw ClassCastException

Posted by GitBox <gi...@apache.org>.
luoyuxia commented on code in PR #20988:
URL: https://github.com/apache/flink/pull/20988#discussion_r999205523


##########
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/data/conversion/ArrayObjectArrayConverter.java:
##########
@@ -100,7 +100,6 @@ public E[] toExternal(ArrayData internal) {
             if (genericArray.isPrimitiveArray()) {
                 return genericToJavaArrayConverter.convert((GenericArrayData) internal);
             }
-            return (E[]) genericArray.toObjectArray();

Review Comment:
   I'm wondering whether it's a good idea to remove this line unconditionally and always fall back to `toJavaArray(internal)`.
   To me, `(E[]) genericArray.toObjectArray()` looks like an optimization over `toJavaArray(internal)`.
   In the vectorized path, the code has to fall back to `toJavaArray(internal)`; otherwise it fails.
   But in the non-vectorized path, everything works even if we keep this line.
   For example, if we make the array read non-vectorized by modifying the method `isVectorizationUnsupported`, the test passes without removing this line.
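
As a rough illustration of the trade-off raised in this comment, here is a minimal plain-Java sketch of how an unchecked `(E[])` cast can behave differently from an element-wise copy. It does not use any Flink classes and is not the code from this PR: the class `ArrayCastSketch` and its methods are hypothetical names, and whether this is the exact failure path behind FLINK-29547 is an assumption. It only shows the general Java behavior: the cast itself is erased, so a failure appears only when the caller expects a concrete array type that the runtime array does not have.

```java
import java.lang.reflect.Array;

// Hypothetical illustration only; not part of Flink or of this PR.
public class ArrayCastSketch {

    // Unchecked cast: compiles (with a warning) because E is erased.
    // The returned array keeps whatever runtime type `values` already has.
    @SuppressWarnings("unchecked")
    public static <E> E[] uncheckedCast(Object[] values) {
        return (E[]) values;
    }

    // Element-wise copy into an array whose runtime component type is elementClass.
    // Slower than the cast, but the result really is an E[].
    @SuppressWarnings("unchecked")
    public static <E> E[] copyTyped(Object[] values, Class<E> elementClass) {
        E[] result = (E[]) Array.newInstance(elementClass, values.length);
        for (int i = 0; i < values.length; i++) {
            result[i] = elementClass.cast(values[i]);
        }
        return result;
    }

    // Returns true if assigning the unchecked-cast result to Integer[] throws.
    public static boolean uncheckedCastFails(Object[] values) {
        try {
            // The compiler inserts a checkcast to Integer[] at this assignment.
            Integer[] typed = ArrayCastSketch.<Integer>uncheckedCast(values);
            return typed == null; // reached only if the runtime type was compatible
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] plain = {1, 2};            // runtime type Object[]
        Object[] alreadyTyped = new Integer[] {1, 2}; // runtime type Integer[]

        System.out.println(uncheckedCastFails(plain));        // prints true
        System.out.println(uncheckedCastFails(alreadyTyped)); // prints false
        System.out.println(copyTyped(plain, Integer.class).length); // prints 2
    }
}
```

In this sketch the cast is the cheaper path and is safe whenever the backing array already has the expected runtime type, which matches the observation that keeping the line works in some read paths; the copy is the conservative fallback.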
   



##########
flink-connectors/flink-connector-hive/src/test/java/org/apache/flink/connectors/hive/HiveTableSourceITCase.java:
##########
@@ -190,6 +190,29 @@ public void testReadParquetComplexDataType() throws Exception {
         batchTableEnv.unloadModule("hive");
     }
 
+    @Test
+    public void testReadParquetArrayDataType() throws Exception {
+        batchTableEnv.executeSql(
+                "create table parquet_complex_type_test("
+                        + "a array<int>, m map<int,string>, s struct<f1:int,f2:bigint>) stored as parquet");
+        // load hive module so that we can use array,map, named_struct function
+        // for convenient writing complex data
+        batchTableEnv.loadModule("hive", new HiveModule());
+        batchTableEnv.useModules("hive", CoreModuleFactory.IDENTIFIER);
+
+        batchTableEnv
+                .executeSql(
+                        "insert into parquet_complex_type_test"
+                                + " select array(1, 2), map(1, 'val1', 2, 'val2'),"
+                                + " named_struct('f1', 1,  'f2', 2)")
+                .await();
+
+        Table src = batchTableEnv.sqlQuery("select a[1], a[3] from parquet_complex_type_test");

Review Comment:
   I think we can just move the test to `testReadParquetComplexDataType`.





[GitHub] [flink] lsyldliu commented on pull request #20988: [FLINK-29547][table] Fix the bug of Select a[1] which is array type for parquet complex type throw ClassCastException

Posted by GitBox <gi...@apache.org>.
lsyldliu commented on PR #20988:
URL: https://github.com/apache/flink/pull/20988#issuecomment-1272302236

   cc @wuchong 

