Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/05/21 10:51:13 UTC

[GitHub] [hudi] AirToSupply opened a new issue #2976: [SUPPORT] Column misalignment occurs when reading the COPY_ON_WRITE type of hudi table through Flink

AirToSupply opened a new issue #2976:
URL: https://github.com/apache/hudi/issues/2976


   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Build from source on the [master] branch; the version is 0.9.0-SNAPSHOT.
   2. Start a Flink 1.12.x streaming job that reads data from a hudi table.
   3. Observe the running job: an exception causes the Flink job to fail.
   
   **Expected behavior**
   The job should read the table without failing. Instead, it fails with:
   java.lang.IllegalArgumentException: Unexpected type: ...
   
   **Environment Description**
   
   * Hudi version : 0.9.0-SNAPSHOT
   
   * Spark version : None
   
   * Hive version : None
   
   * Hadoop version : 2.9.2
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   The exception occurs when the specified partition column is not the last field in the sequence of fields written to the hudi table.
   
   For example, suppose the fields written to the hudi table (including partition columns) are, in order: col1, col2, col3. If the partition column is col1, the exception is thrown. If the partition column is col3, reads work normally.
   
   **Stacktrace**
   
   The exception stack is as follows:
   
   ![BB0B7B65-BC82-40da-ABD9-6550956AAFDD](https://user-images.githubusercontent.com/62897740/119125433-588c0780-ba64-11eb-9bb6-1fad46a2a3b5.png)
   
   The local debugging is as follows:
   
   ![C10E0226-BBAD-4ef3-B3AE-161586449B35](https://user-images.githubusercontent.com/62897740/119125566-82452e80-ba64-11eb-81ab-3576fc4ff97b.png)
   
   Initial diagnosis: When reading the hudi table through Flink, org.apache.hudi.table.format.cow.ParquetSplitReaderUtil#genPartColumnarRowReader is called. In the ParquetColumnarRowSplitReader object that this method returns, the selectedTypes and selectedFieldNames arrays are misaligned.
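
To make the suspected misalignment concrete, here is a minimal Java sketch. It is not Hudi's reader code: the class `ColumnAlignmentSketch`, its helper methods, and the column types (STRING, INT, DOUBLE) are all hypothetical. It only assumes one plausible way the two arrays could drift apart, namely that the partition column is filtered out of the names while the types are still taken by position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT Hudi's actual reader code.
class ColumnAlignmentSketch {

    // Names of the non-partition columns, kept in schema order.
    static List<String> selectNames(String[] names, String partitionColumn) {
        List<String> out = new ArrayList<>();
        for (String n : names) {
            if (!n.equals(partitionColumn)) {
                out.add(n);
            }
        }
        return out;
    }

    // Assumed faulty pairing: take the first k types by position,
    // ignoring which column was actually filtered out.
    static List<String> typesByPosition(String[] types, int k) {
        return new ArrayList<>(Arrays.asList(types).subList(0, k));
    }

    // Aligned pairing: look up each type by the schema index of its name.
    static List<String> typesByName(String[] names, String[] types, List<String> selected) {
        List<String> all = Arrays.asList(names);
        List<String> out = new ArrayList<>();
        for (String n : selected) {
            out.add(types[all.indexOf(n)]);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] names = {"col1", "col2", "col3"};
        String[] types = {"STRING", "INT", "DOUBLE"}; // made-up types
        for (String part : new String[] {"col1", "col3"}) {
            List<String> sel = selectNames(names, part);
            System.out.println("partition=" + part
                + " names=" + sel
                + " positional=" + typesByPosition(types, sel.size())
                + " byName=" + typesByName(names, types, sel));
        }
    }
}
```

Under this assumption, partitioning on col1 pairs col2 and col3 with the types of col1 and col2, while partitioning on col3 happens to stay aligned, which would match the behavior described above.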
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] AirToSupply commented on issue #2976: [SUPPORT] Column misalignment occurs when reading the COPY_ON_WRITE type of hudi table through Flink

Posted by GitBox <gi...@apache.org>.
AirToSupply commented on issue #2976:
URL: https://github.com/apache/hudi/issues/2976#issuecomment-846086036


   @AirToSupply Is there a JIRA for this issue? If not, can you file one and ping it here so we can close the GitHub issue?
   Thanks, I created https://issues.apache.org/jira/browse/HUDI-1919 for this issue.





[GitHub] [hudi] AirToSupply edited a comment on issue #2976: [SUPPORT] Column misalignment occurs when reading the COPY_ON_WRITE type of hudi table through Flink

Posted by GitBox <gi...@apache.org>.
AirToSupply edited a comment on issue #2976:
URL: https://github.com/apache/hudi/issues/2976#issuecomment-846086036


   @AirToSupply Thanks, I created https://issues.apache.org/jira/browse/HUDI-1919 for this issue.





[GitHub] [hudi] n3nash commented on issue #2976: [SUPPORT] Column misalignment occurs when reading the copy on write type of hudi table through Flink

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #2976:
URL: https://github.com/apache/hudi/issues/2976#issuecomment-851718270


   This issue has been resolved, closing this.





[GitHub] [hudi] n3nash closed issue #2976: [SUPPORT] Column misalignment occurs when reading the copy on write type of hudi table through Flink

Posted by GitBox <gi...@apache.org>.
n3nash closed issue #2976:
URL: https://github.com/apache/hudi/issues/2976


   

