Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/07/07 20:36:05 UTC

[GitHub] [hudi] satishkotha opened a new pull request #1809: [HUDI-1080] Fix backward compatibility for com.uber inputformats

satishkotha opened a new pull request #1809:
URL: https://github.com/apache/hudi/pull/1809


   ## What is the purpose of the pull request
   
   1) InputFormat backward compatibility is broken because several code paths compare input format classes by their fully qualified name, package included (for example https://github.com/apache/hudi/blob/master/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/hive/HoodieCombineHiveInputFormat.java#L154), so the legacy com.uber input formats no longer match. Fix this by relying on inheritance and by using the appropriate package name.
   
   2) With 0.4.x versions, reading also worked with OrcInputFormat: non-hoodie input formats were read through the shim instead of raising an error. Bring the same behavior back to master.
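   
   A minimal sketch of the difference between a name-based check and an inheritance-based check. The nested classes are hypothetical stand-ins, not the real Hudi or Hadoop types, and the sketch assumes the legacy com.uber.hoodie input format is a subclass of the org.apache.hudi one:

```java
public class InputFormatCompatSketch {
    // Hypothetical stand-in for org.apache.hudi.hadoop.HoodieParquetInputFormat.
    public static class HoodieParquetInputFormat {}

    // Hypothetical stand-in for the legacy com.uber.hoodie input format,
    // assumed here to extend the org.apache.hudi class.
    public static class LegacyUberHoodieInputFormat extends HoodieParquetInputFormat {}

    // Hypothetical stand-in for a non-hoodie format such as OrcInputFormat.
    public static class OtherInputFormat {}

    // Brittle check: compares fully qualified class names, so a subclass
    // that lives under a different name or package never matches.
    public static boolean matchesByName(Class<?> candidate) {
        return candidate.getName().equals(HoodieParquetInputFormat.class.getName());
    }

    // Inheritance-based check: accepts the class itself and any subclass.
    public static boolean matchesByInheritance(Class<?> candidate) {
        return HoodieParquetInputFormat.class.isAssignableFrom(candidate);
    }

    public static void main(String[] args) {
        System.out.println(matchesByName(LegacyUberHoodieInputFormat.class));
        System.out.println(matchesByInheritance(LegacyUberHoodieInputFormat.class));
        System.out.println(matchesByInheritance(OtherInputFormat.class));
    }
}
```

   In this sketch the legacy subclass fails the name comparison but passes the inheritance check, while the unrelated format fails both.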
   
   ## Brief change log
   
   - Fix the InputFormat class lookup so that it is based on the package name of the source input format.
   - Do not throw an error for non-hoodie input formats; delegate reading to the shim loader instead.
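   
   The second item can be sketched as follows. This is an illustration under stated assumptions, not the code in the PR: `Reader`, `strictReader`, `lenientReader`, and the two format classes are hypothetical names, and the real method returns a Hadoop record reader rather than a labeled lambda.

```java
public class ShimFallbackSketch {
    // Hypothetical reader abstraction; it only reports which path produced it.
    public interface Reader {
        String source();
    }

    // Hypothetical stand-ins for a hoodie input format and a non-hoodie one.
    public static class HoodieFormat {}
    public static class OrcLikeFormat {}

    // Strict behavior, as in the errors reported in this thread:
    // any non-hoodie input format is rejected.
    public static Reader strictReader(Class<?> format) {
        if (HoodieFormat.class.isAssignableFrom(format)) {
            return () -> "hoodie";
        }
        throw new IllegalStateException("Unexpected input format : " + format.getName());
    }

    // Lenient behavior: non-hoodie input formats are handed to a
    // fallback reader (standing in for the shim) instead of failing.
    public static Reader lenientReader(Class<?> format) {
        if (HoodieFormat.class.isAssignableFrom(format)) {
            return () -> "hoodie";
        }
        return () -> "shim";
    }

    public static void main(String[] args) {
        System.out.println(lenientReader(HoodieFormat.class).source());
        System.out.println(lenientReader(OrcLikeFormat.class).source());
    }
}
```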
   
   ## Verify this pull request
   Verified end to end in Uber's internal Hive service deployment.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] satishkotha commented on pull request #1809: [HUDI-1080] Fix backward compatibility for com.uber inputformats

Posted by GitBox <gi...@apache.org>.
satishkotha commented on pull request #1809:
URL: https://github.com/apache/hudi/pull/1809#issuecomment-655652601


   > @satishkotha I have COW tables with data written by hoodie 0.4.6. I've now followed the steps in https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi and have done successful writes/reads with hudi 0.5.3. Without your PR, where would I face issues?
   
   Hi, have you been using HoodieCombineHiveInputFormat for your use cases? As you can see in the diff, the fixes are specific to the combine input format.
   
   When we ran queries after upgrading to the new version (0.5.1), we got the errors below:
   1) `Caused by: org.apache.hudi.exception.HoodieException: Unexpected input format : com.uber.hoodie.hadoop.HoodieInputFormat
   	at org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat.getRecordReader(HoodieCombineHiveInputFormat.java:522)`
   
   2) `org.apache.hudi.exception.HoodieException: Unexpected input format : org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
   	at org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat.getRecordReader(HoodieCombineHiveInputFormat.java:533)`
   
   With 0.4.x, the same queries ran fine.





[GitHub] [hudi] tooptoop4 commented on pull request #1809: [HUDI-1080] Fix backward compatibility for com.uber inputformats

Posted by GitBox <gi...@apache.org>.
tooptoop4 commented on pull request #1809:
URL: https://github.com/apache/hudi/pull/1809#issuecomment-656022908


   @satishkotha I just use HoodieParquetInputFormat





[GitHub] [hudi] n3nash merged pull request #1809: [HUDI-1080] Fix backward compatibility for com.uber inputformats

Posted by GitBox <gi...@apache.org>.
n3nash merged pull request #1809:
URL: https://github.com/apache/hudi/pull/1809


   





[GitHub] [hudi] tooptoop4 commented on pull request #1809: [HUDI-1080] Fix backward compatibility for com.uber inputformats

Posted by GitBox <gi...@apache.org>.
tooptoop4 commented on pull request #1809:
URL: https://github.com/apache/hudi/pull/1809#issuecomment-655609363


   @satishkotha I have COW tables with data written by hoodie 0.4.6. I've now followed the steps in https://cwiki.apache.org/confluence/display/HUDI/Migration+Guide+From+com.uber.hoodie+to+org.apache.hudi and have done successful writes/reads with hudi 0.5.3. Without your PR, where would I face issues?





[GitHub] [hudi] satishkotha commented on pull request #1809: [HUDI-1080] Fix backward compatibility for com.uber inputformats

Posted by GitBox <gi...@apache.org>.
satishkotha commented on pull request #1809:
URL: https://github.com/apache/hudi/pull/1809#issuecomment-656244633


   > @satishkotha I just use HoodieParquetInputFormat
   
   That's great. You don't need to apply this change.

