Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/01/25 11:25:02 UTC

[GitHub] [hudi] quitozang opened a new pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

quitozang opened a new pull request #2486:
URL: https://github.com/apache/hudi/pull/2486


   Filtering abnormal data which the recordKeyField or precombineField is null in avro format
   
   
   ## What is the purpose of the pull request
   
   If the record key field or the precombine field of an incoming record is null, the DeltaStreamer job fails with an error.
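
To make the failure mode concrete, here is a minimal Python sketch of the filtering idea described above. It uses plain dictionaries in place of Avro records, and the field names `uuid` (record key) and `ts` (precombine field) are assumptions chosen for illustration; this is not the code from the pull request.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def split_by_null_fields(
    records: Iterable[Record], required_fields: Tuple[str, ...]
) -> Tuple[List[Record], List[Record]]:
    """Split records into (valid, rejected).

    A record is rejected when any required field is missing or None.
    """
    valid: List[Record] = []
    rejected: List[Record] = []
    for record in records:
        if all(record.get(field) is not None for field in required_fields):
            valid.append(record)
        else:
            rejected.append(record)
    return valid, rejected


# Hypothetical field names: "uuid" as the record key, "ts" as the precombine field.
batch = [
    {"uuid": "a1", "ts": 100, "fare": 12.5},
    {"uuid": None, "ts": 101, "fare": 7.0},   # null record key
    {"uuid": "c3", "ts": None, "fare": 3.2},  # null precombine field
]
valid, rejected = split_by_null_fields(batch, ("uuid", "ts"))
print(len(valid), len(rejected))
```

Returning the rejected records alongside the valid ones, rather than discarding them, leaves room to log or count what was dropped.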


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io edited a comment on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-766863772


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2486?src=pr&el=h1) Report
   > Merging [#2486](https://codecov.io/gh/apache/hudi/pull/2486?src=pr&el=desc) (5476bf0) into [master](https://codecov.io/gh/apache/hudi/commit/c4afd179c1983a382b8a5197d800b0f5dba254de?el=desc) (c4afd17) will **decrease** coverage by `2.19%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2486/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2486?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2486      +/-   ##
   ============================================
   - Coverage     50.18%   47.98%   -2.20%     
   + Complexity     3050     2693     -357     
   ============================================
     Files           419      366      -53     
     Lines         18931    17001    -1930     
     Branches       1948     1718     -230     
   ============================================
   - Hits           9500     8158    -1342     
   + Misses         8656     8201     -455     
   + Partials        775      642     -133     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.21% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `51.47% <ø> (-0.03%)` | `0.00 <ø> (ø)` | |
   | hudiflink | `0.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.16% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `65.85% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `48.61% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `66.49% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `?` | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2486?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | `26.00% <0.00%> (ø%)` | |
   | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | | | |
   | [...g/apache/hudi/utilities/sources/AvroDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0RGU1NvdXJjZS5qYXZh) | | | |
   | [...alCheckpointFromAnotherHoodieTimelineProvider.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrcG9pbnRGcm9tQW5vdGhlckhvb2RpZVRpbWVsaW5lUHJvdmlkZXIuamF2YQ==) | | | |
   | [...callback/kafka/HoodieWriteCommitKafkaCallback.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NhbGxiYWNrL2thZmthL0hvb2RpZVdyaXRlQ29tbWl0S2Fma2FDYWxsYmFjay5qYXZh) | | | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | | | |
   | [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) | | | |
   | [...udi/utilities/transform/FlatteningTransformer.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9GbGF0dGVuaW5nVHJhbnNmb3JtZXIuamF2YQ==) | | | |
   | [...ties/exception/HoodieIncrementalPullException.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxFeGNlcHRpb24uamF2YQ==) | | | |
   | [...g/apache/hudi/utilities/schema/SchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlci5qYXZh) | | | |
   | ... and [42 more](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] nsivabalan commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-770809175


   A few suggestions:
   - Can you please rebase and fix the CI issue? We will review once these are done and the patch is ready.
   - Also, I suggest creating a Jira and linking it.
   - Also, please follow the template when creating a new PR (I see you have removed most parts of the description). The Hudi community follows a standard template, so it would be nice to follow the same.
   





[GitHub] [hudi] vinothchandar closed pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
vinothchandar closed pull request #2486:
URL: https://github.com/apache/hudi/pull/2486


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-766863772


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2486?src=pr&el=h1) Report
   > Merging [#2486](https://codecov.io/gh/apache/hudi/pull/2486?src=pr&el=desc) (5476bf0) into [master](https://codecov.io/gh/apache/hudi/commit/c4afd179c1983a382b8a5197d800b0f5dba254de?el=desc) (c4afd17) will **decrease** coverage by `1.27%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2486/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2486?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2486      +/-   ##
   ============================================
   - Coverage     50.18%   48.90%   -1.28%     
   + Complexity     3050     2155     -895     
   ============================================
     Files           419      266     -153     
     Lines         18931    12041    -6890     
     Branches       1948     1133     -815     
   ============================================
   - Hits           9500     5889    -3611     
   + Misses         8656     5715    -2941     
   + Partials        775      437     -338     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.21% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `51.47% <ø> (-0.03%)` | `0.00 <ø> (ø)` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `?` | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2486?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | `26.00% <0.00%> (ø%)` | |
   | [.../hadoop/realtime/RealtimeUnmergedRecordReader.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL1JlYWx0aW1lVW5tZXJnZWRSZWNvcmRSZWFkZXIuamF2YQ==) | | | |
   | [...hudi/utilities/schema/FilebasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9GaWxlYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh) | | | |
   | [...ties/exception/HoodieIncrementalPullException.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxFeGNlcHRpb24uamF2YQ==) | | | |
   | [...in/java/org/apache/hudi/schema/SchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zY2hlbWEvU2NoZW1hUHJvdmlkZXIuamF2YQ==) | | | |
   | [...udi/utilities/schema/DelegatingSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9EZWxlZ2F0aW5nU2NoZW1hUHJvdmlkZXIuamF2YQ==) | | | |
   | [...adoop/realtime/RealtimeBootstrapBaseFileSplit.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL1JlYWx0aW1lQm9vdHN0cmFwQmFzZUZpbGVTcGxpdC5qYXZh) | | | |
   | [...in/java/org/apache/hudi/hive/HoodieHiveClient.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSG9vZGllSGl2ZUNsaWVudC5qYXZh) | | | |
   | [...hadoop/realtime/RealtimeCompactedRecordReader.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL1JlYWx0aW1lQ29tcGFjdGVkUmVjb3JkUmVhZGVyLmphdmE=) | | | |
   | [...di/timeline/service/handlers/FileSliceHandler.java](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvRmlsZVNsaWNlSGFuZGxlci5qYXZh) | | | |
   | ... and [142 more](https://codecov.io/gh/apache/hudi/pull/2486/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] nsivabalan edited a comment on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-770809175


   A few suggestions:
   - Can you please rebase and fix the CI issue? We will review once these are done and the patch is ready.
   - Also, I suggest creating a Jira and linking it.
   - Also, please follow the template when creating a new PR (I see you have removed most parts of the description). The Hudi community follows a standard template, so it would be nice to follow the same.
   Do ping us here once the patch is ready to be reviewed.








[GitHub] [hudi] nsivabalan edited a comment on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-773336965


   If I am not wrong, preCombine is an optional field; I don't think it's a mandatory one. So, if it is not set, we might choose some default value.
   Also, since this is a behavior change, not everyone prefers to filter out rather than fail, and this incurs additional processing as well, can we add a new config and guard the behavior with it? If the config is not set, the old behavior should be retained. Only those users who are interested can enable it.
   @vinothchandar @n3nash : Any opinions on this PR are appreciated.
   On a high level, this patch filters out those records considered to be invalid (i.e. w/ a null record key field or null preCombine field) and proceeds w/ the rest. If not for this patch, the operation fails.
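
The config-guarded behavior requested above could look roughly like the following Python sketch. The config key `example.filter.null.key.records` and the function `prepare_batch` are hypothetical names invented here, not existing Hudi options or APIs. With the flag unset, the first invalid record fails the batch (the old behavior); with it set to `true`, invalid records are dropped and the rest proceed.

```python
from typing import Any, Dict, Iterable, List, Sequence

Record = Dict[str, Any]

# Hypothetical config key, invented for this sketch; it is not an existing Hudi option.
FILTER_KEY = "example.filter.null.key.records"


def prepare_batch(
    records: Iterable[Record],
    required_fields: Sequence[str],
    config: Dict[str, str],
) -> List[Record]:
    """Return the records to write, honoring the opt-in filter flag."""
    # Default is "false", so the old fail-fast behavior is retained unless enabled.
    filter_enabled = str(config.get(FILTER_KEY, "false")).lower() == "true"
    kept: List[Record] = []
    for record in records:
        missing = [f for f in required_fields if record.get(f) is None]
        if not missing:
            kept.append(record)
        elif not filter_enabled:
            raise ValueError(
                f"null value for required field(s) {missing} in record {record}"
            )
        # Filter enabled: skip the invalid record and continue with the rest.
    return kept
```

Keeping the default at fail-fast means existing pipelines see no change, and the extra per-record check only alters behavior for users who opt in.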





[GitHub] [hudi] nsivabalan commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-773336965


   If I am not wrong, preCombine is an optional field; I don't think it's a mandatory one. So, if it is not set, we might choose some default value.
   Also, since this is a behavior change, not everyone prefers to filter out rather than fail, and this incurs additional processing as well, can we add a new config and guard the behavior with it? If the config is not set, the old behavior should be retained. Only those users who are interested can enable it.
   @vinothchandar @n3nash : Any opinions on this PR are appreciated.
   On a high level, this patch filters out those records considered to be invalid (i.e. w/ a null record key field or null preCombine field) and proceeds w/ the rest. If not for this patch, the operation fails.





[GitHub] [hudi] pratyakshsharma commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-768477641


   @quitozang There are compilation errors in the Travis build. Also, please raise a Jira and add it to the PR title.





[GitHub] [hudi] nsivabalan commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-776894429


   Yes, but that's feasible only w/ DeltaStreamer, right? What if someone is using the datasource directly?











[GitHub] [hudi] nsivabalan commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-780285911


   @quitozang : I guess if someone is using DeltaStreamer, they can leverage a transformer. If not, users could filter directly in the datasource before writing to Hudi. So, I'm not really sure Hudi needs to support this feature. wdyt?
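
As a rough illustration of the transformer idea, the sketch below models a transformer as a plain Python function from a batch to a batch, applied before the write step. Hudi's actual `Transformer` interface operates on Spark datasets; `drop_null_fields`, `run_pipeline`, and the field names are hypothetical stand-ins, not Hudi APIs.

```python
from typing import Any, Callable, Dict, List, Sequence

Record = Dict[str, Any]
# In this sketch a transformer is just a function from a batch to a batch.
Transformer = Callable[[List[Record]], List[Record]]


def drop_null_fields(fields: Sequence[str]) -> Transformer:
    """Build a transformer that drops records with a None or missing value in any of `fields`."""
    def transform(batch: List[Record]) -> List[Record]:
        return [r for r in batch if all(r.get(f) is not None for f in fields)]
    return transform


def run_pipeline(
    batch: List[Record],
    transformers: Sequence[Transformer],
    write: Callable[[List[Record]], None],
) -> None:
    """Apply each transformer in order, then hand the result to the writer."""
    for transform in transformers:
        batch = transform(batch)
    write(batch)


# Stand-in for the write step: collect the records that would be written.
written: List[Record] = []
run_pipeline(
    [{"uuid": "a1", "ts": 1}, {"uuid": None, "ts": 2}],
    [drop_null_fields(["uuid", "ts"])],
    written.extend,
)
```

Users writing through the datasource directly could apply the same predicate themselves before calling the writer, which is the alternative suggested above.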








[GitHub] [hudi] n3nash commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
n3nash commented on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-776893408


   @nsivabalan @quitozang Can we just add a transformer that does this? https://github.com/apache/hudi/blob/master/hudi-utilities/src/main/java/org/apache/hudi/utilities/transform/Transformer.java





[GitHub] [hudi] nsivabalan edited a comment on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-773336965


   If I am not wrong, preCombine is an optional field; I don't think it's a mandatory one. So, if it is not set, we might choose some default value.
   Also, since this is a behavior change, not everyone prefers to filter out rather than fail, and this incurs additional processing as well, can we add a new config and guard the behavior with it? If the config is not set, the old behavior should be retained. Only those users who are interested can enable it.
   @vinothchandar @n3nash : Any opinions on this PR are appreciated.
   On a high level, this patch filters out those records considered to be invalid (i.e. w/ a null record key field or null preCombine field) and proceeds w/ the rest. If not for this patch, the operation fails.





[GitHub] [hudi] vinothchandar commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2486:
URL: https://github.com/apache/hudi/pull/2486#issuecomment-914631576


   Closing due to inactivity

