You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/01/16 23:00:05 UTC

[GitHub] [incubator-hudi] smarthi opened a new pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

smarthi opened a new pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237
 
 
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   Code cleanup, and other minor tweaks
   
   ## Brief change log
   
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368187882
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/avro/MercifulJsonConverter.java
 ##########
 @@ -286,7 +284,7 @@ private static JsonToAvroFieldProcessor generateArrayTypeHandler() {
       public Pair<Boolean, Object> convert(Object value, String name, Schema schema)
           throws HoodieJsonToAvroConversionException {
         Schema elementSchema = schema.getElementType();
-        List listRes = new ArrayList();
+        List listRes = new ArrayList<>();
 
 Review comment:
   better to `List<Object> listRes = new ArrayList<>();` ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373778785
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/versioning/compaction/CompactionV1MigrationHandler.java
 ##########
 @@ -59,14 +59,12 @@ public HoodieCompactionPlan downgradeFrom(HoodieCompactionPlan input) {
     final Path basePath = new Path(metaClient.getBasePath());
     List<HoodieCompactionOperation> v1CompactionOperationList = new ArrayList<>();
     if (null != input.getOperations()) {
-      v1CompactionOperationList = input.getOperations().stream().map(inp -> {
-        return HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
-            .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
-            .setDataFilePath(convertToV1Path(basePath, inp.getPartitionPath(), inp.getDataFilePath()))
-            .setDeltaFilePaths(inp.getDeltaFilePaths().stream()
-                .map(s -> convertToV1Path(basePath, inp.getPartitionPath(), s)).collect(Collectors.toList()))
-            .build();
-      }).collect(Collectors.toList());
+      v1CompactionOperationList = input.getOperations().stream().map(inp -> HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
+          .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
 
 Review comment:
   If we let builder's each `setXXX` method one line, the readability would be better. WDYT?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368188409
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/avro/MercifulJsonConverter.java
 ##########
 @@ -286,7 +284,7 @@ private static JsonToAvroFieldProcessor generateArrayTypeHandler() {
       public Pair<Boolean, Object> convert(Object value, String name, Schema schema)
           throws HoodieJsonToAvroConversionException {
         Schema elementSchema = schema.getElementType();
-        List listRes = new ArrayList();
+        List listRes = new ArrayList<>();
 
 Review comment:
   How about List<?> listRes = new ArrayList<>(); 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373784682
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/versioning/compaction/CompactionV1MigrationHandler.java
 ##########
 @@ -59,14 +59,12 @@ public HoodieCompactionPlan downgradeFrom(HoodieCompactionPlan input) {
     final Path basePath = new Path(metaClient.getBasePath());
     List<HoodieCompactionOperation> v1CompactionOperationList = new ArrayList<>();
     if (null != input.getOperations()) {
-      v1CompactionOperationList = input.getOperations().stream().map(inp -> {
-        return HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
-            .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
-            .setDataFilePath(convertToV1Path(basePath, inp.getPartitionPath(), inp.getDataFilePath()))
-            .setDeltaFilePaths(inp.getDeltaFilePaths().stream()
-                .map(s -> convertToV1Path(basePath, inp.getPartitionPath(), s)).collect(Collectors.toList()))
-            .build();
-      }).collect(Collectors.toList());
+      v1CompactionOperationList = input.getOperations().stream().map(inp -> HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
+          .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
 
 Review comment:
   Did you revert all the changes of `CompactionV1MigrationHandler`? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368188860
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/avro/MercifulJsonConverter.java
 ##########
 @@ -286,7 +284,7 @@ private static JsonToAvroFieldProcessor generateArrayTypeHandler() {
       public Pair<Boolean, Object> convert(Object value, String name, Schema schema)
           throws HoodieJsonToAvroConversionException {
         Schema elementSchema = schema.getElementType();
-        List listRes = new ArrayList();
+        List listRes = new ArrayList<>();
 
 Review comment:
   makes sense this is Jackson API

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368191200
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/RocksDbBasedFileSystemView.java
 ##########
 @@ -194,69 +193,66 @@ protected void storePartitionView(String partitionPath, List<HoodieFileGroup> fi
    */
   protected void applyDeltaFileSlicesToPartitionView(String partition, List<HoodieFileGroup> deltaFileGroups,
       DeltaApplyMode mode) {
-    rocksDB.writeBatch(batch -> {
-      deltaFileGroups.stream().forEach(fg -> {
-        fg.getAllRawFileSlices().map(fs -> {
-          FileSlice oldSlice = getFileSlice(partition, fs.getFileId(), fs.getBaseInstantTime());
-          if (null == oldSlice) {
-            return fs;
-          } else {
-            // First remove the file-slice
-            LOG.info("Removing old Slice in DB. FS=" + oldSlice);
-            rocksDB.deleteInBatch(batch, schemaHelper.getColFamilyForView(),
-                schemaHelper.getKeyForSliceView(fg, oldSlice));
-            rocksDB.deleteInBatch(batch, schemaHelper.getColFamilyForView(),
-                schemaHelper.getKeyForDataFileView(fg, oldSlice));
-
-            Map<String, HoodieLogFile> logFiles = oldSlice.getLogFiles()
-                .map(lf -> Pair.of(Path.getPathWithoutSchemeAndAuthority(lf.getPath()).toString(), lf))
-                .collect(Collectors.toMap(Pair::getKey, Pair::getValue));
-            Map<String, HoodieLogFile> deltaLogFiles =
-                fs.getLogFiles().map(lf -> Pair.of(Path.getPathWithoutSchemeAndAuthority(lf.getPath()).toString(), lf))
+    rocksDB.writeBatch(batch ->
+        deltaFileGroups.forEach(fg ->
+            fg.getAllRawFileSlices().map(fs -> {
+              FileSlice oldSlice = getFileSlice(partition, fs.getFileId(), fs.getBaseInstantTime());
+              if (null == oldSlice) {
+                return fs;
+              } else {
+                // First remove the file-slice
+                // LOG.info("Removing old Slice in DB. FS=" + oldSlice);
 
 Review comment:
   still keep this line uncommented?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373762549
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/bloom/filter/InternalDynamicBloomFilter.java
 ##########
 @@ -217,12 +214,8 @@ public void readFields(DataInput in) throws IOException {
    * Adds a new row to <i>this</i> dynamic Bloom filter.
    */
   private void addRow() {
-    org.apache.hadoop.util.bloom.BloomFilter[] tmp = new org.apache.hadoop.util.bloom.BloomFilter[matrix.length + 1];
-
-    for (int i = 0; i < matrix.length; i++) {
-      tmp[i] = matrix[i];
-    }
-
+    BloomFilter[] tmp = new BloomFilter[matrix.length + 1];
+    System.arraycopy(matrix, 0, tmp, 0, matrix.length);
     tmp[tmp.length - 1] = new org.apache.hadoop.util.bloom.BloomFilter(vectorSize, nbHash, hashType);
 
 Review comment:
   So org.apache.hadoop.util.bloom.BloomFilter could be also changed to BloomFilter?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368201155
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieHiveUtil.java
 ##########
 @@ -71,18 +71,13 @@ public static Path getNthParent(Path path, int n) {
   public static List<String> getIncrementalTableNames(JobContext job) {
     Map<String, String> tablesModeMap = job.getConfiguration()
         .getValByRegex(HOODIE_CONSUME_MODE_PATTERN_STRING.pattern());
-    List<String> result = tablesModeMap.entrySet().stream().map(s -> {
+    return tablesModeMap.entrySet().stream().map(s -> {
 
 Review comment:
   good catch - fixed it

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368191728
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/util/FSUtils.java
 ##########
 @@ -198,7 +193,7 @@ public static String getRelativePartitionPath(Path basePath, Path partitionPath)
     return partitions;
   }
 
-  public static final List<String> getAllDataFilesForMarkers(FileSystem fs, String basePath, String instantTs,
+  public static List<String> getAllDataFilesForMarkers(FileSystem fs, String basePath, String instantTs,
 
 Review comment:
   ditto, any concerns?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373765167
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieParquetRealtimeInputFormat.java
 ##########
 @@ -175,7 +176,7 @@ private static Configuration addProjectionField(Configuration conf, String field
       readColIdsPrefix = "";
     }
 
-    if (!readColNames.contains(fieldName)) {
+    if (!Objects.requireNonNull(readColNames).contains(fieldName)) {
 
 Review comment:
   readColNames is not null. so !readColNames.contains(fieldName) is okay.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368188409
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/avro/MercifulJsonConverter.java
 ##########
 @@ -286,7 +284,7 @@ private static JsonToAvroFieldProcessor generateArrayTypeHandler() {
       public Pair<Boolean, Object> convert(Object value, String name, Schema schema)
           throws HoodieJsonToAvroConversionException {
         Schema elementSchema = schema.getElementType();
-        List listRes = new ArrayList();
+        List listRes = new ArrayList<>();
 
 Review comment:
   How about List<?> listRes = new ArrayList<>(); 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf merged pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf merged pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373764565
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstant.java
 ##########
 @@ -45,13 +45,13 @@
       .put(HoodieTimeline.COMPACTION_ACTION, HoodieTimeline.COMMIT_ACTION).build();
 
   public static final Comparator<HoodieInstant> ACTION_COMPARATOR =
-      Comparator.<HoodieInstant, String>comparing(instant -> getComparableAction(instant.getAction()));
+      Comparator.comparing(instant -> getComparableAction(instant.getAction()));
 
   public static final Comparator<HoodieInstant> COMPARATOR = Comparator.comparing(HoodieInstant::getTimestamp)
       .thenComparing(ACTION_COMPARATOR).thenComparing(HoodieInstant::getState);
 
-  public static final String getComparableAction(String action) {
-    return COMPARABLE_ACTIONS.containsKey(action) ? COMPARABLE_ACTIONS.get(action) : action;
+  public static String getComparableAction(String action) {
 
 Review comment:
   its redundant for static methods since they can't be overriden anyways, right?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373785185
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/versioning/compaction/CompactionV1MigrationHandler.java
 ##########
 @@ -59,14 +59,12 @@ public HoodieCompactionPlan downgradeFrom(HoodieCompactionPlan input) {
     final Path basePath = new Path(metaClient.getBasePath());
     List<HoodieCompactionOperation> v1CompactionOperationList = new ArrayList<>();
     if (null != input.getOperations()) {
-      v1CompactionOperationList = input.getOperations().stream().map(inp -> {
-        return HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
-            .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
-            .setDataFilePath(convertToV1Path(basePath, inp.getPartitionPath(), inp.getDataFilePath()))
-            .setDeltaFilePaths(inp.getDeltaFilePaths().stream()
-                .map(s -> convertToV1Path(basePath, inp.getPartitionPath(), s)).collect(Collectors.toList()))
-            .build();
-      }).collect(Collectors.toList());
+      v1CompactionOperationList = input.getOperations().stream().map(inp -> HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
+          .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
 
 Review comment:
   OK, just make sure. It does not matter.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] vinothchandar commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-575491467
 
 
   @yanghua Are you able to do a quick review of this? I am dealing with some other release blockers.. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368210892
 
 

 ##########
 File path: hudi-common/src/test/java/org/apache/hudi/common/util/TestFSUtils.java
 ##########
 @@ -74,7 +74,7 @@ public void testMaskFileName() {
   }
 
   @Test
-  /**
 
 Review comment:
   ok

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373779086
 
 

 ##########
 File path: hudi-common/src/test/java/org/apache/hudi/common/table/log/TestHoodieLogFormat.java
 ##########
 @@ -414,7 +414,6 @@ public void testBasicAppendAndRead() throws IOException, URISyntaxException, Int
         dataBlockRead.getRecords().size());
     assertEquals("Both records lists should be the same. (ordering guaranteed)", copyOfRecords1,
         dataBlockRead.getRecords());
-    assertEquals(dataBlockRead.getSchema(), getSimpleSchema());
 
 Review comment:
   Would you mind to explain why we should to remove this assertion?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368196473
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
 ##########
 @@ -171,15 +167,15 @@
           return false;
         })
         .collect(Collectors.joining(","));
-    if (incrementalInputPaths == null || incrementalInputPaths.isEmpty()) {
 
 Review comment:
   `incrementalInputPaths` won't be null?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373762695
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstant.java
 ##########
 @@ -45,13 +45,13 @@
       .put(HoodieTimeline.COMPACTION_ACTION, HoodieTimeline.COMMIT_ACTION).build();
 
   public static final Comparator<HoodieInstant> ACTION_COMPARATOR =
-      Comparator.<HoodieInstant, String>comparing(instant -> getComparableAction(instant.getAction()));
+      Comparator.comparing(instant -> getComparableAction(instant.getAction()));
 
   public static final Comparator<HoodieInstant> COMPARATOR = Comparator.comparing(HoodieInstant::getTimestamp)
       .thenComparing(ACTION_COMPARATOR).thenComparing(HoodieInstant::getState);
 
-  public static final String getComparableAction(String action) {
-    return COMPARABLE_ACTIONS.containsKey(action) ? COMPARABLE_ACTIONS.get(action) : action;
+  public static String getComparableAction(String action) {
 
 Review comment:
   keep the `final ` as is ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368200919
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
 ##########
 @@ -171,15 +167,15 @@
           return false;
         })
         .collect(Collectors.joining(","));
-    if (incrementalInputPaths == null || incrementalInputPaths.isEmpty()) {
 
 Review comment:
   good catch- fixed it

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-575681740
 
 
   > @smarthi Can you fix the conflicts firstly?
   
   done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373781774
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/versioning/compaction/CompactionV1MigrationHandler.java
 ##########
 @@ -59,14 +59,12 @@ public HoodieCompactionPlan downgradeFrom(HoodieCompactionPlan input) {
     final Path basePath = new Path(metaClient.getBasePath());
     List<HoodieCompactionOperation> v1CompactionOperationList = new ArrayList<>();
     if (null != input.getOperations()) {
-      v1CompactionOperationList = input.getOperations().stream().map(inp -> {
-        return HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
-            .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
-            .setDataFilePath(convertToV1Path(basePath, inp.getPartitionPath(), inp.getDataFilePath()))
-            .setDeltaFilePaths(inp.getDeltaFilePaths().stream()
-                .map(s -> convertToV1Path(basePath, inp.getPartitionPath(), s)).collect(Collectors.toList()))
-            .build();
-      }).collect(Collectors.toList());
+      v1CompactionOperationList = input.getOperations().stream().map(inp -> HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
+          .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
 
 Review comment:
   agreed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368210545
 
 

 ##########
 File path: hudi-common/src/test/java/org/apache/hudi/common/util/TestFSUtils.java
 ##########
 @@ -74,7 +74,7 @@ public void testMaskFileName() {
   }
 
   @Test
-  /**
 
 Review comment:
   Based on other test code style, we can let the `@Test` annotation close to the method so that we can keep `/**` here. WDYT?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368200929
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
 ##########
 @@ -108,9 +107,7 @@
       LOG.info("Found a total of " + groupedFileStatus.size() + " groups");
       for (Map.Entry<HoodieTableMetaClient, List<FileStatus>> entry : groupedFileStatus.entrySet()) {
         List<FileStatus> result = filterFileStatusForSnapshotMode(entry.getKey(), entry.getValue());
-        if (result != null) {
-          returns.addAll(result);
-        }
+        returns.addAll(result);
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373762908
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/util/BufferedRandomAccessFile.java
 ##########
 @@ -1,4 +1,4 @@
-/**
+/*
 
 Review comment:
   good catch

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-575963643
 
 
   > given the large number of files touched here, can we land this after the freeze..
   
   +1 to merge this after this release.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] vinothchandar commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-578534287
 
 
   @smarthi pursuant to the mailing list discussion on MINOR, can we please create a JIRA summarizing the great improvements being made here.. ? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368196367
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
 ##########
 @@ -108,9 +107,7 @@
       LOG.info("Found a total of " + groupedFileStatus.size() + " groups");
       for (Map.Entry<HoodieTableMetaClient, List<FileStatus>> entry : groupedFileStatus.entrySet()) {
         List<FileStatus> result = filterFileStatusForSnapshotMode(entry.getKey(), entry.getValue());
-        if (result != null) {
-          returns.addAll(result);
-        }
+        returns.addAll(result);
 
 Review comment:
   skip null result?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] hmatu commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
hmatu commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-575414268
 
 
   Big thanks for your work on this, but it's hard to review like https://github.com/apache/incubator-hudi/pull/1159.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi removed a comment on issue #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi removed a comment on issue #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-575681740
 
 
   > @smarthi Can you fix the conflicts firstly?
   
   done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yanghua commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-575513806
 
 
   @vinothchandar OK, will review this PR.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373780145
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/hive/HoodieCombineHiveInputFormat.java
 ##########
 @@ -522,12 +519,12 @@ public int hashCode() {
     perfLogger.PerfLogBegin(CLASS_NAME, PerfLogger.GET_SPLITS);
     init(job);
 
-    ArrayList<InputSplit> result = new ArrayList<InputSplit>();
+    ArrayList<InputSplit> result = new ArrayList<>();
 
 Review comment:
   We can also use `List<InputSplit> result = new ArrayList<>();` to follow the same code style in this file. WDYT?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368196277
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieHiveUtil.java
 ##########
 @@ -71,18 +71,13 @@ public static Path getNthParent(Path path, int n) {
   public static List<String> getIncrementalTableNames(JobContext job) {
     Map<String, String> tablesModeMap = job.getConfiguration()
         .getValByRegex(HOODIE_CONSUME_MODE_PATTERN_STRING.pattern());
-    List<String> result = tablesModeMap.entrySet().stream().map(s -> {
+    return tablesModeMap.entrySet().stream().map(s -> {
 
 Review comment:
   would it return null? we need avoid return null.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] smarthi commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
smarthi commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373785081
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/common/versioning/compaction/CompactionV1MigrationHandler.java
 ##########
 @@ -59,14 +59,12 @@ public HoodieCompactionPlan downgradeFrom(HoodieCompactionPlan input) {
     final Path basePath = new Path(metaClient.getBasePath());
     List<HoodieCompactionOperation> v1CompactionOperationList = new ArrayList<>();
     if (null != input.getOperations()) {
-      v1CompactionOperationList = input.getOperations().stream().map(inp -> {
-        return HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
-            .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
-            .setDataFilePath(convertToV1Path(basePath, inp.getPartitionPath(), inp.getDataFilePath()))
-            .setDeltaFilePaths(inp.getDeltaFilePaths().stream()
-                .map(s -> convertToV1Path(basePath, inp.getPartitionPath(), s)).collect(Collectors.toList()))
-            .build();
-      }).collect(Collectors.toList());
+      v1CompactionOperationList = input.getOperations().stream().map(inp -> HoodieCompactionOperation.newBuilder().setBaseInstantTime(inp.getBaseInstantTime())
+          .setFileId(inp.getFileId()).setPartitionPath(inp.getPartitionPath()).setMetrics(inp.getMetrics())
 
 Review comment:
   yes - too tired and busy now - and u have a point about code readability.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] vinothchandar commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-575926740
 
 
   given the large number of files touched here, can we land this after the freeze..  
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [HUDI-583] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r373764588
 
 

 ##########
 File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/hive/HoodieCombineHiveInputFormat.java
 ##########
 @@ -356,25 +357,21 @@ public int hashCode() {
     HoodieCombineHiveInputFormat.HoodieCombineFileInputFormatShim combine =
         new HoodieCombineHiveInputFormat.HoodieCombineFileInputFormatShim();
 
-    InputSplit[] splits = null;
-    if (combine == null) {
-      splits = super.getSplits(job, numSplits);
-      return splits;
-    }
 
 Review comment:
   good catch, combine == null is always false.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
leesf commented on a change in pull request #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#discussion_r368187882
 
 

 ##########
 File path: hudi-common/src/main/java/org/apache/hudi/avro/MercifulJsonConverter.java
 ##########
 @@ -286,7 +284,7 @@ private static JsonToAvroFieldProcessor generateArrayTypeHandler() {
       public Pair<Boolean, Object> convert(Object value, String name, Schema schema)
           throws HoodieJsonToAvroConversionException {
         Schema elementSchema = schema.getElementType();
-        List listRes = new ArrayList();
+        List listRes = new ArrayList<>();
 
 Review comment:
   List<Object> listRes = new ArrayList<>(); ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] yanghua commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1237: [MINOR] Code Cleanup, remove redundant code, and other changes
URL: https://github.com/apache/incubator-hudi/pull/1237#issuecomment-575522890
 
 
   @smarthi Can you fix the conflicts firstly?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services