You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/07/01 09:30:13 UTC

[GitHub] [hbase] wchevreuil opened a new pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

wchevreuil opened a new pull request #2009:
URL: https://github.com/apache/hbase/pull/2009


   …s above VERSIONS limit
   
   While going through some other delete marker issue, I had found out this issue is still present on master branch. Had rebased the original patch and pushed a PR. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 edited a comment on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache9 edited a comment on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656132049


   > I guess both are solved by HBASE-15968 solution?
   
   Theoretically,  HBASE-15968 can solve the problem.
   But HBASE-15968 itself is a PITA.
   The implementation is much more complicated, and can impact performance. We used to also have customers wanted to solve the problem but AFAIK, we haven't actually enabled it on any of our production cluster(@infraio correct me if I'm wrong). So I do not think the feature is stable enough...
   The problem is that, if you only use normal read/write/delete, you do not need this 'new behavior', everything is fine. The effort on changing the behavior will likely effect the performance...
   
   For the customers who have special usage, for example, set keepDeletedCells, need more than one versions, want to delete a specific version, and which is the most important, set the timestamp manually and the timestamp is not always increasing, the 'new behavior' can solve the problem, but usually it could also be solved by the customers themselves, by changing their usage to avoid hittin the bad cases...
   
   So if users meet the problem, I will suggest them to go to HBASE-15968, but usually, they will just change their usage...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-652328261


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m  6s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m  7s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 16s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m  8s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 45s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 15s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  1s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  12m 21s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 15s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 13s |  The patch does not generate ASF License warnings.  |
   |  |   |  36m  5s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2009 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 1006133a1a0a 4.15.0-91-generic #92-Ubuntu SMP Fri Feb 28 11:09:48 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 9ad16aa376 |
   | Max. process+thread count | 84 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#discussion_r449863611



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,87 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  private void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,
+      byte[] byteNow, List<Cell> deleteCells) throws IOException {
+    List<Cell> result = new ArrayList<>(deleteCells);
+    Scan scan = new Scan(get);
+    scan.setRaw(true);
+    this.getScanner(scan).next(result);
+    List<Cell> cells = new ArrayList<>();
     if (result.size() < count) {
       // Nothing to delete
       PrivateCellUtil.updateLatestStamp(cell, byteNow);
-      return;
-    }
-    if (result.size() > count) {
-      throw new RuntimeException("Unexpected size: " + result.size());
+      cells.add(cell);
+      deleteCells.addAll(cells);
+    } else if (result.size() > count) {
+      int currentVersion = 0;
+      long latestCellTS = Long.MAX_VALUE;
+      result.sort((cell1, cell2) -> {
+        if(cell1.getTimestamp()>cell2.getTimestamp()){
+          return -1;
+        } else if(cell1.getTimestamp()<cell2.getTimestamp()){
+          return 1;
+        } else {
+          if(CellUtil.isDelete(cell1)){
+            return -1;
+          } else if (CellUtil.isDelete(cell2)){
+            return 1;
+          }
+        }
+        return 0;
+      });
+      for(Cell getCell : result){
+        if(!(CellUtil.matchingFamily(getCell, cell) && CellUtil.matchingQualifier(getCell, cell))){
+          continue;
+        }
+        if(!PrivateCellUtil.isDeleteType(getCell) && getCell.getTimestamp()!=latestCellTS){
+          if (currentVersion >= maxVersions) {
+            Cell tempCell = null;
+            try {
+              tempCell = PrivateCellUtil.deepClone(cell);
+            } catch (CloneNotSupportedException e) {
+              throw new IOException(e);
+            }
+            PrivateCellUtil.setTimestamp(tempCell, getCell.getTimestamp());
+            cells.add(tempCell);
+          } else if (currentVersion == 0) {
+            PrivateCellUtil.setTimestamp(cell, getCell.getTimestamp());
+            cells.add(cell);
+          }
+          currentVersion++;
+        }
+        latestCellTS = getCell.getTimestamp();
+      }
+
+    } else {
+      Cell getCell = result.get(0);

Review comment:
       I am not sure if sorting of result is required here. If not required, rest looks good.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-652424070


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 28s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m  3s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m 33s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 46s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 33s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 13s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 43s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 51s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 214m 10s |  hbase-server in the patch failed.  |
   |  |   | 243m 24s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2009 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux cd9b83d2672a 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 9ad16aa376 |
   | Default Java | 2020-01-14 |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/testReport/ |
   | Max. process+thread count | 3042 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656767138


   > > The only way to completely solve all the related problems is HBASE-15968, where we also consider MVCC so earlier modification will not effect later modifications.
   > 
   > Agreed. But we need more test and usage case for this feature. @wchevreuil Do you meet same case for this problem? You can take a try for it. It will be better if we can foucs on one same solution. And for the document, I am +1 to update it and make it more clearly.
   
   I agree we should aim towards a unique solution. Is HBASE-15968 not considered production ready because of performance concerns? And since "versions" feature behaves inconsistently without HBASE-15968, shouldn't we recall it as not production ready as well? I can work on documenting the inconsistencies caused by versions and related delete markers, meanwhile. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on a change in pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on a change in pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#discussion_r449650052



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,

Review comment:
       Had not done previously because original method modified was package private. Doing now.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,
+      byte[] byteNow, List<Cell> deleteCells) throws IOException {
+    List<Cell> result = new ArrayList<>();
+    result.addAll(deleteCells);

Review comment:
       Yes.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on a change in pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on a change in pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#discussion_r450329196



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,87 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  private void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,
+      byte[] byteNow, List<Cell> deleteCells) throws IOException {
+    List<Cell> result = new ArrayList<>(deleteCells);
+    Scan scan = new Scan(get);
+    scan.setRaw(true);
+    this.getScanner(scan).next(result);
+    List<Cell> cells = new ArrayList<>();
     if (result.size() < count) {
       // Nothing to delete
       PrivateCellUtil.updateLatestStamp(cell, byteNow);
-      return;
-    }
-    if (result.size() > count) {
-      throw new RuntimeException("Unexpected size: " + result.size());
+      cells.add(cell);
+      deleteCells.addAll(cells);
+    } else if (result.size() > count) {
+      int currentVersion = 0;
+      long latestCellTS = Long.MAX_VALUE;
+      result.sort((cell1, cell2) -> {
+        if(cell1.getTimestamp()>cell2.getTimestamp()){
+          return -1;
+        } else if(cell1.getTimestamp()<cell2.getTimestamp()){
+          return 1;
+        } else {
+          if(CellUtil.isDelete(cell1)){
+            return -1;
+          } else if (CellUtil.isDelete(cell2)){
+            return 1;
+          }
+        }
+        return 0;
+      });
+      for(Cell getCell : result){
+        if(!(CellUtil.matchingFamily(getCell, cell) && CellUtil.matchingQualifier(getCell, cell))){
+          continue;
+        }
+        if(!PrivateCellUtil.isDeleteType(getCell) && getCell.getTimestamp()!=latestCellTS){
+          if (currentVersion >= maxVersions) {
+            Cell tempCell = null;
+            try {
+              tempCell = PrivateCellUtil.deepClone(cell);
+            } catch (CloneNotSupportedException e) {
+              throw new IOException(e);
+            }
+            PrivateCellUtil.setTimestamp(tempCell, getCell.getTimestamp());
+            cells.add(tempCell);
+          } else if (currentVersion == 0) {
+            PrivateCellUtil.setTimestamp(cell, getCell.getTimestamp());
+            cells.add(cell);
+          }
+          currentVersion++;
+        }
+        latestCellTS = getCell.getTimestamp();
+      }
+
+    } else {
+      Cell getCell = result.get(0);

Review comment:
       It's not needed, because we don't have to worry about additional versions, we only need to put a single marker for current TS.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656193610


   The only way to completely solve all the related problems is HBASE-15968, where we also consider MVCC so earlier modification will not effect later modifications. All other solutions can lead to inconsistent results under some strange scenarios...
   
   Agree that we should improve the document. I think we should provide some suggested ways to solve some strange behaviors. For example, the fix in this PR could also be done from client side right?
   
   And on HBASE-15968, let's wait some time for the input from @infraio .
   
   Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#discussion_r448324924



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,
+      byte[] byteNow, List<Cell> deleteCells) throws IOException {
+    List<Cell> result = new ArrayList<>();
+    result.addAll(deleteCells);
+    Scan scan = new Scan(get);
+    scan.setRaw(true);
+    this.getScanner(scan).next(result);
+    result.sort((cell1, cell2) -> {
+      if(cell1.getTimestamp()>cell2.getTimestamp()){
+        return -1;
+      } else if(cell1.getTimestamp()<cell2.getTimestamp()){
+        return 1;
+      } else {
+        if(CellUtil.isDelete(cell1)){
+          return -1;
+        } else if (CellUtil.isDelete(cell2)){
+          return 1;
+        }
+      }
+      return 0;
+    });
+    List<Cell> cells = new ArrayList<>();
     if (result.size() < count) {

Review comment:
       result sorting doesn't seem useful for this condition. Can we avoid sorting for this?
   ```
   if (result.size() < count) {
   ..
   ..
     deleteCells.addAll(cells);
     return;
   }
   
   result.sort();
   
   if (result.size() > count){
   ..
   ..
   }else{
   ..
   }
   deleteCells.addAll(cells);
   ```

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,

Review comment:
       nit: `private` ?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,
+      byte[] byteNow, List<Cell> deleteCells) throws IOException {
+    List<Cell> result = new ArrayList<>();
+    result.addAll(deleteCells);

Review comment:
       nit: we can wrap it with ArrayList constructor: `new ArrayList<>(deleteCells);`

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }

Review comment:
       Would you prefer using a boolean to make single call to updateDeleteLatestVersionTimestamp?
   ```
             boolean updateDelTs=false;
             if (coprocessorHost != null) {
               if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                   byteNow, get)) {
                 updateDelTs=true;
               }
             } else {
               updateDelTs=true;
             }
             if(updateDelTs){
               updateDeleteLatestVersionTimestamp(cell, get, count,
                 this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
                 byteNow, deleteCells);
             }
   ```
   Only if you feel this is more readable :)

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,
+      byte[] byteNow, List<Cell> deleteCells) throws IOException {
+    List<Cell> result = new ArrayList<>();
+    result.addAll(deleteCells);
+    Scan scan = new Scan(get);
+    scan.setRaw(true);
+    this.getScanner(scan).next(result);
+    result.sort((cell1, cell2) -> {
+      if(cell1.getTimestamp()>cell2.getTimestamp()){
+        return -1;
+      } else if(cell1.getTimestamp()<cell2.getTimestamp()){
+        return 1;
+      } else {
+        if(CellUtil.isDelete(cell1)){
+          return -1;
+        } else if (CellUtil.isDelete(cell2)){
+          return 1;
+        }
+      }
+      return 0;
+    });
+    List<Cell> cells = new ArrayList<>();
     if (result.size() < count) {
       // Nothing to delete
       PrivateCellUtil.updateLatestStamp(cell, byteNow);
-      return;
-    }
-    if (result.size() > count) {
-      throw new RuntimeException("Unexpected size: " + result.size());
+      cells.add(cell);
+      deleteCells.addAll(cells);
+    } else if (result.size() > count) {
+      int currentVersion = 0;
+      long latestCellTS = Long.MAX_VALUE;
+      for(Cell getCell : result){
+        if(!(CellUtil.matchingFamily(getCell, cell) && CellUtil.matchingQualifier(getCell, cell))){
+          continue;
+        }
+        if(!PrivateCellUtil.isDeleteType(getCell) && getCell.getTimestamp()!=latestCellTS){
+          if (currentVersion >= maxVersions) {
+            Cell tempCell = null;
+            try {
+              tempCell = PrivateCellUtil.deepClone(cell);
+            } catch (CloneNotSupportedException e) {
+              throw new IOException(e);
+            }
+            PrivateCellUtil.setTimestamp(tempCell, getCell.getTimestamp());
+            cells.add(tempCell);
+          } else if (currentVersion == 0) {
+            PrivateCellUtil.setTimestamp(cell, getCell.getTimestamp());
+            cells.add(cell);
+          }
+          currentVersion++;
+        }
+        latestCellTS = getCell.getTimestamp();
+      }
+
+    } else {
+      Cell getCell = result.get(0);
+      PrivateCellUtil.setTimestamp(cell, getCell.getTimestamp());
+      cells.add(cell);

Review comment:
       Should we not deepClone cell, update timestamp and then add that cloned cell to cells list similar to above branch?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-653668004


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   6m 30s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m  4s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 57s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m  2s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 45s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 56s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 56s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m  2s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 201m 30s |  hbase-server in the patch passed.  |
   |  |   | 232m 33s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2009 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 8036fa02d1c8 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / e614b89c33 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/testReport/ |
   | Max. process+thread count | 2858 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-653625358


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   6m 36s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 53s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 16s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m  6s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 44s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  12m 20s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 16s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 11s |  The patch does not generate ASF License warnings.  |
   |  |   |  41m  2s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2009 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 0cfce759e350 4.15.0-91-generic #92-Ubuntu SMP Fri Feb 28 11:09:48 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / e614b89c33 |
   | Max. process+thread count | 84 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on a change in pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on a change in pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#discussion_r449650680



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,
+      byte[] byteNow, List<Cell> deleteCells) throws IOException {
+    List<Cell> result = new ArrayList<>();
+    result.addAll(deleteCells);
+    Scan scan = new Scan(get);
+    scan.setRaw(true);
+    this.getScanner(scan).next(result);
+    result.sort((cell1, cell2) -> {
+      if(cell1.getTimestamp()>cell2.getTimestamp()){
+        return -1;
+      } else if(cell1.getTimestamp()<cell2.getTimestamp()){
+        return 1;
+      } else {
+        if(CellUtil.isDelete(cell1)){
+          return -1;
+        } else if (CellUtil.isDelete(cell2)){
+          return 1;
+        }
+      }
+      return 0;
+    });
+    List<Cell> cells = new ArrayList<>();
     if (result.size() < count) {
       // Nothing to delete
       PrivateCellUtil.updateLatestStamp(cell, byteNow);
-      return;
-    }
-    if (result.size() > count) {
-      throw new RuntimeException("Unexpected size: " + result.size());
+      cells.add(cell);
+      deleteCells.addAll(cells);
+    } else if (result.size() > count) {
+      int currentVersion = 0;
+      long latestCellTS = Long.MAX_VALUE;
+      for(Cell getCell : result){
+        if(!(CellUtil.matchingFamily(getCell, cell) && CellUtil.matchingQualifier(getCell, cell))){
+          continue;
+        }
+        if(!PrivateCellUtil.isDeleteType(getCell) && getCell.getTimestamp()!=latestCellTS){
+          if (currentVersion >= maxVersions) {
+            Cell tempCell = null;
+            try {
+              tempCell = PrivateCellUtil.deepClone(cell);
+            } catch (CloneNotSupportedException e) {
+              throw new IOException(e);
+            }
+            PrivateCellUtil.setTimestamp(tempCell, getCell.getTimestamp());
+            cells.add(tempCell);
+          } else if (currentVersion == 0) {
+            PrivateCellUtil.setTimestamp(cell, getCell.getTimestamp());
+            cells.add(cell);
+          }
+          currentVersion++;
+        }
+        latestCellTS = getCell.getTimestamp();
+      }
+
+    } else {
+      Cell getCell = result.get(0);
+      PrivateCellUtil.setTimestamp(cell, getCell.getTimestamp());
+      cells.add(cell);

Review comment:
       It's been a while since I originally proposed this patch, but IIRC we don't need it here, because there's no extra version to fabricate a new delete marker.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-652429756


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 12s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 59s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m 12s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 56s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 59s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 59s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m  3s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 227m 47s |  hbase-server in the patch failed.  |
   |  |   | 253m 39s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2009 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 83d023340484 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 9ad16aa376 |
   | Default Java | 1.8.0_232 |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/testReport/ |
   | Max. process+thread count | 2997 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656042766


   Oh, the PR has already been merged...
   
   I skimmed the comments on jira and also the patch, so the approach here is when we delete a specific version, we will go through all the old versions and added a delete marker for them if needed.
   
   This approach can solve a problem but will introduce other problems...
   What if then user manually added a new cell with the version you deleted? In general, the user should see the new cell but actually, it depends on whether we have done a major compaction?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] infraio commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
infraio commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656530633


   > And on HBASE-15968, let's wait some time for the input from @infraio .
   
   Not used in our production cluster now......


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656134402


   > Yeah, but it's the same current behaviour for manually deleted ones, right?
   
   No, for the scenario below they are not the same.
   
   For example, you have t1 > t2 > t3, and you want to keep 2 versions. You delete t2, in the old behavior, t3 will come back, which is strange, so in your PR here, you also issue a delete for t3, to make sure it will not come back. But then if you put a new value with t3, in the old behavior, you can see the new value, but with the PR here, you can not see it, as there is a delete markar for t3...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil merged pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil merged pull request #2009:
URL: https://github.com/apache/hbase/pull/2009


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on a change in pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on a change in pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#discussion_r449650148



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
##########
@@ -3170,37 +3171,88 @@ public void prepareDeleteTimestamps(Mutation mutation, Map<byte[], List<Cell>> f
           count = kvCount.get(qual);
 
           Get get = new Get(CellUtil.cloneRow(cell));
-          get.readVersions(count);
-          get.addColumn(family, qual);
+          get.readVersions(Integer.MAX_VALUE);
           if (coprocessorHost != null) {
             if (!coprocessorHost.prePrepareTimeStampForDeleteVersion(mutation, cell,
                 byteNow, get)) {
-              updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+              updateDeleteLatestVersionTimestamp(cell, get, count,
+                  this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                    byteNow, deleteCells);
+
             }
           } else {
-            updateDeleteLatestVersionTimestamp(cell, get, count, byteNow);
+            updateDeleteLatestVersionTimestamp(cell, get, count,
+                this.htableDescriptor.getColumnFamily(family).getMaxVersions(),
+                  byteNow, deleteCells);
           }
         } else {
           PrivateCellUtil.updateLatestStamp(cell, byteNow);
+          deleteCells.add(cell);
         }
       }
+      e.setValue(deleteCells);
     }
   }
 
-  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, byte[] byteNow)
-      throws IOException {
-    List<Cell> result = get(get, false);
-
+  void updateDeleteLatestVersionTimestamp(Cell cell, Get get, int count, int maxVersions,
+      byte[] byteNow, List<Cell> deleteCells) throws IOException {
+    List<Cell> result = new ArrayList<>();
+    result.addAll(deleteCells);
+    Scan scan = new Scan(get);
+    scan.setRaw(true);
+    this.getScanner(scan).next(result);
+    result.sort((cell1, cell2) -> {
+      if(cell1.getTimestamp()>cell2.getTimestamp()){
+        return -1;
+      } else if(cell1.getTimestamp()<cell2.getTimestamp()){
+        return 1;
+      } else {
+        if(CellUtil.isDelete(cell1)){
+          return -1;
+        } else if (CellUtil.isDelete(cell2)){
+          return 1;
+        }
+      }
+      return 0;
+    });
+    List<Cell> cells = new ArrayList<>();
     if (result.size() < count) {

Review comment:
       Yes.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil edited a comment on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil edited a comment on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656080107


   > Oh, the PR has already been merged...
   
   I can revert it back. This has been a recurring source of confusion and complains among our customer base, and having inconsistent results prior and after compaction doesn't seem right, maybe we should make HBASE-15968 solution default here?
   
   > I skimmed the comments on jira and also the patch, so the approach here is when we delete a specific version, we will go through all the old versions and added a delete marker for them if needed.
   > 
   > This approach can solve a problem but will introduce other problems...
   > What if then user manually added a new cell with the version you deleted? In general, the user should see the new cell but actually, it depends on whether we have done a major compaction?
   
   Yeah, but it's the same current behaviour for manually deleted ones, right? I guess both are solved by HBASE-15968 solution?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache9 commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache9 commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656132049


   > Yeah, but it's the same current behaviour for manually deleted ones, right? I guess both are solved by HBASE-15968 solution?
   
   Theoretically,  HBASE-15968 can solve the problem.
   But HBASE-15968 itself is a PITA.
   The implementation is much more complicated, and can impact performance. We used to also have customers wanted to solve the problem but AFAIK, we haven't actually enabled it on any of our production cluster(@infraio correct me if I'm wrong). So I do not think the feature is stable enough...
   The problem is that, if you only use normal read/write/delete, you do not need this 'new behavior', everything is fine. The effort on changing the behavior will likely effect the performance...
   
   For the customers who have special usage, for example, set keepDeletedCells, need more than one versions, want to delete a specific version, and which is the most important, set the timestamp manually and the timestamp is not always increasing, the 'new behavior' can solve the problem, but usually it could also be solved by the customers themselves, by changing their usage to avoid hittin the bad cases...
   
   So if users meet the problem, I will suggest them to go to HBASE-15968, but usually, they will just change their usage...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-653651861


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 31s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 12s |  master passed  |
   | +1 :green_heart: |  compile  |   1m  4s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 49s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 42s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 56s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  3s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 43s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 39s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 129m 25s |  hbase-server in the patch passed.  |
   |  |   | 155m  2s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2009 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux f460b8d40baf 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / e614b89c33 |
   | Default Java | 2020-01-14 |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/testReport/ |
   | Max. process+thread count | 4620 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2009/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656179536


   >So if users meet the problem, I will suggest them to go to HBASE-15968, but usually, they will just change their usage...
   
   Ok, had reverted it from master.
   
   > For example, you have t1 > t2 > t3, and you want to keep 2 versions. You delete t2, in the old behavior, t3 will come back, which is strange, so in your PR here, you also issue a delete for t3, to make sure it will not come back. But then if you put a new value with t3, in the old behavior, you can see the new value, but with the PR here, you can not see it, as there is a delete marker for t3...
   
   Yeah, got your point, what I meant was that currently you already have same problem for the deleted t2, in your example, where you can't bring it back until the delete marker is gone. And for both cases, keepDeletedCells is a problem.
   
   There's another problem reported (not solved on this PR), when you put a Delete for a TS larger than all existing versions, it has actually no effects, not filtering out any of the older versions, because VERSION_DELETE requires delete TS equality, however, documentation states:
   
   `For example, let’s suppose we want to delete a row. For this you can specify a version, or else by default the currentTimeMillis is used. What this means is delete all cells where the version is less than or equal to this version. `
   
   It could be fixed in a similar way, by fabricating individual delete markers for all existing versions older than the specified delete timestamp, or we can just assume this is correct, then update documentation accordingly.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656080107


   > Oh, the PR has already been merged...
   I can revert it back. This has been a recurring source of confusion and complains among our customer base, and having inconsistent results prior and after compaction doesn't seem right, maybe we should make HBASE-15968 solution default here?
   
   > I skimmed the comments on jira and also the patch, so the approach here is when we delete a specific version, we will go through all the old versions and added a delete marker for them if needed.
   > 
   > This approach can solve a problem but will introduce other problems...
   > What if then user manually added a new cell with the version you deleted? In general, the user should see the new cell but actually, it depends on whether we have done a major compaction?
   Yeah, but it's the same current behaviour for manually deleted ones, right? I guess both are solved by HBASE-15968 solution?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] infraio commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
infraio commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-656533910


   > The only way to completely solve all the related problems is HBASE-15968, where we also consider MVCC so earlier modification will not effect later modifications. 
   
   Agreed. But we need more test and usage case for this feature. @wchevreuil Do you meet same case for this problem? You can take a try for it. It will be better if we can foucs on one same solution. And for the document, I am +1 to update it and make it more clearly.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on pull request #2009: HBASE-21596 Delete for a specific cell version can bring back version…

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on pull request #2009:
URL: https://github.com/apache/hbase/pull/2009#issuecomment-653614684


   Addressed @virajjasani review comments. Fixed previous failed UTs, as those were validating the wrong behaviour. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org