You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/11/02 18:37:45 UTC

[GitHub] [accumulo] milleruntime opened a new pull request #2342: MinC Bug when Iterator returns null

milleruntime opened a new pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342


   This PR creates an IT to expose a bug introduced in #1605. If a flush doesn't produce any entries in `DataFileValue`, then a `StoredTabletFile` won't be created, since nothing was inserted into the metadata, and the newFile object will be null. Refactoring was done in `DatafileManager` to the bringMinorCompactionOnline() method so that it now returns the newFile. But this file could be null, if nothing was written to the metadata table. I am not sure what should be done to fix this issue but I was at least was able to reproduce it in an IT. This bug is particularly nasty since Accumulo thinks the flush succeeds but the error will be thrown towards the end of the process and the file never gets created.
   
   Server error printed in tablet server log:
   <pre>
   2021-11-02T13:23:03,134 [tablet.MinorCompactionTask] ERROR: Unknown error during minor compaction for extent: 1<<
   java.lang.RuntimeException: Exception occurred during minor compaction on 1<<
           at org.apache.accumulo.tserver.tablet.Tablet.minorCompact(Tablet.java:820) ~[classes/:?]
           at org.apache.accumulo.tserver.tablet.MinorCompactionTask.run(MinorCompactionTask.java:99) [classes/:?]
           at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57) [htrace-core-3.2.0-incubating.jar:3.2.0-incubating]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
           at java.lang.Thread.run(Thread.java:829) [?:?]
   Caused by: java.lang.NullPointerException
           at java.util.Objects.requireNonNull(Objects.java:221) ~[?:?]
           at java.util.ImmutableCollections$List12.<init>(ImmutableCollections.java:372) ~[?:?]
           at java.util.List.of(List.java:807) ~[?:?]
           at org.apache.accumulo.tserver.tablet.Tablet.minorCompact(Tablet.java:814) ~[classes/:?]
           ... 5 more
   </pre>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] EdColeman commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
EdColeman commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961358442






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745647289



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +219,30 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  static long checkFlushId(AccumuloClient c, String tableName, long prevFlushID) throws Exception {
+    try (Scanner scanner = c.createScanner(MetadataTable.NAME, Authorizations.EMPTY)) {
+      String tableId = c.tableOperations().tableIdMap().get(tableName);
+      scanner.setRange(new Range(new Text(tableId + ";"), true, new Text(tableId + "<"), true));
+      scanner.fetchColumnFamily(TabletsSection.ServerColumnFamily.NAME);
+      TabletsSection.ServerColumnFamily.FLUSH_COLUMN.fetch(scanner);
+
+      long flushId = -1L;
+      for (Entry<Key,Value> entry : scanner) {
+        Key key = entry.getKey();
+        String val = entry.getValue().toString();
+
+        if (key.getColumnQualifierData().toString().equals(FLUSH_QUAL)) {
+          flushId = Long.parseLong(val);

Review comment:
       If there are multiple tablets in a table, then those tablets could possible have diff flush ids (they probably should not).  The code seems like it takes the flush id of the last tablet.  Should it check that all tablet flush ids are the same?  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745891296



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +221,34 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  /**
+   * Verify that flush ID gets updated properly and is the same for all tablets.
+   */
+  static long checkFlushId(ClientContext c, TableId tableId, long prevFlushID) throws Exception {
+    try (var metaScan = c.getAmple().readTablets().forTable(tableId).fetch(FLUSH_ID).build()) {

Review comment:
       Realized one more thing, I think this may only return tablets were the column is present.  If a tablet didn't have this column, then it may not even show up (like there would not even be tablet metadata entry of not present).  Because we are only fetching the flush_id column in the underly scan, tablet w/o that column do not show up.  Adding the check consistency will make it fetch the prev row col also which will make tablets w/o the column visible.  Also get the benefit of ensure the tablets are consistent.
   
   ```suggestion
       try (var metaScan = c.getAmple().readTablets().forTable(tableId).fetch(FLUSH_ID).checkConsistency().build()) {
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745671816



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +219,30 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  static long checkFlushId(AccumuloClient c, String tableName, long prevFlushID) throws Exception {
+    try (Scanner scanner = c.createScanner(MetadataTable.NAME, Authorizations.EMPTY)) {
+      String tableId = c.tableOperations().tableIdMap().get(tableName);
+      scanner.setRange(new Range(new Text(tableId + ";"), true, new Text(tableId + "<"), true));
+      scanner.fetchColumnFamily(TabletsSection.ServerColumnFamily.NAME);
+      TabletsSection.ServerColumnFamily.FLUSH_COLUMN.fetch(scanner);
+
+      long flushId = -1L;
+      for (Entry<Key,Value> entry : scanner) {
+        Key key = entry.getKey();
+        String val = entry.getValue().toString();
+
+        if (key.getColumnQualifierData().toString().equals(FLUSH_QUAL)) {
+          flushId = Long.parseLong(val);

Review comment:
       Yeah that is a good idea.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-958672993


   @milleruntime Does this IT currently fail, then?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745891296



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +221,34 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  /**
+   * Verify that flush ID gets updated properly and is the same for all tablets.
+   */
+  static long checkFlushId(ClientContext c, TableId tableId, long prevFlushID) throws Exception {
+    try (var metaScan = c.getAmple().readTablets().forTable(tableId).fetch(FLUSH_ID).build()) {

Review comment:
       Realized one more thing, I think this may only return tablets were the column is present.  If a tablet didn't have this column, then it may not even show up (like there would not even be tablet metadata entry of not present).  Because we are only fetching the flush_id column in the underly scan, tablet w/o that column do not show up.  Adding the check consistency will make it fetch the prev row col also which will make tablets w/o the column show up in the scan.  Also get the benefit of ensure the tablets are consistent.
   
   ```suggestion
       try (var metaScan = c.getAmple().readTablets().forTable(tableId).fetch(FLUSH_ID).checkConsistency().build()) {
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745945462



##########
File path: core/src/main/java/org/apache/accumulo/core/logging/TabletLogger.java
##########
@@ -143,8 +145,11 @@ public static void compacted(KeyExtent extent, CompactionJob job, TabletFile out
         asFileNames(job.getFiles()));
   }
 
-  public static void flushed(KeyExtent extent, TabletFile newDatafile) {
-    fileLog.debug("Flushed {} created {} from [memory]", extent, newDatafile);
+  public static void flushed(KeyExtent extent, Optional<StoredTabletFile> newDatafile) {
+    if (newDatafile.isPresent())
+      fileLog.debug("Flushed {} created {} from [memory]", extent, newDatafile.get());
+    else
+      fileLog.debug("Flushed {} from [memory] but no file was written.", extent);

Review comment:
       This is a really nice improvement.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745011531



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java
##########
@@ -811,7 +812,7 @@ DataFileValue minorCompact(InMemoryMap memTable, TabletFile tmpDatafile, TabletF
         var storedFile = getDatafileManager().bringMinorCompactionOnline(tmpDatafile, newDatafile,
             new DataFileValue(stats.getFileSize(), stats.getEntriesWritten()), commitSession,
             flushId);
-        compactable.filesAdded(true, List.of(storedFile));
+        storedFile.ifPresent(stf -> compactable.filesAdded(true, List.of(stf)));

Review comment:
       Just to make sure I understand this original issue, was null being passed to List.of() what caused the problem?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-958991447






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-959243094


   This test also fails on 2.0.1. It seems like the behavior (the "file" column won't be put in the metadata table if there are no entries in the file.) is still the same. So we might not need this IT and I may just change this to fix the NPE introduced in #1605.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745647289



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +219,30 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  static long checkFlushId(AccumuloClient c, String tableName, long prevFlushID) throws Exception {
+    try (Scanner scanner = c.createScanner(MetadataTable.NAME, Authorizations.EMPTY)) {
+      String tableId = c.tableOperations().tableIdMap().get(tableName);
+      scanner.setRange(new Range(new Text(tableId + ";"), true, new Text(tableId + "<"), true));
+      scanner.fetchColumnFamily(TabletsSection.ServerColumnFamily.NAME);
+      TabletsSection.ServerColumnFamily.FLUSH_COLUMN.fetch(scanner);
+
+      long flushId = -1L;
+      for (Entry<Key,Value> entry : scanner) {
+        Key key = entry.getKey();
+        String val = entry.getValue().toString();
+
+        if (key.getColumnQualifierData().toString().equals(FLUSH_QUAL)) {
+          flushId = Long.parseLong(val);

Review comment:
       If there are multiple tablets in a table, then those tablets could possible have diff flush ids (they probably should not).  The code seems like it take the flush id of the last tablet.  Should it check that all tablet flush ids are the same?  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961338048


   These changes do fix the error though the logging is still confusing. The tserver logs still show the flush going through like there was no issue. While ingesting data I can see the entries being increases in the monitor but then they will eventually be removed and then will go back to 1 entry after a flush. Should there be additional logging to track what is going on otherwise the results are still kind of confusing?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] EdColeman commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
EdColeman commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962028714


   > I don't think the number of entries ingested through bulk import show up the monitor. 
   
   The entities counts for bulk ingest should show up after a compaction - I believe the numbers are pulled from the metadata file colFam for example:
   
   > 3< file:/default_tablet/F000009y.rf []    186,1
   
   would be 186 bytes with 1 entry.  On bulk import, the size is known, but the entity count is 0.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745097805



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +219,30 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  static long checkFlushId(AccumuloClient c, String tableName, long prevFlushID) throws Exception {
+    try (Scanner scanner = c.createScanner(MetadataTable.NAME, Authorizations.EMPTY)) {

Review comment:
       Could use ample for this instead of a scanner.  Not sure if the following compiles or works, just typed it up as a quick example.
   
   ```java
   try(TabletsMetadata tablets = ((ClientContext)c).getAmple().readTablets(tableId).fetch(FLUSH_ID).build())){
         for(TabletMetadata tablet : tablets) {
             tablet.getFlushId();
          }      
   
   
   }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-958672993


   @milleruntime Does this IT currently fail, then?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745042747



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java
##########
@@ -811,7 +812,7 @@ DataFileValue minorCompact(InMemoryMap memTable, TabletFile tmpDatafile, TabletF
         var storedFile = getDatafileManager().bringMinorCompactionOnline(tmpDatafile, newDatafile,
             new DataFileValue(stats.getFileSize(), stats.getEntriesWritten()), commitSession,
             flushId);
-        compactable.filesAdded(true, List.of(storedFile));
+        storedFile.ifPresent(stf -> compactable.filesAdded(true, List.of(stf)));

Review comment:
       Yes. A NPE would be thrown. I don't know if it affected anything with compactions since it only happens when no file gets written.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745630432



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +219,30 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  static long checkFlushId(AccumuloClient c, String tableName, long prevFlushID) throws Exception {
+    try (Scanner scanner = c.createScanner(MetadataTable.NAME, Authorizations.EMPTY)) {
+      String tableId = c.tableOperations().tableIdMap().get(tableName);
+      scanner.setRange(new Range(new Text(tableId + ";"), true, new Text(tableId + "<"), true));
+      scanner.fetchColumnFamily(TabletsSection.ServerColumnFamily.NAME);
+      TabletsSection.ServerColumnFamily.FLUSH_COLUMN.fetch(scanner);
+
+      long flushId = -1L;
+      for (Entry<Key,Value> entry : scanner) {
+        Key key = entry.getKey();
+        String val = entry.getValue().toString();
+
+        if (key.getColumnQualifierData().toString().equals(FLUSH_QUAL)) {
+          flushId = Long.parseLong(val);

Review comment:
       Are you asking if we should check if the flush ids are different for all the tablets?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-959775697


   My last updates change the methods to return an `Optional<StoredTabletFile>` to indicate whether the file was written to metadata or not. This should prevent any future coding errors with that object being null. I also rewrote the IT to check this behavior since I don't think we have any tests like that yet. I don't think it is a critical test but it only takes ~20 seconds to run on my machine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961217575






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962013768






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961338048






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-964326867


   @Manno15 @keith-turner I was in the process of updating this PR based on your feedback and noticed that the tablet logger is deceptive when no entries in the file. With no entries, the temp file will get dropped before the rename so I added this temporary log statement to prove it...
   Proof of concept debug line I added to DataFileManager.bringMinorCompactionOnline():
   <pre>
   if (dfv.getNumEntries() == 0) {
       log.debug("Delete file cause its gonna be empty {}", tmpDatafile);
       vm.deleteRecursively(tmpDatafile.getPath());
   }
   </pre>
   From the logs:
   <pre>
   2021-11-09T11:25:47,482 [tablet.DatafileManager] DEBUG: Delete file cause its gonna be empty file:.../accumulo/tables/1/t-0000057/F0000002.rf_tmp
   2021-11-09T11:25:47,489 [tablet.files] DEBUG: Flushed 1;s;a created file:.../accumulo/tables/1/t-0000057/F0000002.rf from [memory]
   </pre>
   And the more I thought about it, this could be really tricky for an admin trying to debug an issue with a file. The File name is a nice unique thing to grep for in logs and this could be very misleading when troubleshooting. Since I am changing the type returned in this method, I think I am going to modify the `tablet.files` log statement for flush to only include the file when it gets stored in the metadata.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745891296



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +221,34 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  /**
+   * Verify that flush ID gets updated properly and is the same for all tablets.
+   */
+  static long checkFlushId(ClientContext c, TableId tableId, long prevFlushID) throws Exception {
+    try (var metaScan = c.getAmple().readTablets().forTable(tableId).fetch(FLUSH_ID).build()) {

Review comment:
       Realized one more thing, I think this may only return tablets were the column is present.  If a tablet didn't have this column, then it may not even show up (like there would not even be tablet metadate entry of not present).  Because we are only fetching the flush_id column in the underly scan, tablet w/o that column do not show up.  Adding the check consistency will make it fetch the prev row col also which will make tablets w/o the column visible.  Also get the benefit of ensure the tablets are consistent.
   
   ```suggestion
       try (var metaScan = c.getAmple().readTablets().forTable(tableId).fetch(FLUSH_ID).checkConsistency().build()) {
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962086055


   > If a user misconfigures an iterator but doesn't realize it, they might have a tough time determining what is going on in their system. 
   
   This has been a tricky part of iterators for some time. I think the accumulo-iterator-test-harness package was added many versions ago to help with debugging iterators but I don't know of anyone who really uses it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 edited a comment on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 edited a comment on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962024655


   > How are you ingesting data? I don't think the number of entries ingested through bulk import show up the monitor.
   
   I just did a small amount of continuous ingest.
   
   > Also, if you have this null iterator (realistically an iterator that shouldn't be run) configured for minc then it will essentially drop your entries on a flush.
   
   Yeah, I did realize that. I should have revised my previous comment to be more clear on the issue I was suggesting. That it is unclear in the logs that something like that is occurring. If a user misconfigures an iterator but doesn't realize it, they might have a tough time determining what is going on in their system. The monitor could show files being dropped while the logs show that the flush was completed fully. Though maybe something like that won't realistically occur. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 edited a comment on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 edited a comment on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962024655


   > How are you ingesting data? I don't think the number of entries ingested through bulk import show up the monitor.
   
   I just did a small amount of continuous ingest.
   
   > Also, if you have this null iterator (realistically an iterator that shouldn't be run) configured for minc then it will essentially drop your entries on a flush.
   
   Yeah, I did realize that. I should have revised my previous comment to be more clear on the issue I was suggesting. That it is unclear in the logs that something like that is occurring. If a user misconfigures an iterator but doesn't realize it, they might have a tough time determining what is going on in their system. The monitor could show files being dropped while the logs show that the flush was completed fully. Though maybe something like that won't realistically occur. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime merged pull request #2342: Create FlushNoFileIT and fix NPE in tablet file flush code

Posted by GitBox <gi...@apache.org>.
milleruntime merged pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962013768


   > While ingesting data I can see the entries being increases in the monitor but then they will eventually be removed and then will go back to 1 entry after a flush.
   
   How are you ingesting data? I don't think the number of entries ingested through bulk import show up the monitor. Also, if you have this null iterator (realistically an iterator that shouldn't be run) configured for minc then it will essentially drop your entries on a flush. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962097872


   > I haven't looked into it directly, but could a log message be added when entries get dropped during a flush or when the newFile is empty?
   
   Yeah it is probably worth putting a debug statement. I don't think it should be warn or error since it can happen under perfectly normal conditions. It is probably not high level enough to log at info either.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961367417


   > Just curious - what level is the of logging that you are seeing? Additional logging at debug or higher could be noisy so systems that are running C.I. and end up with a lot of flushing.
   
   At the debug level. I mostly mean in the instance that this failure would occur. Not necessarily every flush. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-958672993


   @milleruntime Does this IT currently fail, then?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-958991447


   > @milleruntime Does this IT currently fail, then?
   
   Yes. But it may not be failing for the reasons I had originally though. The check after the flush scans the metadata looking for the "file" column but this won't be put in the metadata table if there are no entries in the file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] EdColeman commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
EdColeman commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961358442






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961186745


   > So overall I think this change is good. I haven't run into any issues while testing but I also haven't been able to reproduce the issue against main so I have only been able to go off of the results of your test for validation.
   
   I think the only thing you will see is a stacktrace in the log of the NPE. Assuming it doesn't lead to issues with new compactable code, everything else should work as expected.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745089271



##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FlushNoFileIT.java
##########
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.accumulo.test.functional;
+
+import static org.junit.Assert.assertTrue;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.EnumSet;
+import java.util.Map;
+
+import org.apache.accumulo.core.client.Accumulo;
+import org.apache.accumulo.core.client.AccumuloClient;
+import org.apache.accumulo.core.client.BatchWriter;
+import org.apache.accumulo.core.client.IteratorSetting;
+import org.apache.accumulo.core.client.admin.NewTableConfiguration;
+import org.apache.accumulo.core.data.ByteSequence;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Range;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.iterators.IteratorEnvironment;
+import org.apache.accumulo.core.iterators.IteratorUtil;
+import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
+import org.apache.accumulo.harness.AccumuloClusterHarness;
+import org.apache.hadoop.io.Text;
+import org.junit.Test;
+
+/**
+ * Tests that Accumulo will flush but not create a file that has 0 entries.
+ */
+public class FlushNoFileIT extends AccumuloClusterHarness {
+
+  public static class NullIterator implements SortedKeyValueIterator<Key,Value> {
+
+    @Override
+    public void init(SortedKeyValueIterator<Key,Value> source, Map<String,String> options,
+        IteratorEnvironment env) {}
+
+    @Override
+    public boolean hasTop() {
+      return false;
+    }
+
+    @Override
+    public void next() throws IOException {}
+
+    @Override
+    public void seek(Range range, Collection<ByteSequence> columnFamilies, boolean inclusive) {}
+
+    @Override
+    public Key getTopKey() {
+      return null;
+    }
+
+    @Override
+    public Value getTopValue() {
+      return null;
+    }
+
+    @Override
+    public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
+      return null;
+    }
+  }
+
+  @Override
+  protected int defaultTimeoutSeconds() {
+    return 60;
+  }
+
+  @Test
+  public void test() throws Exception {
+    try (AccumuloClient c = Accumulo.newClient().from(getClientProps()).build()) {
+
+      String tableName = getUniqueNames(1)[0];
+
+      NewTableConfiguration ntc = new NewTableConfiguration();
+      IteratorSetting iteratorSetting = new IteratorSetting(20, NullIterator.class);
+      ntc.attachIterator(iteratorSetting, EnumSet.of(IteratorUtil.IteratorScope.minc));
+
+      c.tableOperations().create(tableName, ntc);
+
+      try (BatchWriter bw = c.createBatchWriter(tableName)) {
+        Mutation m = new Mutation(new Text("r1"));
+        m.put("acf", tableName, "1");
+        bw.addMutation(m);
+      }
+
+      FunctionalTestUtils.checkRFiles(c, tableName, 1, 1, 0, 0);
+
+      c.tableOperations().flush(tableName, null, null, true);
+
+      FunctionalTestUtils.checkRFiles(c, tableName, 1, 1, 0, 0);
+
+      long flushId = FunctionalTestUtils.checkFlushId(c, tableName, 0);
+
+      try (BatchWriter bw = c.createBatchWriter(tableName)) {
+        Mutation m = new Mutation(new Text("r2"));
+        m.put("acf", tableName, "1");
+        bw.addMutation(m);
+      }
+
+      c.tableOperations().flush(tableName, null, null, true);
+
+      FunctionalTestUtils.checkRFiles(c, tableName, 1, 1, 0, 0);
+
+      long secondFlushId = FunctionalTestUtils.checkFlushId(c, tableName, flushId);
+      assertTrue("Flush ID did not change", secondFlushId > flushId);
+    }

Review comment:
       Would also be nice to do a scan of the table.   Just to ensure the scan does not throw any errors for this case.

##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +219,30 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  static long checkFlushId(AccumuloClient c, String tableName, long prevFlushID) throws Exception {
+    try (Scanner scanner = c.createScanner(MetadataTable.NAME, Authorizations.EMPTY)) {

Review comment:
       Could use ample for this instead of a scanner.  Not sure if the following compiles or works, just typed it up as a quick example.
   
   ```java
   try(TabletsMetadata tablets = ((ClientContext)c).getAmple().readTablets(tableId).fetch(FLUSH_ID).build())){
         for(TabletMetadata tablet : tablets) {
             tablets.getFlushId();
          }      
   
   
   }
   ```

##########
File path: test/src/main/java/org/apache/accumulo/test/functional/FunctionalTestUtils.java
##########
@@ -218,4 +219,30 @@ private static FateStatus getFateStatus(ClientContext context, AccumuloCluster c
       throw new RuntimeException(e);
     }
   }
+
+  static long checkFlushId(AccumuloClient c, String tableName, long prevFlushID) throws Exception {
+    try (Scanner scanner = c.createScanner(MetadataTable.NAME, Authorizations.EMPTY)) {
+      String tableId = c.tableOperations().tableIdMap().get(tableName);
+      scanner.setRange(new Range(new Text(tableId + ";"), true, new Text(tableId + "<"), true));
+      scanner.fetchColumnFamily(TabletsSection.ServerColumnFamily.NAME);
+      TabletsSection.ServerColumnFamily.FLUSH_COLUMN.fetch(scanner);
+
+      long flushId = -1L;
+      for (Entry<Key,Value> entry : scanner) {
+        Key key = entry.getKey();
+        String val = entry.getValue().toString();
+
+        if (key.getColumnQualifierData().toString().equals(FLUSH_QUAL)) {
+          flushId = Long.parseLong(val);

Review comment:
       Does it matter if diff tablets have diff flush ids?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] EdColeman commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
EdColeman commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961358442


   Just curious - what level is the of logging that you are seeing?  Additional logging at debug or higher could be noisy so systems that are running C.I. and end up with a lot of flushing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 edited a comment on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 edited a comment on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962024655


   > How are you ingesting data? I don't think the number of entries ingested through bulk import show up the monitor.
   
   I just did a small amount of continuous ingest.
   
   > Also, if you have this null iterator (realistically an iterator that shouldn't be run) configured for minc then it will essentially drop your entries on a flush.
   
   Yeah, I did realize that. I should have revised my previous comment to be more clear on the issue I was suggesting. That it is unclear in the logs that something like that is occurring. If a user misconfigures an iterator but doesn't realize it, they might have a tough time determining what is going on in their system. The monitor could show files being dropped while the logs show that the flush was completed fully. Though maybe something like that won't realistically occur. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962013768






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-958672993


   @milleruntime Does this IT currently fail, then?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-958991447






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-961217575


   > I think the only thing you will see is a stacktrace in the log of the NPE. Assuming it doesn't lead to issues with new compactable code, everything else should work as expected.
   
   I apparently forgot to read the iterator part of the ticket. I was able to reproduce the error now. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962024655


   > How are you ingesting data? I don't think the number of entries ingested through bulk import show up the monitor.
   
   I just did a small amount of continuous ingest.
   
   > Also, if you have this null iterator (realistically an iterator that shouldn't be run) configured for minc then it will essentially drop your entries on a flush.
   
   Yeah, I did realize that. I should have revised my previous comment to be more clear on the issue I was suggesting. That it is unclear in the logs that something like that is occurring. If a user misconfigures an iterator but doesn't realize it, they might have a tough time determining what is going on in their system. The monitor could show files being dropped while the logs show that the flush was completed fully. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] Manno15 commented on pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#issuecomment-962091348


   > This has been a tricky part of iterators for some time. I think the accumulo-iterator-test-harness package was added many versions ago to help with debugging iterators but I don't know of anyone who really uses it.
   
   I haven't looked into it directly, but could a log message be added when entries get dropped during a flush or when the newFile is empty? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2342: MinC Bug when Iterator returns null

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2342:
URL: https://github.com/apache/accumulo/pull/2342#discussion_r745011531



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java
##########
@@ -811,7 +812,7 @@ DataFileValue minorCompact(InMemoryMap memTable, TabletFile tmpDatafile, TabletF
         var storedFile = getDatafileManager().bringMinorCompactionOnline(tmpDatafile, newDatafile,
             new DataFileValue(stats.getFileSize(), stats.getEntriesWritten()), commitSession,
             flushId);
-        compactable.filesAdded(true, List.of(storedFile));
+        storedFile.ifPresent(stf -> compactable.filesAdded(true, List.of(stf)));

Review comment:
       Just to make sure I understand this original issue, was null being passed to List.of() here what caused the problem?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org