You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/02/15 09:49:35 UTC

[GitHub] [ozone] szetszwo opened a new pull request #3091: HDDS-6323. Close RocksObject(s).

szetszwo opened a new pull request #3091:
URL: https://github.com/apache/ozone/pull/3091


   ## What changes were proposed in this pull request?
   
   Ensure that the RocksObjects are closed properly.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6323
   
   ## How was this patch tested?
   
   Existing unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo edited a comment on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo edited a comment on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1040359052


   Triggered an existing bug that ColumnFamilyOptions were shared and never closed. With the change, some unit tests failed since the new code will close ColumnFamilyOptions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1055076093


   @kerneltime please review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1058779657


   @kerneltime , do you have anymore comments?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1059889387


   > I was referring to the cache of open DBs maintained by the Datanode ...
   
   @kerneltime , I see.  Then, further accesses to the cached DB should see an exception since the DB is already closed.  This seems okay.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
kerneltime commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1086335457


   @szetszwo I am not sure how this code is expected to work overall.
   1. `ReferenceCountedDB` lives in the `ContainerCache` and currently the logic to open and close a DB is tied to the lifecycle of the cache entry. 
   2. Can you point out the documentation recommending that the DB should be closed on any exception.
   3. The main problem we have in our code is that RocksDB expects that any Java class that inherits `RocksObject` should be manually closed when done instead of depending on the JVM calling the finalizer which then frees up the memory allocated by the underlying JNI code. This code addresses some of the code paths but there are other places where we need to reason about our memory management. We currently do not have a leak but are depending on the JVM to eventually clean up the memory. 
   4. The segfaults witnessed by the code were related to the open and close path for RocksDB and this code change would not solve the problems we encountered.
   I think it is a worthwhile effort to fix our usage of RocksDB but we need to complete the model including cache lifecycle management. If the underlying DB is closed it should be evicted from the cache as well after the reference count has gone down to 0. Also, if not all Exceptions need the DB to be closed, we should then only close the DB for the Exceptions that cannot be recovered.
   ```
     public enum Code {
       Ok(                 (byte)0x0),
       NotFound(           (byte)0x1),
       Corruption(         (byte)0x2),
       NotSupported(       (byte)0x3),
       InvalidArgument(    (byte)0x4),
       IOError(            (byte)0x5),
       MergeInProgress(    (byte)0x6),
       Incomplete(         (byte)0x7),
       ShutdownInProgress( (byte)0x8),
       TimedOut(           (byte)0x9),
       Aborted(            (byte)0xA),
       Busy(               (byte)0xB),
       Expired(            (byte)0xC),
       TryAgain(           (byte)0xD),
       Undefined(          (byte)0x7F);
   ```
   NotFound shows up as an error code that is can be sent up as part of the RocksDBException, is the normal key get impacted with closing the DB on Exception? 
   ```
     public byte[] get(ColumnFamily family, byte[] key) throws IOException {
       try {
         return db.get(family.getHandle(), key);
       } catch (RocksDBException e) {
         close();
         throw toIOException("get " + bytes2String(key) + " from " + family, e);
       }
     }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
kerneltime commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1059616431


   > 
   
   
   
   > > ... the DB should be closed on all exceptions?
   > 
   > @kerneltime , if we found any cases that the DB should not be closed, we may change the code later.
   > 
   > > ... Also, what is the expectation for the LRU cache that holds references to the reference-counted DBs?
   > 
   > When an object needs to be closed so that it won't be access anymore. We probably won't have to care about the LRU cache. No?
   
   I was referring to the cache of open DBs maintained by the Datanode and the closing via reference counting that is implemented.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1042559858


   > Triggered an existing bug that ColumnFamilyOptions were shared and never closed. With the change, some unit tests failed since the new code will close ColumnFamilyOptions.
   
   The bug mentioned above is now fixed in the this pull request.  "All checks have passed" as show in https://github.com/apache/ozone/runs/5202916023


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo edited a comment on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo edited a comment on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1040359052


   Triggered an existing bug that ColumnFamilyOptions were shared and never closed. With the change, some unit tests failed since the new code will close ColumnFamilyOptions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1068693649


   @kerneltime , anymore comments?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1040359052


   The change triggered an existing bug that ColumnFamilyOptions weres shared and never closed.  With the change, some unit tests failed since the new code will close ColumnFamilyOptions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1040359052


   The change triggered an existing bug that ColumnFamilyOptions weres shared and never closed.  With the change, some unit tests failed since the new code will close ColumnFamilyOptions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1055111318


   > ... the DB should be closed on all exceptions?
   
   @kerneltime , if we found any cases that the DB should not be closed, we may change the code later.
   
   > ... Also, what is the expectation for the LRU cache that holds references to the reference-counted DBs?
   
   When an object needs to be closed so that it won't be access anymore.  We probably won't have to care about the LRU cache.  No?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #3091: HDDS-6323. Close RocksObject(s).

Posted by GitBox <gi...@apache.org>.
kerneltime commented on pull request #3091:
URL: https://github.com/apache/ozone/pull/3091#issuecomment-1055106287


   The wrapping of underlying DB to ensure that close is in the right direction, though I am not sure if the expectation is that the DB should be closed on all exceptions? Also, what is the expectation for the LRU cache that holds references to the reference-counted DBs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org