You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/06/25 17:37:30 UTC

[GitHub] [hive] zabetak opened a new pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

zabetak opened a new pull request #2432:
URL: https://github.com/apache/hive/pull/2432


   ### What changes were proposed in this pull request?
   1. Handle dynamic routing appenders via Log4j's `IdlePurgePolicy`
   2. Remove in-house management of appenders.
   
   ### Why are the changes needed?
   1. Solve descriptor leak as shown in HIVE-24590
   2. Delegate burden of managing appenders to Log4j.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   `mvn test -Dtest=TestOperationLoggingLayout`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-957925686


   @zabetak Thanks for the clarification.  I work a lot supporting Hive so I need to know what impact this change has.
   
   What is responsible for cleaning up these files?  Could there be a case where the cleanup process happens and then the file is re-created and never scheduled for cleanup again?
   
   
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there is no changes in overall behavior, LGTM (+1) pending tests past


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr edited a comment on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr edited a comment on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there is no changes in overall behavior, LGTM (+1) pending tests past.
   
   Thank you for helping to document this behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there is no changes in overall behavior, LGTM (+1) pending tests past


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958309007


   > What is responsible for cleaning up these files? Could there be a case where the cleanup process happens and then the file is re-created and never scheduled for cleanup again?
   
   @belugabehr It's been a while since I have worked on this code but if I remember well the operation log file deletion is handled by `OperationLogCleaner`. When the query/operation ends it is registered for cleanup (see https://github.com/apache/hive/blob/6e152aa28bc5116bf9210f9deb0f95d2d73183f7/service/src/java/org/apache/hive/service/cli/operation/Operation.java#L323) . The deletion may happen immediately or with some delay as specified by the properties. 
   
   If for some reason there is a log event for the **same** query after the cleanup then I assume the file will be recreated and it will never be removed from the file-system. This could happen even without the changes introduced in this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-909110409


   Hey @prasanthj , can you have a final look on this for getting this in? It seems that more and more people are bumping into the problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-951288032


   > Ya, nothing about this solution feels ideal. How was the default of 60s chosen? That seems pretty short to me if there's a stall in the query engine (or anywhere else).
   
   Even if there is a big delay and the 60sec window passes there is no problem. The appender will be closed and it will reopen again when the next log event arrives. Having a purge policy guarantees that there will be no leak no matter what happens.
   
   Something that may become problematic is the arrival of many (in the order of thousands) queries in the 60 sec window. This will lead to the creation of many appenders and usage of many file descriptors. However, this might never be a problem for Hive, at least not before other parts of the system need to be fixed first.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] nrg4878 commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
nrg4878 commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-941055126


   @belugabehr Could you please review this if you have some cycles? Much appreciated. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-957925686


   @zabetak Thanks for the clarification.  I work a lot supporting Hive so I need to know what impact this change has.
   
   What is responsible for cleaning up these files?  Could there be a case where the cleanup process happens and then the file is re-created and never scheduled for cleanup again?
   
   
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-951762656


   > So, what would the affect be if the time were to pass and another write request came in? Would it create a second file? Append to the end of an existing file? Is the original log files always deleted when the logger goes idle?
   
   The deletion of the log files is not handled by Log4j (not in this case at least). When the logger goes idle, the appender is closed along with the respective file descriptor but the file remains as is. When the appender comes back it will write to the same file; if for whatever reason the file is not there it will create a file with the same name and start writing there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-957925686


   @zabetak Thanks for the clarification.  I work a lot supporting Hive so I need to know what impact this change has.
   
   What is responsible for cleaning up these files?  Could there be a case where the cleanup process happens and then the file is re-created and never scheduled for cleanup again?
   
   
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr edited a comment on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr edited a comment on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there is no changes in overall behavior, LGTM (+1) pending tests past.
   
   Thank you for helping to document this behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr edited a comment on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr edited a comment on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there is no changes in overall behavior, LGTM (+1) pending tests past.
   
   Thank you for helping to document this behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr edited a comment on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr edited a comment on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there is no changes in overall behavior, LGTM (+1) pending tests past.
   
   Thank you for helping to document this behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-951245726


   Hey,
   
   > Ideally the whole programmatic registration of the appender should be replaced by log4j.properties or other more standard mechanism but this is a bit more complicated to do so I would leave it outside the scope of this PR.
   
   Ya, nothing about this solution feels ideal.  How was the default of 60s chosen?  That seems pretty short to me if there's a stall in the query engine (or anywhere else).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-951493240


   @zabetak Thanks for the clarification.  Makes sense.
   
   So, what would the affect be if the time were to pass and another write request came in?  Would it create a second file?  Append to the end of an existing file? Is the original log files always deleted when the logger goes idle?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there is no changes in overall behavior, LGTM (+1) pending tests past


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] belugabehr commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there is no changes in overall behavior, LGTM (+1) pending tests past


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-909110409


   Hey @prasanthj , can you have a final look on this for getting this in? It seems that more and more people are bumping into the problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-909110409


   Hey @prasanthj , can you have a final look on this for getting this in? It seems that more and more people are bumping into the problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-872010471


   Hi @prasanthj , following your last comment on https://github.com/apache/hive/pull/1849 I added Hive property to control the lifespan of appenders in https://github.com/apache/hive/pull/2432/commits/bc683ec2d6959e5aba7d216abcd060a07e854d46. 
   
   Ideally the whole programmatic registration of the appender should be replaced by `log4j.properties` or other more standard mechanism but this is a bit more complicated to do so I would leave it outside the scope of this PR. 
   
   All the tests are green, so if you can have a final look and merge it that would be great since apparently more people are hitting this issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-954569967


   Hey @belugabehr do you have further comments regarding this PR? Can we move this forward and merge it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak closed pull request #2432: HIVE-24590: Operation logging still leaks log4j appenders

Posted by GitBox <gi...@apache.org>.
zabetak closed pull request #2432:
URL: https://github.com/apache/hive/pull/2432


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org