Posted to notifications@fluo.apache.org by GitBox <gi...@apache.org> on 2022/09/28 13:18:38 UTC

[GitHub] [fluo-uno] milleruntime opened a new issue, #286: tserver logging lost on restart

milleruntime opened a new issue, #286:
URL: https://github.com/apache/fluo-uno/issues/286

   I have multiple tservers running on the same instance. If I kill one of them (kill -9 PID) and restart using the Accumulo scripts (accumulo-cluster or accumulo-service) then the logging for the restarted tserver gets lost or munged into the same log as another.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@fluo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [fluo-uno] milleruntime commented on issue #286: tserver logging lost on restart

Posted by GitBox <gi...@apache.org>.
milleruntime commented on issue #286:
URL: https://github.com/apache/fluo-uno/issues/286#issuecomment-1262550649

   The problem isn't that the logs get overwritten; the problem is that there is no logging at all for a restarted tserver. For example:
   I start a cluster with 2 tservers:
   <pre>
   12:46:25 {main} ~/workspace/uno/install/logs/accumulo$ grep "address = " *.log
   tserver1_ip-10-113-14-231.log:2022-09-29T12:45:04,103 [tserver.TabletServer] INFO : address = localhost:9997
   tserver2_ip-10-113-14-231.log:2022-09-29T12:45:04,279 [tserver.TabletServer] INFO : address = localhost:10000
   </pre>
   I kill one of them and tserver2 running on 10000 is dead. I restart the tserver using `accumulo-cluster start-tservers` and now there are 2 running again.
   <pre>
   uno status
   Accumulo processes running: tserver(19392) manager(19736) gc(19777) monitor(19825) tserver(21303) 
   </pre>
   But only one of the logs gets updated and I never see the second tserver logging to anything:
   <pre>
   12:52:13 {main} ~/workspace/uno/install/logs/accumulo$ grep "address = " *.log
   tserver1_ip-10-113-14-231.log:2022-09-29T12:45:04,103 [tserver.TabletServer] INFO : address = localhost:9997
   tserver2_ip-10-113-14-231.log:2022-09-29T12:45:04,279 [tserver.TabletServer] INFO : address = localhost:10000
   12:56:07 {main} ~/workspace/uno/install/logs/accumulo$ ls -ltr tserver*.log
   -rw-r--r-- 1 mpmill4 domain users  69700 Sep 29 12:48 tserver2_ip-10-113-14-231.log
   -rw-r--r-- 1 mpmill4 domain users 116116 Sep 29 12:56 tserver1_ip-10-113-14-231.log
   </pre>
   Only tserver1_ip-10-113-14-231.log gets updated, and I never see the startup of the tserver on address localhost:10000 logged anywhere, even though Accumulo sees it and it seems to function fine.




[GitHub] [fluo-uno] ctubbsii commented on issue #286: tserver logging lost on restart

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #286:
URL: https://github.com/apache/fluo-uno/issues/286#issuecomment-1262496514

   I'm not sure this is a problem. If you need the logs for the previous run separate, then you can just copy them before you restart, right? Uno should keep things pretty simple, so we probably don't want to make it too complex with tracking different restart iterations for logging.




[GitHub] [fluo-uno] ctubbsii commented on issue #286: tserver logging lost on restart

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #286:
URL: https://github.com/apache/fluo-uno/issues/286#issuecomment-1262623491

   Okay, I see. I was able to reproduce this with only one server, and I saw the same thing for all services, not just tserver. If I used `uno accumulo start` instead of `$ACCUMULO_HOME/bin/accumulo-cluster start`, then the logging was updated fine. So, it seems to be specifically related to the manual use of `accumulo-cluster`.
   
   One difference, when I look at `/proc/<PID>/environ`, is that using `accumulo-cluster` directly does not set `ACCUMULO_LOG_DIR`. Looking at `install/accumulo-2.1.0-SNAPSHOT/conf/accumulo-env.sh`, I can see that `ACCUMULO_LOG_DIR` is defaulting to `install/accumulo-2.1.0-SNAPSHOT/logs`. And, sure enough, when I look there, the logs for the restarted processes can be found there.
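To repeat that check on another process, here is a minimal sketch. It assumes Linux, where `/proc/<PID>/environ` holds NUL-separated `KEY=VALUE` entries captured when the process was started. The helper name `proc_env_var` is made up for illustration and is not part of Uno or Accumulo.

```shell
# Hypothetical helper: print the value of one environment variable as a
# running process saw it at startup. Prints nothing if the variable is unset.
proc_env_var() {
  # $1 = PID of the process to inspect, $2 = variable name
  tr '\0' '\n' < "/proc/$1/environ" | sed -n "s/^$2=//p"
}

# Example (replace <PID> with a PID reported by `uno status`):
#   proc_env_var <PID> ACCUMULO_LOG_DIR
```

Empty output for a restarted tserver would suggest it was started without `ACCUMULO_LOG_DIR` in its environment.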
   
   So, this looks like a situation where Uno tries to put the logs in a special place specifically for Uno, but manually running `accumulo-cluster` or other non-Uno scripts to modify Accumulo will cause it to run with its own environment.
   
   There are a few solutions to this that I can think of:
   
   1. Instead of Uno configuring the `ACCUMULO_LOG_DIR` with an environment variable for itself, it can modify the `accumulo-env.sh` script so that it stores the Uno preferred location for logs, for any subsequent operations that aren't aware of Uno.
   2. Instead of Uno trying to customize the location of the log directory at all, it could just let Accumulo use its default location, and create a link at `install/logs/accumulo` that points to Accumulo's standard location.
   3. We could modify `uno env` to ensure `ACCUMULO_LOG_DIR` is exported. If you want Accumulo scripts to be aware of Uno's preferences, you would then have to run `source <(uno env)` before running any script, such as `accumulo-cluster`, that bypasses Uno.
   
   The last option is the easiest and least disruptive, but users will still see this issue if they forget to source the environment. The first option is the most foolproof, but requires us to be more careful that our modifications to the Accumulo environment file work across Accumulo versions. The second option is a decent middle ground, but may require us to pay a bit of attention to how we reset/wipe the cluster, to ensure we clear out old files differently than we currently do.
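As a rough illustration of the second option, the link could be created along these lines. This is only a sketch, not how Uno does anything today: the function name `link_uno_logs` is hypothetical, the paths in the usage comment are the ones mentioned above, and it assumes the Uno-side path does not already exist as a real directory (otherwise `ln` would create the link inside that directory).

```shell
# Hypothetical sketch of option 2: leave Accumulo's log directory at its
# default and make the Uno log path a symlink to it.
link_uno_logs() {
  # $1 = Accumulo's own log directory (the link target)
  # $2 = path where Uno users expect to find the logs (the link itself)
  mkdir -p "$1" "$(dirname "$2")"
  # -s: symbolic link, -f: replace an existing link, -n: do not follow it
  ln -sfn "$1" "$2"
}

# Example, using the paths from this issue:
#   link_uno_logs install/accumulo-2.1.0-SNAPSHOT/logs install/logs/accumulo
```

With a link like this, processes started by `accumulo-cluster` directly and processes started through Uno would write to the same place. The open question about how a reset/wipe should treat the linked directory still applies.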

