You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2022/02/14 19:31:21 UTC

[GitHub] [accumulo] keith-turner opened a new issue #2495: Explore consistently formatting all ids in Accumulo logs.

keith-turner opened a new issue #2495:
URL: https://github.com/apache/accumulo/issues/2495


   Accumulo has multiple different id types like table ids, FATE transaction ids, namespace ids, external compactions ids, scan session ids, etc.  These ids are logged.  For logging fate transaction ids an attempt was made to logs these very consistently by always calling [this function](https://github.com/apache/accumulo/blob/4dc8d3effc9eeccd514317b3028bc10e1cf71c22/core/src/main/java/org/apache/accumulo/fate/FateTxId.java#L55) when logging them.   This method consistently formats the id and includes a type.  The type is nice for two reasons, first it make the ID easily distinguishable when scanning logs.  Second its nice to be able to grep a log just for the TYPE and see all log messages that have a FATE transaction id in them.
   
   This concept could be expanded to all ID types in Accumulo.  We could consistently format type and id for all ids in Accumulo.  We could logs ids like `FATE[123]`, `TABLE_ID[456]`, `ECID[abd]`, `NAMESPACE_ID[xyz]`, etc in the logs.  Seeing `TABLE_ID[456]` in a log message is much better than seeing `456` in the log, especially for logs messages where its not clear what `456` is.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1039472733


   This issue was opened based on discussion on #2492 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
milleruntime commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040155418






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1039487902


   For table and namespace IDs, this would somewhat conflict with the goal to keep `toString()` equivalent to `canonical()` in the API contract for `AbstractId`.
   
   However, I had been thinking about a proper `Namespace` and `Table` object, which would be a pair of `(name, ID)`, which could be used internally, and provide a similar result via its `toString()`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
milleruntime commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040172379


   With the concrete types we have implemented in `TableId` and `NamespaceId`, I don't think it is a big of deal anymore to have the `toString()` method return a formatted string. I think most places it is used is in string concatenation and logging. The other places where want a `canonical()` ID it is also pretty clear.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
milleruntime commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040155418


   We could just create our own interface like `StringBuilderFormattable`, call it something like `Loggable`. Then make whatever objects we want to have a log friendly `toString()` implement it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040430095


   I think having toString different from canonical is error prone. People will make mistakes, and not realize... whether it's concatenating a path in ZK, constructing a metadata entry, or just an innocuous log statement, mistakes will occur, and they will not be easy to detect, because toString is not explicit in string concatenation, it's merely implied.
   
   However, ultimately, I don't think it's necessary to bother with trying to format the toString for table IDs and namespace IDs. Most of the time when we want to log, we want to log the table / namespace, not just the table ID / namespace ID. I would prefer to have a concrete type for Table and Namespace, whose toString is a formatted representation of both the name and its ID. I'd much rather see something like `deleting TABLE[NAME=mytable;ID=2c]` than `deleting TABLEID(2c)`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1039472733






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040421907


   So far it seems like there are three viable options (with the log4j specific option not being viable) 
   
   - Modify toString() on AbstractId to format the id and include a type
     - Cons: have to find all places where this is not suitable and change to canonical().  
     - Pros: New code written that logs the id will automatically use the formatted id.  For user facing output devs usually test that when writing it and if they don;t like toString() output they can switch to canonical().
   - Add a new method to AbstractId to format the id and include a type.
      - Cons: Have to modify all existing logging to call new method.  Have to remember to use the new method when writing new logging stmts.
      - Pros: canonical() and toString() are the same, the more I think about this not a huge pro.
   - Do nothing
     - Cons : Logging does consistently include type w/ id.
     - Pros: So easy to do.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1039487902






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040750939


   I'd have to mine through tickets to see if there's a current issue for it. I know it was a part of the set of ideas I had for the 2.0 API, to have a concrete Table and Namespace object, as well as part of the recent work I had been looking into about standardizing how we pass table information over RPCs (as name or id). I think Table and Namespace formatting would be covered by that, but this issue might still apply to other kinds of identifiers.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040454976


   >  People will make mistakes, and not realize... whether it's concatenating a path in ZK, constructing a metadata entry
   
   That is a really good con. I edited my earlier comment to add it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040543967


   > However, ultimately, I don't think it's necessary to bother with trying to format the toString for table IDs and namespace IDs. Most of the time when we want to log, we want to log the table / namespace, not just the table ID / namespace ID. I would prefer to have a concrete type for Table and Namespace, whose toString is a formatted representation of both the name and its ID. I'd much rather see something like deleting TABLE[NAME=mytable;ID=2c] than deleting TABLEID(2c).
   
   That would be really nice.  I would love to see something like `TABLE[NAME=mytable;ID=2c]` consistently in the logs.  Is there an issue for that?  If we open an issue for that, then we could probably close this.  I don't think there is anything general to do.  If we create a new  FateTxID type then how it logs can be tailored specifically for it, but there is nothing more general to do in the code base for other id types.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner edited a comment on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner edited a comment on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040421907


   So far it seems like there are three viable options (with the log4j specific option not being viable) 
   
   - Modify toString() on AbstractId to format the id and include a type
     - Cons: have to find all places where this is not suitable and change to canonical().  Could lead to errors when constructing metadata table entries and ZK/DFS paths.
     - Pros: New code written that logs the id will automatically use the formatted id.  For user facing output devs usually test that when writing it and if they don;t like toString() output they can switch to canonical().
   - Add a new method to AbstractId to format the id and include a type.
      - Cons: Have to modify all existing logging to call new method.  Have to remember to use the new method when writing new logging stmts.
      - Pros: canonical() and toString() are the same, the more I think about this not a huge pro.
   - Do nothing
     - Cons : Logging does consistently include type w/ id.
     - Pros: So easy to do.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1039751024


   > The problem with using something other than toString(), is that we have to remember to explicitly convert it ourselves, and we frequently forget to do that. A log builder or some other wrapper around logging could force us to do it, though (because its API could require some kind of Loggable object instead of accepting strings).
   
   Looked around for ways to have custom formatting for objects in log4j and found this :
   
   https://logging.apache.org/log4j/log4j-2.17.0/log4j-api/apidocs/org/apache/logging/log4j/util/StringBuilderFormattable.html
   
   ran the following test
   
   ```java
   package org.apache.accumulo.test;
   
   import org.apache.logging.log4j.util.StringBuilderFormattable;
   import org.slf4j.LoggerFactory;
   
   public class Test {
     public static class Foo implements StringBuilderFormattable {
       @Override public void formatTo(StringBuilder buffer) {
         buffer.append("TABLE_ID[99]");
       }
     }
   
     @org.junit.Test
     public void test1(){
       var log = LoggerFactory.getLogger("TL1");
       log.info("This is only a test: {}", new Foo());
     }
   }
   ```
   
   which printed 
   
   ```
   2022-02-14T20:20:35,625 [TL1] INFO : This is only a test: TABLE_ID[99]
   ```
   
   So maybe the abstract id type could implement this interface for logging the id.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040409228


   >  I don't think it is a big of deal anymore to have the toString() method return a formatted string. I think most places it is used is in string concatenation and logging.
   
   If the majority use of toString is for logging then maybe the simplest thing is to make toString format its return. There may be user facing outputs from utilities/shell that we may want to handle differently, if those are a minority then maybe they could just call canonical().


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1039699492


   > We could have `String toLog()` method or something similar that is short and can be used when logging the id.
   
   The problem with using something other than `toString()`, is that we have to remember to explicitly convert it ourselves, and we frequently forget to do that. A log builder or some other wrapper around logging could force us to do it, though (because its API could require some kind of `Loggable` object instead of accepting strings).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner edited a comment on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner edited a comment on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040421907


   So far it seems like there are three viable options (with the log4j specific option not being viable) 
   
   - Modify toString() on AbstractId to format the id and include a type
     - Cons: have to find all places where this is not suitable and change to canonical().  Could lead to errors when constructing metadata table entries and ZK/DFS paths.
     - Pros: New code written that logs the id will automatically use the formatted id.  For user facing output devs usually test that when writing it and if they don;t like toString() output they can switch to canonical().
   - Add a new method to AbstractId to format the id and include a type.
      - Cons: Have to modify all existing logging to call new method.  Have to remember to use the new method when writing new logging stmts.
      - Pros: canonical() and toString() are the same, the more I think about this not a huge pro.
   - Do nothing
     - Cons : Logging does consistently include type w/ id.
     - Pros: So easy to do.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] EdColeman commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
EdColeman commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040217942


   Not sure if it would matter - but ids are effectively (if not outright) immutable - the output string could be created on instantiation of the id so that concatenation / formatting would not need to occur on every log statement - the toString would just return a fixed string of the instance. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] EdColeman commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
EdColeman commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1040217942


   Not sure if it would matter - but ids are effectively (if not outright) immutable - the output string could be created on instantiation of the id so that concatenation / formatting would not need to occur on every log statement - the toString would just return a fixed string of the instance. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
keith-turner commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1039550843


   > For table and namespace IDs, this would somewhat conflict with the goal to keep toString() equivalent to canonical() in the API contract for AbstractId.
   
   I was not really sure of the best way to implement this, was not sure about changing toString() on existing id type.  I like toString() being the same as canonical() personally, because that is nice when someone adds their own context.   We could have `String toLog()` method or something similar that is short and can be used when logging the id.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] ctubbsii commented on issue #2495: Explore consistently formatting all ids in Accumulo logs.

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #2495:
URL: https://github.com/apache/accumulo/issues/2495#issuecomment-1039888132


   > ```
   > 2022-02-14T20:20:35,625 [TL1] INFO : This is only a test: TABLE_ID[99]
   > ```
   
   Using `slf4j-simple` instead of `log4j-slf4j-impl`, this prints:
   
   ```
   [main] INFO TL1 - This is only a test: Test$Foo@545997b1
   ```
   
   So, it only works with the `log4j-slf4j-impl` at runtime, and not other slf4j implementations.
   
   > So maybe the abstract id type could implement this interface for logging the id.
   
   That would require having a hard dependency on log4j again in code and leaks this 3rd party library class into the class hierarchy of our public API, both of which have caused us problems historically and we have previously moved to resolve.
   
   Given the API leakage, the tight coupling with log4j, and the fact that it won't work with any other slf4j binding, I don't think this solution is viable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org