Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/10/27 22:54:39 UTC

[GitHub] [pinot] walterddr opened a new issue #7652: Add system/metadata table

walterddr opened a new issue #7652:
URL: https://github.com/apache/pinot/issues/7652


   Background
   ===
   Currently Pinot has no way to store system metadata; the only option is to look through the logs.
   
   This makes debugging very hard. For example, when I was debugging issue #7612 with @dongxiaoman, it was very hard to figure out exactly which segment caused the AIOOB exception other than by doing a log search.
   
   Pitch
   ===
   I was wondering if we can create some sort of system logging mechanism similar to the MySQL server logs: https://dev.mysql.com/doc/refman/5.7/en/server-logs.html.
   
   Draft Design Ideas
   ===
   Compared with the MySQL server logs, we can start with several subcategories of system metadata tables, such as:
   - server error logs (indexed by server ID / segment ID?)
   - query logs (controller / broker / server, indexed by requestID?)
   - minion task logs (indexed by taskID?)
   
   I am not sure if it is a good idea to add another step that writes to a system metadata table in the broker/server/controller/minion, especially if it potentially delays query response time or minion task runtime. I was wondering if we can create a pinot-metadata SPI, similar to the pinot-metrics SPI, so that we can easily plug in different metadata store backends and metadata writers (agent-based, direct push, JDBC, ...).
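   
   As a rough illustration of the pluggable-writer idea, here is a minimal sketch. It assumes a hypothetical `SystemMetadataWriter` interface and a hypothetical `NonBlockingMetadataRecorder` wrapper; neither is an existing Pinot API, and the in-memory backend only stands in for a real store. The point it tries to show is that the calling thread only enqueues, so a slow backend should not add to query or task latency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical SPI: each backend (agent-based, direct push, JDBC, ...) would implement this.
interface SystemMetadataWriter {
  void write(String category, String key, String message);
}

// Toy backend that keeps entries in memory; it stands in for a real metadata store.
class InMemoryMetadataWriter implements SystemMetadataWriter {
  final List<String> entries = new ArrayList<>();

  @Override
  public void write(String category, String key, String message) {
    entries.add(category + "|" + key + "|" + message);
  }
}

// Wrapper that keeps the caller (query or task thread) off the slow path:
// record() only enqueues, and a separate drain() call hands entries to the backend.
class NonBlockingMetadataRecorder {
  private final BlockingQueue<String[]> queue;
  private final SystemMetadataWriter backend;

  NonBlockingMetadataRecorder(SystemMetadataWriter backend, int capacity) {
    this.backend = backend;
    this.queue = new ArrayBlockingQueue<>(capacity);
  }

  // Returns false (and drops the entry) when the queue is full, instead of blocking.
  boolean record(String category, String key, String message) {
    return queue.offer(new String[] {category, key, message});
  }

  // Intended to be called from a background thread; returns the number of entries written.
  int drain() {
    List<String[]> batch = new ArrayList<>();
    queue.drainTo(batch);
    for (String[] entry : batch) {
      backend.write(entry[0], entry[1], entry[2]);
    }
    return batch.size();
  }
}
```

   In this sketch a full queue drops the entry rather than blocking the caller; whether dropping is acceptable for system metadata is a design question the sketch does not settle.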
   
   Thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mcvsubbu commented on issue #7652: Add system/metadata table

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on issue #7652:
URL: https://github.com/apache/pinot/issues/7652#issuecomment-957897274


   Have you considered adding an interface for publishing such metadata to a stream? You can then consume it in Pinot and get analytics on it.


[GitHub] [pinot] walterddr commented on issue #7652: Add system/metadata table

Posted by GitBox <gi...@apache.org>.
walterddr commented on issue #7652:
URL: https://github.com/apache/pinot/issues/7652#issuecomment-958021398


   > Have you gone through https://docs.pinot.apache.org/developers/developers-and-contributors/contribution-guidelines ?
   > 
   > I would like to see a detailed design doc on what problem is being solved, and what areas are covered (more than logs? Any link with metrics? etc.) Overall, my understanding is that you are trying to ease Pinot operation. Please correct me if I am wrong.
   > 
   > thanks.
   
   sounds good. I will create one. thanks for the pointer and the review in advance.


[GitHub] [pinot] mcvsubbu commented on issue #7652: Add system/metadata table

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on issue #7652:
URL: https://github.com/apache/pinot/issues/7652#issuecomment-958074211


   > > Have you gone through https://docs.pinot.apache.org/developers/developers-and-contributors/contribution-guidelines ?
   > > I would like to see a detailed design doc on what problem is being solved, and what areas are covered (more than logs? Any link with metrics? etc.) Overall, my understanding is that you are trying to ease Pinot operation. Please correct me if I am wrong.
   > > thanks.
   > 
   > sounds good. I will create one. thanks for the pointer and the review in advance.
   
   Thank you, I can review the design doc. Please be sure to include what kind of events you want to capture, and how they are going to help in operation. We have been thinking about this problem now, and would really like to have some long-term (months/years) data and trends captured. As examples:
   - Controllers emit some table characteristics (number of segments, avg/min/max/median segment size, etc.)
   - Minions emit some events for each segment they process (table name, segment name, number of records removed, etc.)
   - The merger emits the number of segments processed
   - Brokers emit the RequestStatistics (or a part of it)
   - Multi-tenant realtime installations can benefit from knowing the "load" on each realtime server over time (how many segments are being handled, overall consumption rates, query rates, etc.). For example, a rebalance command can analyze past trends to try to rebalance correctly.


[GitHub] [pinot] walterddr commented on issue #7652: Add system/metadata table

Posted by GitBox <gi...@apache.org>.
walterddr commented on issue #7652:
URL: https://github.com/apache/pinot/issues/7652#issuecomment-957968337


   > Have you considered adding an interface to publishing such metadata to a stream? You can then consume it in Pinot and get analytics on it.
   
   This sounds like a great idea. In fact, I was thinking about making `collect(metadata)` a stream interface, or something similar to the real-time segment upload logic.
   
   The main ideas for this SPI are to:
   1. create a separate user-facing API for "important" log entries that are different from the normal info/warn/error logs
   2. enforce some type of schema on the log itself instead of a generic JSON blob
   3. separate the "collect" and the "flush".
   
   It was not my intention to limit the interface to publishing to disk/database and exclude a stream.
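   
   The three points above can be sketched roughly as follows. Everything here is hypothetical and only illustrates the shape of the idea: `SegmentErrorEvent` is an assumed example of a fixed schema, and `MetadataCollector` with its `collect` and `flush` methods is an assumed API, not something that exists in Pinot. The sink is a plain `Consumer`, so it could equally wrap a stream producer, a file, or a database writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical fixed schema for one category of entry, instead of a generic JSON blob.
final class SegmentErrorEvent {
  final String serverId;
  final String segmentName;
  final String errorType;
  final long timestampMs;

  SegmentErrorEvent(String serverId, String segmentName, String errorType, long timestampMs) {
    // Enforce the schema at construction time: every field must be present.
    if (serverId == null || segmentName == null || errorType == null) {
      throw new IllegalArgumentException("serverId, segmentName and errorType are required");
    }
    this.serverId = serverId;
    this.segmentName = segmentName;
    this.errorType = errorType;
    this.timestampMs = timestampMs;
  }
}

// Hypothetical collector: collect() only buffers, flush() hands a batch to the sink.
final class MetadataCollector {
  private final List<SegmentErrorEvent> buffer = new ArrayList<>();
  private final Consumer<List<SegmentErrorEvent>> sink;

  MetadataCollector(Consumer<List<SegmentErrorEvent>> sink) {
    this.sink = sink;
  }

  // Cheap call for the code path that observed the event.
  synchronized void collect(SegmentErrorEvent event) {
    buffer.add(event);
  }

  // Publishes everything buffered so far and returns the number of events published.
  int flush() {
    List<SegmentErrorEvent> batch;
    synchronized (this) {
      batch = new ArrayList<>(buffer);
      buffer.clear();
    }
    // The sink is called outside the lock so a slow sink does not block collect().
    if (!batch.isEmpty()) {
      sink.accept(batch);
    }
    return batch.size();
  }
}
```

   A caller would invoke `collect(...)` where the event happens and leave `flush()` to a periodic background task, which keeps the choice of destination (stream, disk, database) out of the calling code.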


[GitHub] [pinot] mcvsubbu commented on issue #7652: Add system/metadata table

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on issue #7652:
URL: https://github.com/apache/pinot/issues/7652#issuecomment-958005394


   Have you gone through https://docs.pinot.apache.org/developers/developers-and-contributors/contribution-guidelines ? 
   
   I would like to see a detailed design doc on what problem is being solved, and what areas are covered (more than logs? Any link with metrics? etc.) Overall, my understanding is that you are trying to ease Pinot operation. Please correct me if I am wrong.
   
   thanks.

