You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/10/11 05:29:58 UTC

[GitHub] [druid] zachjsh opened a new issue, #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL

zachjsh opened a new issue, #13208:
URL: https://github.com/apache/druid/issues/13208

   Please provide a detailed title (e.g. "Broker crashes when using TopN query with Bound filter" instead of just "Broker crashes").
   
   ### Affected Version
   Tested on latest master branch 25.0.0-SNAPSHOT
   
   
   ### Description
   
   Ingested a json file with only the following 2 rows through MSQ
   
   ```
    % cat lookup_data_small.json 
   {"k": "stringggggggggggggggggggggggggggggggggggggggggg0", "v": "stringValuegggggggggggggggggggggggggggggggggggggg0"}
   {"k": "stringggggggggggggggggggggggggggggggggggggggggg1", "v": "stringValuegggggggggggggggggggggggggggggggggggggg1"}
   %
   ```
   
   I just kept clicking through the web-console screens, taking whatever defaults were provided. The spec generated produced an insert with a `PARTITIONED BY ALL` clause. The interval detected by this is 
   
   ```
   "intervals": {
           "type": "intervals",
           "intervals": [
             "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
           ]
         },
   ```
   
   After the table is ingested configure auto-compaction on the table. To do this I simply changed the default config to the following
   
   ```
   {
     "dataSource": "lookup_data_small",
     "granularitySpec": {
       "segmentGranularity": "DAY"
     },
     "tuningConfig": {
       "partitionsSpec": {
         "type": "dynamic"
       }
     }
   }
   ```
   
   The only difference here is the addition of the granulartySpec. 
   
   Wait for the next coorindator indexer cycle to run (I just changed by setting `druid.coordinator.period.indexingPeriod=PT60s` in coordinator runtime properties file)
   
   This causes the Coordinator to crash and take up a ton of CPU and memory before doing so. I believe the issue is in building the interval to segments map here: https://github.com/apache/druid/blob/691e26d2429cd81afef092dbda32e24ad56bb510/server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java#L121.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] paul-rogers commented on issue #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on issue #13208:
URL: https://github.com/apache/druid/issues/13208#issuecomment-1276428019

   @cryptoe, the other thought is that if we're working out a timeline, there are two ways to proceed. One is to enumerate all possible chunks and map segment into those chunks. The other approach is to sort and group. We should choose the approach that minimizes memory use. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] AmatyaAvadhanula commented on issue #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL

Posted by GitBox <gi...@apache.org>.
AmatyaAvadhanula commented on issue #13208:
URL: https://github.com/apache/druid/issues/13208#issuecomment-1291497354

   Would it help to change the interval in the compaction spec and use it as a boundary for the generated time chunks?
   This may not work if the data is spread sparsely across a long time period though


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] AmatyaAvadhanula commented on issue #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL

Posted by GitBox <gi...@apache.org>.
AmatyaAvadhanula commented on issue #13208:
URL: https://github.com/apache/druid/issues/13208#issuecomment-1302886876

   https://github.com/apache/druid/pull/13304 tries to skip compaction when segments of ALL granularity are being compacted to a finer granularity. 
   While it prevents a coordinator crash, the compaction for such datasources is not yet supported


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] cryptoe commented on issue #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL

Posted by GitBox <gi...@apache.org>.
cryptoe commented on issue #13208:
URL: https://github.com/apache/druid/issues/13208#issuecomment-1276148517

   Thanks for reporting this. 
   
   I was looking into this and the issue seems to be in how we are generating the probable time chunks for an interval. 
   
   https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/coordinator/duty/NewestSegmentFirstIterator.java#L131
   
   In this case, the interval is -infinity,+infinity because of partitioned by `ALL`
   When we set the segment granularity as day in the compaction spec we generate all probable day time chunks which results in a heap issue for the coordinator. 
   
   Should we add a check/validation which fails the compaction, when the compaction changes the segment granularity to something other than `ALL` for a data source whose segment granularity is `ALL`. 
   
   Open to ideas for another approach as well. 
   
   cc @gianm @paul-rogers @maytasm 
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] paul-rogers commented on issue #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on issue #13208:
URL: https://github.com/apache/druid/issues/13208#issuecomment-1276425778

   @cryptoe, are we doing the probabilistic analysis to limit memory footprint when we we have a large number of segments? if so, we can start by looking at the total count of segments. If the segment is smaller than some number, I'd assume we can use a deterministic analysis. That is, if there is one segment, we don't need to sample: the one segment will fit in memory just fine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on issue #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL

Posted by GitBox <gi...@apache.org>.
gianm commented on issue #13208:
URL: https://github.com/apache/druid/issues/13208#issuecomment-1276835275

   It'd be better to make this work than to make this be an error. The way to do it is to avoid `Granularity.getIterable`, which is problematic when using non-ALL granularities on a large interval. There are similar issues on the query side, also caused by this behavior from `Granularity.getIterable`: see #12944, #13182, #13206. It would require adjusting some logic "inside-out".


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] AmatyaAvadhanula closed issue #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL

Posted by GitBox <gi...@apache.org>.
AmatyaAvadhanula closed issue #13208: Coordinator crashes trying to compact a datasource ingested through MSQ with PARTITIONED BY ALL
URL: https://github.com/apache/druid/issues/13208


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org