Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/06/02 16:05:24 UTC
[GitHub] [pinot] navina opened a new issue, #8819: Updating schema with new MV column causes index failures and stops consumption
navina opened a new issue, #8819:
URL: https://github.com/apache/pinot/issues/8819
A user added a new MV (multi-value) column to the schema and reloaded the segments. Note that the data in the topic has no values for this new field. The segment build failed with the following exception:
```
2022/05/31 19:47:55.833 ERROR [LLRealtimeSegmentDataManager_calls__0__856__20220531T1847Z] [calls__0__856__20220531T1847Z] Could not build segment
java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0
at org.apache.pinot.segment.local.segment.index.readers.constant.ConstantMVForwardIndexReader.getDictIdMV(ConstantMVForwardIndexReader.java:48) ~[startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at org.apache.pinot.segment.local.segment.readers.PinotSegmentColumnReader.getValue(PinotSegmentColumnReader.java:89) ~[startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at org.apache.pinot.segment.local.segment.readers.PinotSegmentRecordReader.getRecord(PinotSegmentRecordReader.java:226) ~[startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at org.apache.pinot.segment.local.segment.readers.PinotSegmentRecordReader.next(PinotSegmentRecordReader.java:215) ~[startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl.build(SegmentIndexCreationDriverImpl.java:218) ~[startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at org.apache.pinot.segment.local.realtime.converter.RealtimeSegmentConverter.build(RealtimeSegmentConverter.java:123) ~[startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.buildSegmentInternal(LLRealtimeSegmentDataManager.java:839) [startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.buildSegmentForCommit(LLRealtimeSegmentDataManager.java:766) [startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:665) [startree-pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-b92cbb4df88eee0bf0fbb39dfc434387370edc8e]
at java.lang.Thread.run(Thread.java:829) [?:?]
```
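The issue does not state the root cause, but the top frame hints at the failure mode. Below is a minimal sketch, assuming (unconfirmed here) that the reader for a default-valued MV column writes one dictionary id into a caller-supplied buffer, and that the buffer is sized from a per-column "max values per row" figure that is 0 when the column has never received data. Every class and method name in the sketch is a hypothetical stand-in, not a Pinot API.

```java
public class ConstantMvReaderSketch {
    // Hypothetical stand-in for a reader that serves one default value per row.
    static int getDictIdMV(int docId, int[] dictIdBuffer) {
        dictIdBuffer[0] = 0; // throws ArrayIndexOutOfBoundsException if the buffer is empty
        return 1;            // number of values written for this row
    }

    // Hypothetical caller that sizes the buffer from column metadata (an assumption).
    static int readRow(int docId, int maxNumValuesPerRow) {
        int[] buffer = new int[maxNumValuesPerRow];
        return getDictIdMV(docId, buffer);
    }

    public static void main(String[] args) {
        // With room for one value, the read succeeds.
        System.out.println("values read: " + readRow(0, 1));
        try {
            // With a zero-length buffer, the write to index 0 fails.
            readRow(0, 0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

If that assumption holds, the reader and the code that sizes the buffer have to agree that a defaulted MV column carries at least one value per row.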
## Steps to reproduce
1. Run the `RealtimeQuickstart`
2. Load some data
3. `Edit schema` to add a new MV column
4. Reload the segments
### Things to note
* The quickstart uses the high-level consumer, whereas the customer was using the low-level consumer.
* The same exception occurs when reproducing this way, but this time the stack trace comes from the query path.
```
2022/06/02 11:31:20.041 ERROR [BaseCombineOperator] [pqw-6] Caught exception while processing query: QueryContext{_tableName='meetupRsvp_REALTIME', _subquery=null, _selectExpressions=[event_id, event_name, event_tags, event_time, group_city, group_country, group_id, group_lat, group_lon, group_name, location, mtime, rsvp_count, venue_name], _aliasList=[null, null, null, null, null, null, null, null, null, null, null, null, null, null], _filter=null, _groupByExpressions=null, _havingFilter=null, _orderByExpressions=null, _limit=10, _offset=0, _queryOptions={responseFormat=sql, groupByMode=sql, timeoutMs=10000}, _expressionOverrideHints={}, _explain=false}
java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0
at org.apache.pinot.segment.local.segment.index.readers.constant.ConstantMVForwardIndexReader.getDictIdMV(ConstantMVForwardIndexReader.java:48) ~[classes/:?]
at org.apache.pinot.core.common.DataFetcher$ColumnValueReader.readStringValuesMV(DataFetcher.java:701) ~[classes/:?]
at org.apache.pinot.core.common.DataFetcher.fetchStringValues(DataFetcher.java:391) ~[classes/:?]
at org.apache.pinot.core.common.DataBlockCache.getStringValuesForMVColumn(DataBlockCache.java:462) ~[classes/:?]
at org.apache.pinot.core.operator.docvalsets.ProjectionBlockValSet.getStringValuesMV(ProjectionBlockValSet.java:179) ~[classes/:?]
at org.apache.pinot.core.common.RowBasedBlockValueFetcher.createFetcher(RowBasedBlockValueFetcher.java:84) ~[classes/:?]
at org.apache.pinot.core.common.RowBasedBlockValueFetcher.<init>(RowBasedBlockValueFetcher.java:33) ~[classes/:?]
at org.apache.pinot.core.operator.query.SelectionOnlyOperator.getNextBlock(SelectionOnlyOperator.java:97) ~[classes/:?]
at org.apache.pinot.core.operator.query.SelectionOnlyOperator.getNextBlock(SelectionOnlyOperator.java:40) ~[classes/:?]
at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:39) ~[classes/:?]
at org.apache.pinot.core.operator.combine.BaseCombineOperator.processSegments(BaseCombineOperator.java:158) ~[classes/:?]
at org.apache.pinot.core.operator.combine.BaseCombineOperator$1.runJob(BaseCombineOperator.java:101) [classes/:?]
at org.apache.pinot.core.util.trace.TraceRunnable.run(TraceRunnable.java:40) [classes/:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [guava-20.0.jar:?]
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [guava-20.0.jar:?]
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [guava-20.0.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
```
* Another exception seen, tangential to this issue, is a `NumberFormatException` in `SegmentsValidationAndRetentionConfig` and `SegmentStatusChecker`:
```
2022/06/02 11:32:53.528 ERROR [SegmentStatusChecker] [pool-17-thread-4] Caught exception while updating segment status for table meetupRsvp_REALTIME
java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:614) ~[?:?]
at java.lang.Integer.parseInt(Integer.java:770) ~[?:?]
at org.apache.pinot.spi.config.table.SegmentsValidationAndRetentionConfig.getReplicasPerPartitionNumber(SegmentsValidationAndRetentionConfig.java:176) ~[classes/:?]
at org.apache.pinot.controller.helix.SegmentStatusChecker.updateTableConfigMetrics(SegmentStatusChecker.java:143) ~[classes/:?]
at org.apache.pinot.controller.helix.SegmentStatusChecker.processTable(SegmentStatusChecker.java:113) ~[classes/:?]
at org.apache.pinot.controller.helix.SegmentStatusChecker.processTable(SegmentStatusChecker.java:56) ~[classes/:?]
at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.processTables(ControllerPeriodicTask.java:116) ~[classes/:?]
at org.apache.pinot.controller.helix.core.periodictask.ControllerPeriodicTask.runTask(ControllerPeriodicTask.java:85) ~[classes/:?]
at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:150) ~[classes/:?]
at org.apache.pinot.core.periodictask.BasePeriodicTask.run(BasePeriodicTask.java:135) ~[classes/:?]
at org.apache.pinot.core.periodictask.PeriodicTaskScheduler.lambda$start$0(PeriodicTaskScheduler.java:87) ~[classes/:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
```
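`Integer.parseInt` throws `NumberFormatException` when passed `null`, so this trace is consistent with a replicas-per-partition value that is missing from the table config. Below is a minimal sketch of the unguarded call and one possible guard, assuming the setting is held as a nullable string. The class, the helper, and the fallback value are hypothetical and are not Pinot's code or its eventual fix.

```java
public class ReplicasConfigSketch {
    // Hypothetical helper: parse a nullable numeric setting, falling back to a default.
    static int parseOrDefault(String value, int defaultValue) {
        if (value == null) {
            return defaultValue;
        }
        return Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        try {
            // The unguarded call: a null argument is rejected with NumberFormatException.
            Integer.parseInt((String) null);
        } catch (NumberFormatException e) {
            System.out.println("unguarded: " + e);
        }
        // The guarded calls: a missing value uses the default, a present one is parsed.
        System.out.println("guarded, missing: " + parseOrDefault(null, 1));
        System.out.println("guarded, present: " + parseOrDefault("3", 1));
    }
}
```

Whether a silent default is appropriate here, as opposed to rejecting the table config, is a design decision this thread does not settle.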
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] navina commented on issue #8819: Updating schema with new MV column causes index failures and stops consumption
Posted by GitBox <gi...@apache.org>.
navina commented on issue #8819:
URL: https://github.com/apache/pinot/issues/8819#issuecomment-1218392411
This issue can be reproduced by adding the following test query to `BaseClusterIntegrationTestSet#testReload`:
`"SELECT SUMMV(NewIntMVDimension) FROM " + rawTableName`
The existing integration tests did not catch the problem because simple `select *` queries are limited to 10 records by default. Raising the limit makes the query very expensive and likely to run out of heap space. Using an aggregation function on the column instead ensures that the query also reads from the consuming segment.
The test has been updated in PR #9205.
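The reasoning about query coverage can be illustrated with a toy model. This is a sketch under stated assumptions, not Pinot's query engine: segments are plain lists of multi-value rows, the last list stands in for the consuming segment, and a selection is assumed to stop reading once it has collected `limit` rows. All names are hypothetical.

```java
import java.util.List;

public class SegmentCoverageSketch {
    // Selection with a row limit: returns how many segments had to be read.
    static int segmentsReadBySelection(List<List<int[]>> segments, int limit) {
        int rowsCollected = 0;
        int segmentsRead = 0;
        for (List<int[]> segment : segments) {
            if (rowsCollected >= limit) {
                break; // limit satisfied, later segments are never touched
            }
            segmentsRead++;
            rowsCollected += Math.min(segment.size(), limit - rowsCollected);
        }
        return segmentsRead;
    }

    // Aggregation over an MV column: every value of every row in every segment is visited.
    static long sumMv(List<List<int[]>> segments) {
        long sum = 0;
        for (List<int[]> segment : segments) {
            for (int[] row : segment) {
                for (int value : row) {
                    sum += value;
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<int[]> completed = List.of(new int[] {1, 2}, new int[] {3});
        List<int[]> consuming = List.of(new int[] {4});
        List<List<int[]>> segments = List.of(completed, consuming);
        System.out.println("segments read by selection: " + segmentsReadBySelection(segments, 2));
        System.out.println("sum over all segments: " + sumMv(segments));
    }
}
```

In this model a small limit can be met from the first segment alone, while the aggregation cannot return without reading the last one, which is why the aggregate query exercises the consuming segment.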
[GitHub] [pinot] Jackie-Jiang closed issue #8819: Updating schema with new MV column causes index failures and stops consumption
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang closed issue #8819: Updating schema with new MV column causes index failures and stops consumption
URL: https://github.com/apache/pinot/issues/8819
[GitHub] [pinot] navina commented on issue #8819: Updating schema with new MV column causes index failures and stops consumption
Posted by GitBox <gi...@apache.org>.
navina commented on issue #8819:
URL: https://github.com/apache/pinot/issues/8819#issuecomment-1145058747
Looks like I can't assign it to myself. But I will pick this up!