Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/04/23 06:26:32 UTC

[GitHub] [incubator-druid] mihai-cazacu-adswizz edited a comment on issue #7510: [materialized view] IOException: user supplied segments list did not match with segments list obtained from db

mihai-cazacu-adswizz edited a comment on issue #7510: [materialized view] IOException: user supplied segments list did not match with segments list obtained from db
URL: https://github.com/apache/incubator-druid/issues/7510#issuecomment-485418674
 
 
   I've added some logging to the segment list check:
   
   ```
   if (segmentsList.size() == userSuppliedSegmentsList.size()) {
     Set<DataSegment> segmentsSet = new HashSet<>(segmentsList);
   
     for (DataSegment userSegment : userSuppliedSegmentsList) {
       if (!segmentsSet.contains(userSegment)) {
         log.error("Segment %s is not present in %s",
                 userSegment,
                 segmentsSet.stream()
                         .map(DataSegment::toString)
                         .collect(Collectors.joining(","))
         );
         throw new IOException("user supplied segments list did not match with segments list obtained from db");
       }
     }
   } else {
     log.error("Different segment size: %d != %d", segmentsList.size(), userSuppliedSegmentsList.size());
     throw new IOException("user supplied segments list did not match with segments list obtained from db");
   }
   ```
   
   and this is the error message:
   
   ```
   2019-04-22T13:15:22,551 ERROR [task-runner-0-priority-0] org.apache.druid.indexer.HadoopIngestionSpec - Different segment size: 37 != 7
   2019-04-22T13:15:22,551 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.common.task.HadoopIndexTask - Encountered exception in run():
   java.io.IOException: user supplied segments list did not match with segments list obtained from db
   	at org.apache.druid.indexer.HadoopIngestionSpec.updateSegmentListIfDatasourcePathSpecIsUsed(HadoopIngestionSpec.java:204) ~[druid-indexing-hadoop-0.13.1-incubating-SNAPSHOT.jar:0.13.1-incubating-SNAPSHOT]
   	at org.apache.druid.indexing.common.task.HadoopIndexTask.runInternal(HadoopIndexTask.java:265) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:232) [druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:421) [druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:393) [druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_201]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
   ```
   
   In Aurora, the following query returns `7` rows:
   
   ```
   SELECT *
   FROM druid_segments
   WHERE
       dataSource = "the-data-source"
       AND used   = true
       AND start  >= "2018-01-29T00:00:00.000Z"
       AND end    <= "2018-02-05T00:00:00.000Z";
   ```
   
   which matches the second number in the log line above, i.e. the size of `userSuppliedSegmentsList`.
   
   S3 also lists `7` prefixes for that interval:
   ```
   aws s3 ls --profile meru-prod s3://.../segments/the-data-source/2018-01-29T00:00:00.000Z_2018-02-05T00:00:00.000Z/2018-05-24T09:30:16.545Z/
                              PRE 0/
                              PRE 1/
                              PRE 2/
                              PRE 3/
                              PRE 4/
                              PRE 5/
                              PRE 6/
   ```
   
   I don't know yet why `segmentsList` has `37` elements.
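   
   One way to narrow this down might be to log which entries appear in only one of the two lists, rather than only the two sizes. Below is a minimal sketch of that idea. It uses plain strings as hypothetical stand-ins for `DataSegment`; the class `SegmentListDiff`, the helper `onlyIn`, and the sample identifiers are made up for illustration and are not part of Druid:
   
   ```java
   import java.util.Arrays;
   import java.util.LinkedHashSet;
   import java.util.List;
   import java.util.Set;
   
   public class SegmentListDiff {
     // Hypothetical helper: returns the elements of `left` that are
     // missing from `right`, keeping the encounter order of `left`.
     public static <T> Set<T> onlyIn(List<T> left, List<T> right) {
       Set<T> result = new LinkedHashSet<>(left);
       result.removeAll(right);
       return result;
     }
   
     public static void main(String[] args) {
       // Made-up identifiers standing in for DataSegment objects.
       List<String> fromDb = Arrays.asList("seg_0", "seg_1", "seg_2");
       List<String> userSupplied = Arrays.asList("seg_0", "seg_1");
   
       // Print both directions of the difference so that extra
       // entries on either side become visible.
       System.out.println("Only in db list: " + onlyIn(fromDb, userSupplied));
       System.out.println("Only in user list: " + onlyIn(userSupplied, fromDb));
     }
   }
   ```
   
   Applied to `segmentsList` and `userSuppliedSegmentsList`, the same two calls would presumably show which of the `37` entries are not among the `7` user-supplied ones, assuming `DataSegment` equality behaves as the existing `HashSet` check expects.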

