Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2022/01/05 09:31:44 UTC

[GitHub] [druid] junegunn opened a new issue #12120: `/druid/coordinator/v1/intervals` not returning all intervals for all datasources

junegunn opened a new issue #12120:
URL: https://github.com/apache/druid/issues/12120


   ### Affected Version
   
   0.21.0
   
   ### Description
   
   [The documentation](https://druid.apache.org/docs/latest/operations/api-reference.html#get-7) states that `/druid/coordinator/v1/intervals` returns *all intervals for all datasources*, but that doesn't seem to be the case. The intervals returned by the API are a small subset of the actual segment intervals in the cluster.
   
   Is this a bug or is this just a case of outdated documentation?
   
   FYI, here is the Ruby script I used to examine the result of the API:
   
   ```ruby
   #!/usr/bin/env ruby
   # frozen_string_literal: true
   
   require 'net/http'
   require 'json'
   
   ROUTER_HOST = 'FILL_IN'
   
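   # The datasource to inspect is required as the first argument
   datasource = ARGV.first or abort "usage: #{$PROGRAM_NAME} DATASOURCE"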
   
   def get(url)
     Net::HTTP.get(URI(url), 'Authorization' => ENV['DRUID_AUTH'])
   end
   
   # /druid/coordinator/v1/datasources?simple
   all_datasources = JSON.parse(
     get("http://#{ROUTER_HOST}:8888/druid/coordinator/v1/datasources?simple"), symbolize_names: true
   )
   puts "Total number of segments: #{all_datasources.sum { _1.dig(:properties, :segments, :count) }}"
   puts "  #{datasource}: #{all_datasources.find { _1[:name] == datasource }.dig(:properties, :segments, :count)}"
   puts
   
   # /druid/coordinator/v1/intervals
   all_intervals = JSON.parse(get("http://#{ROUTER_HOST}:8888/druid/coordinator/v1/intervals"))
   puts "Number of all distinct intervals: #{all_intervals.length}"
   puts "Number of all intervals for all datasources: #{all_intervals.values.flat_map(&:values).length}"
   
   intervals = all_intervals.select { _2.key?(datasource) }.keys
   puts "Number of intervals for #{datasource}: #{intervals.length}"
   puts "Last 10 intervals for #{datasource}:"
   puts intervals.sort.last(10).reverse.map { "  - #{_1}" }
   puts
   
   # /druid/coordinator/v1/datasources/{datasource}/intervals
   intervals = JSON.parse(get("http://#{ROUTER_HOST}:8888/druid/coordinator/v1/datasources/#{datasource}/intervals"))
   puts "Number of intervals (datasource specified): #{intervals.length}"
   ```
   
   And it reports:
   
   ```
   Total number of segments: 616782
     pivot-action-log: 7097
   
   Number of all distinct intervals: 10055
   Number of all intervals for all datasources: 26739
   Number of intervals for pivot-action-log: 176
   Last 10 intervals for pivot-action-log:
     - 2022-01-04T14:00:00.000Z/2022-01-04T15:00:00.000Z
     - 2022-01-02T11:00:00.000Z/2022-01-02T12:00:00.000Z
     - 2022-01-01T01:00:00.000Z/2022-01-01T02:00:00.000Z
     - 2022-01-01T00:00:00.000Z/2022-01-01T01:00:00.000Z
     - 2021-12-31T08:00:00.000Z/2021-12-31T09:00:00.000Z
     - 2021-12-30T13:00:00.000Z/2021-12-30T14:00:00.000Z
     - 2021-12-28T19:00:00.000Z/2021-12-28T20:00:00.000Z
     - 2021-12-28T10:00:00.000Z/2021-12-28T11:00:00.000Z
     - 2021-12-27T11:00:00.000Z/2021-12-27T12:00:00.000Z
     - 2021-12-26T08:00:00.000Z/2021-12-26T09:00:00.000Z
   
   Number of intervals (datasource specified): 7076
   ```
   
   (We can clearly see gaps in the intervals)
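   
   To pin down exactly which intervals are missing, the two responses can be diffed. Below is a minimal sketch, not part of the script above. The helper name `missing_intervals` is hypothetical, and it assumes that the per-datasource endpoint returns a JSON array of interval strings formatted the same way as the keys of the `/druid/coordinator/v1/intervals` response. The sample data at the bottom is made up for illustration and is not real cluster output.
   
   ```ruby
   # frozen_string_literal: true
   
   require 'set'
   
   # Hypothetical helper: returns the intervals listed for a datasource by
   # /druid/coordinator/v1/datasources/{datasource}/intervals that do not
   # appear for that datasource in /druid/coordinator/v1/intervals.
   #
   # all_intervals:        Hash of interval => { datasource => ... }
   # datasource_intervals: Array of interval strings (assumed shape)
   def missing_intervals(all_intervals, datasource, datasource_intervals)
     reported = all_intervals
                .select { |_interval, sources| sources.key?(datasource) }
                .keys
                .to_set
     datasource_intervals.reject { |interval| reported.include?(interval) }.sort
   end
   
   # Illustrative sample data only
   sample_all = {
     '2022-01-04T14:00:00.000Z/2022-01-04T15:00:00.000Z' => { 'pivot-action-log' => {} },
     '2022-01-04T13:00:00.000Z/2022-01-04T14:00:00.000Z' => { 'other-datasource' => {} }
   }
   sample_ds = [
     '2022-01-04T13:00:00.000Z/2022-01-04T14:00:00.000Z',
     '2022-01-04T14:00:00.000Z/2022-01-04T15:00:00.000Z'
   ]
   
   # Prints the 13:00-14:00 interval, which the cluster-wide response
   # does not list for pivot-action-log
   puts missing_intervals(sample_all, 'pivot-action-log', sample_ds)
   ```
   
   With the real responses, `all_intervals` and the second `intervals` from the script above could be passed in directly to list the intervals that the cluster-wide endpoint omits.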


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


