Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/05/17 11:44:12 UTC

[GitHub] [druid] quanghuynguyen1902 opened a new issue #11263: Can't stream data with kafka when use druid with docker

quanghuynguyen1902 opened a new issue #11263:
URL: https://github.com/apache/druid/issues/11263


   I have a docker-compose file as follows:
   
   ```yaml
     zookeeper:
       container_name: zookeeper
       image: zookeeper:3.5
       networks:
         - nginx
       environment:
         - ZOO_MY_ID=1
   
     coordinator:
       image: apache/druid:0.20.2
       container_name: coordinator
       volumes:
         - ./storage:/opt/data
         - ./coordinator_var:/opt/druid/var
       depends_on: 
         - zookeeper
         - postgres
       ports:
         - "8081:8081"
       command:
         - coordinator
       networks:
         - nginx
       env_file:
         - environment
   
     broker:
       image: apache/druid:0.20.2
       container_name: broker
       volumes:
         - ./broker_var:/opt/druid/var
       depends_on: 
         - zookeeper
         - postgres
         - coordinator
       ports:
         - "8082:8082"
       command:
         - broker
       networks:
         - nginx
       env_file:
         - environment
   
     historical:
       image: apache/druid:0.20.2
       container_name: historical
       volumes:
         - ./storage:/opt/data
         - ./historical_var:/opt/druid/var
       depends_on: 
         - zookeeper
         - postgres
         - coordinator
       ports:
         - "8083:8083"
       command:
         - historical
       networks:
         - nginx
       env_file:
         - environment
   
     middlemanager:
       image: apache/druid:0.20.2
       container_name: middlemanager
       volumes:
         - ./storage:/opt/data
         - ./middle_var:/opt/druid/var
       depends_on: 
         - zookeeper
         - postgres
         - coordinator
       ports:
         - "8091:8091"
       command:
         - middleManager
       networks:
         - nginx
       env_file:
         - environment
   
     router:
       image: apache/druid:0.20.2
       container_name: router
       volumes:
         - ./router_var:/opt/druid/var
       depends_on:
         - zookeeper
         - postgres
         - coordinator
       ports:
         - "8888:8888"
       command:
         - router
       networks:
         - nginx
       env_file:
         - environment
   ```
   
   and I connect to the Kafka stream as follows:
   ```python
   import requests
   
   # API endpoint to submit a supervisor spec to Druid
   druidURL = 'http://coordinator:8081/druid/indexer/v1/supervisor'
   druid_schema_path = "kafka.json"
   headers =  {'content-type': 'application/json'}
   with open(druid_schema_path, 'rb') as f:
       response = requests.post(druidURL, headers=headers, data=f).json()
       print(response)
   ```
   and the file kafka.json:
   ```json
   {
       "type": "kafka",
       "dataSchema": {
         "dataSource": "requests",
         "timestampSpec": {
           "column": "timestamp"
         },
         "dimensionsSpec": {
           "dimensions" : [
             "created_at",
             "client",
             "status",
             "url",
             "user_agent",
             "request_method",
             "upstream_connect_time",
             "upstream_header_time",
             "upstream_response_time",
             "request_time",
             "size",
             "user_id",
             "app_id"
           ]
         },
         "granularitySpec": {
           "type": "uniform",
           "segmentGranularity": "HOUR",
           "queryGranularity": "NONE"
         }
       },
       "ioConfig": {
         "topic": "requests",
         "inputFormat": {
           "type": "json"
         },
         "consumerProperties": {
           "bootstrap.servers": "kafka:9092"
         },
         "taskCount": 1,
         "replicas": 1,
         "taskDuration": "PT1H"
       },
       "tuningConfig": {
         "type": "kafka",
         "maxRowsPerSegment": 5000000
       }
     }
   ```
   When I run the Python file, I get this error:
   ```
   Could not resolve type id 'kafka' as a subtype of org.apache.druid.indexing.overlord.supervisor.SupervisorSpec
   ```
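   This error means the server could not deserialize a supervisor spec of type `kafka`, which points at the classpath rather than the spec itself. One way to see which extension modules a running Druid node has loaded is its `/status` endpoint, which returns a `modules` list; the helper below is only a sketch of that check (the `coordinator:8081` host/port are the values from the compose file above):

```python
def has_kafka_indexing(status_json):
    """Return True if a Kafka-related extension (e.g. druid-kafka-indexing-service)
    appears among the loaded modules reported by Druid's /status endpoint."""
    modules = status_json.get("modules", [])
    return any("kafka" in (m.get("artifact") or "") for m in modules)

# Against a running node (host/port from the compose file above):
# import requests
# status = requests.get("http://coordinator:8081/status").json()
# print(has_kafka_indexing(status))
```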
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on issue #11263: Can't stream data with kafka when use druid with docker

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on issue #11263:
URL: https://github.com/apache/druid/issues/11263#issuecomment-852734803


   Seems like the Kafka indexing extension is not loaded. Is `druid-kafka-indexing-service` included in the `druid_extensions_loadList` property defined in the `environment` file?
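   For reference, the Druid Docker setup reads extensions from a `druid_extensions_loadList` line in the `environment` file. The list below is an illustrative sketch, not the asker's actual file; the key point is that `druid-kafka-indexing-service` must appear in it:

   ```
   druid_extensions_loadList=["druid-histogram", "druid-datasketches", "postgresql-metadata-storage", "druid-kafka-indexing-service"]
   ```

   After editing the file, the containers need to be recreated (e.g. `docker-compose up -d`) for the change to take effect.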

