Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/31 10:59:32 UTC

[GitHub] [druid] clintropolis commented on pull request #11040: add avro stream input format

clintropolis commented on pull request #11040:
URL: https://github.com/apache/druid/pull/11040#issuecomment-810975092


   >Sorry, I don't understand what you said. Do you mean an integration test is being developed for the Avro stream, and when it is finished, I can add a new JSON file for this test? Or do I need to create this integration test myself?
   
   Ah sorry, let me try to explain a bit more. In the case of the Avro inline schema and Avro schema registry formats, you should be able to just add the JSON files with the `InputFormat` template and get the tests for free. The integration test [`ITKafkaIndexingServiceDataFormatTest`](https://github.com/apache/druid/blob/master/integration-tests/src/test/java/org/apache/druid/tests/parallelized/ITKafkaIndexingServiceDataFormatTest.java) runs the same data through Kafka streaming with a variety of the data formats that Kafka ingestion supports. It works by iterating over the JSON templates in https://github.com/apache/druid/tree/master/integration-tests/src/test/resources/stream/data, testing each data format present in that directory against the same set of data. Each of these formats has a corresponding [`EventSerializer`](https://github.com/apache/druid/blob/master/integration-tests/src/main/java/org/apache/druid/testing/utils/EventSerializer.java) implementation, which writes the test data to the Kafka stream in that format; the parser or inputFormat templates are then used to construct a supervisor that reads the data back and runs the actual tests to verify everything is working correctly.
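   
   To make the template idea concrete, here is a rough sketch of the kind of `inputFormat` fragment those templates carry, using the CSV format as an example (the column names here are made up for illustration; the actual templates in the repo are the source of truth):
   
   ```json
   {
     "type": "csv",
     "findColumnsFromHeader": false,
     "columns": ["timestamp", "page", "language", "user", "added", "deleted"]
   }
   ```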
   
   [`AvroEventSerializer`](https://github.com/apache/druid/blob/master/integration-tests/src/main/java/org/apache/druid/testing/utils/AvroEventSerializer.java) and [`AvroSchemaRegistryEventSerializer`](https://github.com/apache/druid/blob/master/integration-tests/src/main/java/org/apache/druid/testing/utils/AvroSchemaRegistryEventSerializer.java) are already present, because there are integration tests using the Avro stream parsers ([avro inline](https://github.com/apache/druid/blob/master/integration-tests/src/test/resources/stream/data/avro/parser/input_row_parser.json) and [avro schema registry](https://github.com/apache/druid/blob/master/integration-tests/src/test/resources/stream/data/avro_schema_registry/parser/input_row_parser.json)); they are just missing the input format templates, because the Avro `InputFormat` didn't exist until this PR. (JSON, CSV, and TSV do have input format templates, [which might be useful as a reference](https://github.com/apache/druid/tree/master/integration-tests/src/test/resources/stream/data).) You should just be able to adapt those parser templates into equivalent input format templates (see the sketches below).
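   
   For what it's worth, I'd guess the adapted templates end up looking roughly like the following, assuming the `avro_stream` input format added in this PR accepts the same `avroBytesDecoder` spec as the parser does; the actual Avro schema, field names, and registry URL should come from the existing parser templates. The inline schema variant:
   
   ```json
   {
     "type": "avro_stream",
     "binaryAsString": false,
     "avroBytesDecoder": {
       "type": "schema_inline",
       "schema": {
         "namespace": "org.apache.druid",
         "name": "wikipedia",
         "type": "record",
         "fields": [
           { "name": "timestamp", "type": "string" },
           { "name": "page", "type": "string" },
           { "name": "language", "type": "string" }
         ]
       }
     }
   }
   ```
   
   And the schema registry variant (the registry URL is just a placeholder):
   
   ```json
   {
     "type": "avro_stream",
     "avroBytesDecoder": {
       "type": "schema_registry",
       "url": "http://schema-registry:8081"
     }
   }
   ```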


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org