Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/11/22 19:00:07 UTC

[GitHub] [druid] techdocsmith opened a new pull request #11975: clarify avro support & general style improvements

techdocsmith opened a new pull request #11975:
URL: https://github.com/apache/druid/pull/11975


   This PR updates the Avro extension doc to clarify which input formats and parsers are currently supported.
   
   This PR has:
   - [x] been self-reviewed.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r755835703



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:

Review comment:
       "extension enables Druid" ... "extension enables the following.." seems repetitive






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754569276



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`
+- Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.

Review comment:
       ```suggestion
   - Complex named types are keyed by their names, this includes `record`, `fixed`, and `enum`.
   ```






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754575578



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`
+- Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
+- The Avro null type is elided as its value can only ever be null
 
-This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed.
-i.e only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
+This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed. For example: only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
 
-The members can then be accessed using a [flattenSpec](../../ingestion/data-formats.md#flattenspec) similar other nested types.
+You can then access the members of the union with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) like you would for other nested types.
 
-#### Binary types
-`bytes` and `fixed` Avro types will be returned by default as base64 encoded strings unless the `binaryAsString` option is enabled on the Avro parser.
-This setting will decode these types as UTF-8 strings.
+### Binary types
+The extension returns `bytes` and `fixed` Avro types as base64 encoded strings by default. If you enable the `binaryAsString` option on the Avro parser, the extension decodes these types as UTF-8 strings.

Review comment:
       ```suggestion
   By default, the extension returns `bytes` and `fixed` Avro types as base64 encoded strings. To decode these types as UTF-8 strings, enable the `binaryAsString` option on the Avro parser.
   ```
   Moved some text around to improve readability.
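
To illustrate the two representations described in the suggestion above, here is a minimal Python sketch. It is not Druid code: `render_binary` and its `binary_as_string` parameter are hypothetical names that only mimic the documented effect of the `binaryAsString` option on `bytes` and `fixed` values.

```python
import base64


def render_binary(value: bytes, binary_as_string: bool = False) -> str:
    """Return the string form of an Avro bytes/fixed value.

    Hypothetical helper: mirrors the documented behavior, not Druid's code.
    """
    if binary_as_string:
        # Option enabled: interpret the raw bytes as UTF-8 text.
        return value.decode("utf-8")
    # Default: return the raw bytes as a base64 encoded string.
    return base64.b64encode(value).decode("ascii")


raw = b"druid"
print(render_binary(raw))                         # ZHJ1aWQ=
print(render_binary(raw, binary_as_string=True))  # druid
```

The UTF-8 branch raises `UnicodeDecodeError` in this sketch when the bytes are not valid UTF-8. How the extension itself handles such input is not stated in the doc under review.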






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754569453



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`
+- Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
+- The Avro null type is elided as its value can only ever be null

Review comment:
       ```suggestion
   - The Avro null type is elided as its value can only ever be null.
   ```






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754566091



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`

Review comment:
       ```suggestion
   - Primitive types and unnamed complex types are keyed by their type name, such as `int`, `string`.
   ```






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754566091



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`

Review comment:
       ```suggestion
   - Primitive types and unnamed complex types are keyed by their type name, such as `int` and `string`.
   ```






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754571156



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`
+- Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
+- The Avro null type is elided as its value can only ever be null
 
-This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed.
-i.e only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
+This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed. For example: only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.

Review comment:
       ```suggestion
   This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed. For example, only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
   ```
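
As a rough illustration of the keying rules and why they cannot collide, the following Python sketch wraps a union value in a one-key object. It is not the extension's implementation: `expand_union` and `named_type` are hypothetical, the Python-to-Avro type mapping is a simplification (a real decoder takes the branch from the schema, for example to tell `int` from `long`), and the exact output shape is assumed from the doc text.

```python
from typing import Any, Optional

# Simplified mapping from Python runtime types to Avro type names.
# This is an assumption for the sketch, not how Avro resolves branches.
_UNNAMED_TYPE_KEYS = [
    (bool, "boolean"),  # checked before int because bool subclasses int
    (int, "int"),
    (float, "double"),
    (str, "string"),
    (bytes, "bytes"),
    (list, "array"),
    (dict, "map"),
]


def expand_union(value: Any, named_type: Optional[str] = None) -> Optional[dict]:
    """Wrap a union value in an object keyed by its type name (hypothetical)."""
    if value is None:
        # The null member is elided: its value can only ever be null.
        return None
    if named_type is not None:
        # Named types (record, fixed, enum) are keyed by their own name.
        return {named_type: value}
    for py_type, avro_name in _UNNAMED_TYPE_KEYS:
        if isinstance(value, py_type):
            # Primitive and unnamed complex types are keyed by type name.
            return {avro_name: value}
    raise TypeError(f"unsupported union member: {type(value).__name__}")


print(expand_union(5))                              # {'int': 5}
print(expand_union([1, 2]))                         # {'array': [1, 2]}
print(expand_union({"id": 1}, named_type="User"))   # {'User': {'id': 1}}
print(expand_union(None))                           # None
```

Because a union holds at most one member of each unnamed type and named types must have unique names, each key in this shape identifies exactly one member, which is what lets a flattenSpec address it like any other nested field.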






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754561850



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types

Review comment:
       ```suggestion
   ## Avro types
   ```






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754583908



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`
+- Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
+- The Avro null type is elided as its value can only ever be null
 
-This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed.
-i.e only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
+This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed. For example: only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
 
-The members can then be accessed using a [flattenSpec](../../ingestion/data-formats.md#flattenspec) similar other nested types.
+You can then access the members of the union with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) like you would for other nested types.
 
-#### Binary types
-`bytes` and `fixed` Avro types will be returned by default as base64 encoded strings unless the `binaryAsString` option is enabled on the Avro parser.
-This setting will decode these types as UTF-8 strings.
+### Binary types
+The extension returns `bytes` and `fixed` Avro types as base64 encoded strings by default. If you enable the `binaryAsString` option on the Avro parser, the extension decodes these types as UTF-8 strings.
 
-#### Enums
-`enum` types will be returned as `string` of the enum symbol.
+### Enums
+The extension returns `enum` types  as `string` of the enum symbol.
 
-#### Complex types
-`record` and `map` types representing nested data can be ingested using [flattenSpec](../../ingestion/data-formats.md#flattenspec) on the parser.
+### Complex types
+You can ingest `record` and `map` types representing nested data with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) on the parser.
 
-#### Logical types
-Druid doesn't currently support Avro logical types, they will be ignored and fields will be handled according to the underlying primitive type.
+### Logical types
+Druid doesn't currently support Avro logical types. It ignores them and handles fields according to the underlying primitive type.

Review comment:
       ```suggestion
   Druid does not currently support Avro logical types. It ignores them and handles fields according to the underlying primitive type.
   ```






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754566091



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`

Review comment:
       ```suggestion
   - Primitive types and unnamed complex types are keyed by their type name. For example: `int`, `string`.
   ```
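
To illustrate the keying rules quoted in the diff above, here is a minimal Python sketch. It is not Druid's implementation: the function name `expand_union_value` and its arguments are hypothetical, and it assumes the caller already knows which union branch produced the value.

```python
from typing import Any, Dict, Optional


def expand_union_value(value: Any, branch_name: Optional[str]) -> Dict[str, Any]:
    """Hypothetical helper: wrap a union value in an object keyed by its branch.

    branch_name is the Avro type name for primitives and unnamed complex
    types (for example "int" or "array"), or the declared name for named
    types (record, fixed, enum). Pass "null" for the null branch.
    """
    if branch_name is None or branch_name == "null":
        # The null branch is elided: its value can only ever be null.
        return {}
    return {branch_name: value}


# "UserRecord" is an assumed record name used only for illustration.
print(expand_union_value(42, "int"))
print(expand_union_value({"id": 7}, "UserRecord"))
print(expand_union_value(None, "null"))
```

Under these assumptions, a flattenSpec path expression along the lines of `$.someField.int` would select the `int` member of the expanded union; `someField` is a placeholder field name.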






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754566091



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`

Review comment:
       ```suggestion
   - Primitive types and unnamed complex types are keyed by their type name. For example: `int`, `string`
   ```






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754561644



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).

Review comment:
       ```suggestion
   To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions) for more information.
   ```






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754575578



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`
+- Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
+- The Avro null type is elided as its value can only ever be null
 
-This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed.
-i.e only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
+This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed. For example: only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
 
-The members can then be accessed using a [flattenSpec](../../ingestion/data-formats.md#flattenspec) similar other nested types.
+You can then access the members of the union with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) like you would for other nested types.
 
-#### Binary types
-`bytes` and `fixed` Avro types will be returned by default as base64 encoded strings unless the `binaryAsString` option is enabled on the Avro parser.
-This setting will decode these types as UTF-8 strings.
+### Binary types
+The extension returns `bytes` and `fixed` Avro types as base64 encoded strings by default. If you enable the `binaryAsString` option on the Avro parser, the extension decodes these types as UTF-8 strings.

Review comment:
       ```suggestion
   The extension returns `bytes` and `fixed` Avro types as base64 encoded strings by default. To decode these types as UTF-8 strings, enable the `binaryAsString` option on the Avro parser.
   ```
   Moved some text around to improve readability.
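
For readers unfamiliar with the two representations, the following Python sketch shows the difference between the default base64 output and the UTF-8 decoding described in the suggestion. It only illustrates the encodings; `render_binary` and its `binary_as_string` flag are hypothetical names that mirror the `binaryAsString` option, not part of Druid.

```python
import base64


def render_binary(value: bytes, binary_as_string: bool = False) -> str:
    """Hypothetical helper mirroring the behavior described above."""
    if binary_as_string:
        # Interpret the raw bytes as UTF-8 text.
        return value.decode("utf-8")
    # Default: return the raw bytes as a base64 encoded string.
    return base64.b64encode(value).decode("ascii")


print(render_binary(b"druid"))                          # ZHJ1aWQ=
print(render_binary(b"druid", binary_as_string=True))   # druid
```

Base64 is the safer default because arbitrary `bytes` or `fixed` values are not guaranteed to be valid UTF-8; in this sketch, decoding invalid input with the flag enabled would raise `UnicodeDecodeError`.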






[GitHub] [druid] ektravel commented on a change in pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
ektravel commented on a change in pull request #11975:
URL: https://github.com/apache/druid/pull/11975#discussion_r754576394



##########
File path: docs/development/extensions-core/avro.md
##########
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format. The Avro extension enables the following ingestion input methods:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions).
 
-### Avro Types
+## Avro Types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some  exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed their type name. For example: `int`, `string`
+- Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
+- The Avro null type is elided as its value can only ever be null
 
-This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed.
-i.e only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
+This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed. For example: only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
 
-The members can then be accessed using a [flattenSpec](../../ingestion/data-formats.md#flattenspec) similar other nested types.
+You can then access the members of the union with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) like you would for other nested types.
 
-#### Binary types
-`bytes` and `fixed` Avro types will be returned by default as base64 encoded strings unless the `binaryAsString` option is enabled on the Avro parser.
-This setting will decode these types as UTF-8 strings.
+### Binary types
+The extension returns `bytes` and `fixed` Avro types as base64 encoded strings by default. If you enable the `binaryAsString` option on the Avro parser, the extension decodes these types as UTF-8 strings.
 
-#### Enums
-`enum` types will be returned as `string` of the enum symbol.
+### Enums
+The extension returns `enum` types  as `string` of the enum symbol.

Review comment:
       ```suggestion
   The extension returns `enum` types as `string` of the enum symbol.
   ```






[GitHub] [druid] asdf2014 merged pull request #11975: clarify avro support & general style improvements

Posted by GitBox <gi...@apache.org>.
asdf2014 merged pull request #11975:
URL: https://github.com/apache/druid/pull/11975


   

