Posted to commits@druid.apache.org by as...@apache.org on 2021/11/28 08:10:36 UTC

[druid] branch master updated: clarify avro support & general style improvements (#11975)

This is an automated email from the ASF dual-hosted git repository.

asdf2014 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new f536f31  clarify avro support & general style improvements (#11975)
f536f31 is described below

commit f536f31229cc4b842762646fa9ec0ba28148b1ed
Author: Charles Smith <te...@gmail.com>
AuthorDate: Sun Nov 28 00:10:18 2021 -0800

    clarify avro support & general style improvements (#11975)
    
    * clarify avro support & general style improvements
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update docs/development/extensions-core/avro.md
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
    
    * Update avro.md
    
    remove redundancy
    
    Co-authored-by: Katya Macedo  <38...@users.noreply.github.com>
---
 docs/development/extensions-core/avro.md | 54 +++++++++++++++-----------------
 1 file changed, 25 insertions(+), 29 deletions(-)

diff --git a/docs/development/extensions-core/avro.md b/docs/development/extensions-core/avro.md
index 70346d5..ac1b7ef 100644
--- a/docs/development/extensions-core/avro.md
+++ b/docs/development/extensions-core/avro.md
@@ -22,47 +22,43 @@ title: "Apache Avro"
   ~ under the License.
   -->
 
-## Avro extension
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format as follows:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
 
-This Apache Druid extension enables Druid to ingest and understand the Apache Avro data format. This extension provides 
-two Avro Parsers for stream ingestion and Hadoop batch ingestion. 
-See [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser) and [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser)
-for more details about how to use these in an ingestion spec.
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
 
-Additionally, it provides an InputFormat for reading Avro OCF files when using
-[native batch indexing](../../ingestion/native-batch.md), see [Avro OCF](../../ingestion/data-formats.md#avro-ocf)
-for details on how to ingest OCF files.
+## Load the Avro extension
 
-Make sure to [include](../../development/extensions.md#loading-extensions) `druid-avro-extensions` in the extensions load list.
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../development/extensions.md#loading-extensions) for more information.
 
-### Avro Types
+## Avro types
 
-Druid supports most Avro types natively, there are however some exceptions which are detailed here.
+Druid supports most Avro types natively. This section describes some exceptions.
 
-#### Unions
+### Unions
 Druid has two modes for supporting `union` types.
 
-The default mode will treat unions as a single value regardless of the type it is populated with.
+The default mode treats unions as a single value regardless of the type of data populating the union.
 
-If you wish to operate on each different member of a union however you can set `extractUnionsByType` on the Avro parser in which case unions will be expanded into nested objects according to the following rules:
-* Primitive types and unnamed complex types are keyed their type name. i.e `int`, `string`
-* Complex named types are keyed by their names, this includes `record`, `fixed` and `enum`.
-* The Avro null type is elided as its value can only ever be null
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed by their type name, such as `int` and `string`.
+- Complex named types are keyed by their names. This includes `record`, `fixed`, and `enum`.
+- The Avro null type is elided as its value can only ever be null.
 
-This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed.
-i.e only a single array is allowed, multiple records (or other named types) are allowed as long as each has a unique name.
+This is safe because an Avro union can only contain a single member of each unnamed type and duplicates of the same named type are not allowed. For example, a union can contain only a single array, but it can contain multiple records (or other named types) as long as each has a unique name.
 
-The members can then be accessed using a [flattenSpec](../../ingestion/data-formats.md#flattenspec) similar other nested types.
+You can then access the members of the union with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) like you would for other nested types.
 
-#### Binary types
-`bytes` and `fixed` Avro types will be returned by default as base64 encoded strings unless the `binaryAsString` option is enabled on the Avro parser.
-This setting will decode these types as UTF-8 strings.
+### Binary types
+The extension returns `bytes` and `fixed` Avro types as base64 encoded strings by default. To decode these types as UTF-8 strings, enable the `binaryAsString` option on the Avro parser.
 
-#### Enums
-`enum` types will be returned as `string` of the enum symbol.
+### Enums
+The extension returns `enum` types as a `string` containing the enum symbol.
 
-#### Complex types
-`record` and `map` types representing nested data can be ingested using [flattenSpec](../../ingestion/data-formats.md#flattenspec) on the parser.
+### Complex types
+You can ingest `record` and `map` types representing nested data with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) on the parser.
 
-#### Logical types
-Druid doesn't currently support Avro logical types, they will be ignored and fields will be handled according to the underlying primitive type.
+### Logical types
+Druid does not currently support Avro logical types. It ignores them and handles fields according to the underlying primitive type.

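As a reading aid for the union and binary-type rules in the updated avro.md text above, here is a minimal Python sketch. It is not Druid's implementation: `union_member_key` and `render_binary` are hypothetical helpers invented for illustration, and the sketch assumes union branches are given as Avro schema fragments (bare strings for primitives, dicts for complex types) and that a named type is keyed by its `name` field exactly as written.

```python
import base64

# Hypothetical helpers for illustration only; none of these names exist in Druid.
NAMED_TYPES = {"record", "fixed", "enum"}


def union_member_key(schema):
    """Return the key for one union branch, or None for the elided null type."""
    if isinstance(schema, str):
        # Primitive types appear as bare strings in an Avro schema.
        return None if schema == "null" else schema
    avro_type = schema["type"]
    if avro_type in NAMED_TYPES:
        # Named complex types are keyed by their own name (assumed: the "name" field as-is).
        return schema["name"]
    # Unnamed complex types (array, map) are keyed by their type name.
    return avro_type


def render_binary(value: bytes, binary_as_string: bool = False) -> str:
    """Mimic the described handling of `bytes` and `fixed` values."""
    if binary_as_string:
        # With the option enabled, the bytes are decoded as UTF-8 text.
        return value.decode("utf-8")
    # Default: base64-encoded string.
    return base64.b64encode(value).decode("ascii")


union = [
    "null",
    "int",
    {"type": "array", "items": "string"},
    {"type": "record", "name": "User", "fields": []},
]
keys = [union_member_key(branch) for branch in union]
```

With the sample union, `keys` holds `None` for the null branch followed by `int`, `array`, and `User`. Because each unnamed type can appear at most once and named types must have unique names, the keys cannot collide, which is the safety argument the doc text makes.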