Posted to commits@druid.apache.org by vi...@apache.org on 2023/05/19 18:22:46 UTC

[druid] branch docs-kafka-format created (now 3ff071dfa4)

This is an automated email from the ASF dual-hosted git repository.

victoria pushed a change to branch docs-kafka-format
in repository https://gitbox.apache.org/repos/asf/druid.git


      at 3ff071dfa4 list the encoding formats

This branch includes the following new commits:

     new 3ff071dfa4 list the encoding formats

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  Revisions
listed as "add" were already present in the repository and have only
been added to this reference.



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[druid] 01/01: list the encoding formats

Posted by vi...@apache.org.

victoria pushed a commit to branch docs-kafka-format
in repository https://gitbox.apache.org/repos/asf/druid.git

commit 3ff071dfa411884b90c3dc18e4d593fc5db27a69
Author: Victoria Lim <vt...@users.noreply.github.com>
AuthorDate: Fri May 19 11:22:39 2023 -0700

    list the encoding formats
---
 docs/development/extensions-core/kafka-ingestion.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/docs/development/extensions-core/kafka-ingestion.md b/docs/development/extensions-core/kafka-ingestion.md
index 7a4b49f173..46426e55f2 100644
--- a/docs/development/extensions-core/kafka-ingestion.md
+++ b/docs/development/extensions-core/kafka-ingestion.md
@@ -153,7 +153,13 @@ You would configure it as follows:
 
 - `valueFormat`: Define how to parse the payload value. Set this to the payload parsing input format (`{ "type": "json" }`).
 - `timestampColumnName`: Supply a custom name for the Kafka timestamp in the Druid schema to avoid conflicts with columns from the payload. The default is `kafka.timestamp`.
-- `headerFormat`: The default "string" decodes UTF8-encoded strings from the Kafka header. If you need another format, you can implement your own parser.
+- `headerFormat`: The default value `string` decodes strings in UTF-8 encoding from the Kafka header.
+   Other supported encoding formats include the following:
+   - `ISO-8859-1`: ISO Latin Alphabet No. 1, that is, ISO-LATIN-1.
+   - `US-ASCII`: Seven-bit ASCII. Also known as ISO646-US. The Basic Latin block of the Unicode character set.
+   - `UTF-16`: Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark.
+   - `UTF-16BE`: Sixteen-bit UCS Transformation Format, big-endian byte order.
+   - `UTF-16LE`: Sixteen-bit UCS Transformation Format, little-endian byte order.
 - `headerColumnPrefix`: Supply a prefix to the Kafka headers to avoid any conflicts with columns from the payload. The default is `kafka.header.`.
   Considering the header from the example, Druid maps the headers to the following columns: `kafka.header.env`, `kafka.header.zone`.
 - `keyFormat`: Supply an input format to parse the key. Only the first value will be used.

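The encoding names listed in the patch above match the standard charset names that every Java platform supports. As a rough illustration of why the choice matters, the sketch below decodes header bytes with a named charset. The class `HeaderDecodeSketch` and its `decode` helper are hypothetical and are not Druid code; the sketch assumes only that header decoding amounts to a charset-based bytes-to-string conversion, which this commit does not show.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HeaderDecodeSketch {
    // Hypothetical helper (not part of Druid): turn raw header bytes into a
    // string using the charset with the given name.
    static String decode(byte[] headerValue, String encoding) {
        return new String(headerValue, Charset.forName(encoding));
    }

    public static void main(String[] args) {
        // "prod-" followed by U+00E9, written as an escape to stay ASCII-safe.
        String original = "prod-\u00e9";

        // Each of these charsets can represent the sample text, so encoding
        // and decoding with the same name restores the original string.
        String[] names = {"UTF-8", "ISO-8859-1", "UTF-16", "UTF-16BE", "UTF-16LE"};
        for (String name : names) {
            byte[] raw = original.getBytes(Charset.forName(name));
            boolean roundTrip = original.equals(decode(raw, name));
            System.out.println(name + ": " + raw.length + " bytes, round trip ok = " + roundTrip);
        }

        // Decoding with a charset other than the one the producer used gives
        // different text: the two UTF-8 bytes of U+00E9 become two characters
        // when read as ISO-8859-1.
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println("mismatch length = " + decode(utf8, "ISO-8859-1").length());
    }
}
```

The takeaway is that the configured header encoding has to match what the producer used when it wrote the header bytes; otherwise the header columns silently contain different text.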