Posted to issues@flume.apache.org by "Israel Ekpo (Jira)" <ji...@apache.org> on 2021/04/09 18:42:00 UTC

[jira] [Updated] (FLUME-1988) Add Support for Additional Deserializers for SpoolingDirectorySource

     [ https://issues.apache.org/jira/browse/FLUME-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Israel Ekpo updated FLUME-1988:
-------------------------------
    Status: Open  (was: Patch Available)

> Add Support for Additional Deserializers for SpoolingDirectorySource
> --------------------------------------------------------------------
>
>                 Key: FLUME-1988
>                 URL: https://issues.apache.org/jira/browse/FLUME-1988
>             Project: Flume
>          Issue Type: New Feature
>          Components: Docs, Sinks+Sources
>    Affects Versions: 1.4.0
>            Reporter: Israel Ekpo
>            Assignee: Israel Ekpo
>            Priority: Major
>              Labels: serializers
>         Attachments: EventDeserializerType.java, RegexDelimiterDeSerializer.java, ResettableTestStringInputStream.java, TestRegexDelimiterDeSerializer.java
>
>
> There are certain use cases for SpoolingDirectorySource where the events in the log file are not delimited with newline characters.
> Log files that contain stack traces, XML documents, or pretty-printed JSON strings typically contain multiple newline characters within each event.
> We can use alternative logic such as specific characters, strings or regular expressions to determine when the event is complete.
> Hence I am proposing the following new deserializers, based on org.apache.flume.serialization.LineDeserializer:
> # org.apache.flume.serialization.RegexDelimiterDeSerializer
> Allows the user to specify a regular expression that is a delimiter for events within the log file
> # org.apache.flume.serialization.CharSequenceDelimiterDeSerializer
> Allows the user to specify a comma-separated character sequence that is a delimiter for events within the log file.
> The user will specify the integer ASCII code of each character, and we will use the resulting sequence as the delimiter.
> For example, support for \r\n could be specified as 13,10.
> A list of codes is available at http://www.asciitable.com/
> We will also need to update the user guide with examples of how to configure and specify a custom deserializer.
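
The two delimiter styles proposed above can be illustrated with a small standalone sketch. This is not the attached RegexDelimiterDeSerializer.java and does not use Flume's deserializer interfaces; the class and method names (DelimiterSketch, delimiterFromAsciiCodes, splitEvents) and the "---" separator line are hypothetical, and the code works on an in-memory string rather than a resettable input stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flume: illustrates the two proposed delimiter styles.
final class DelimiterSketch {

    // Turns a comma-separated list of ASCII codes such as "13,10" into the delimiter string "\r\n".
    static String delimiterFromAsciiCodes(String codes) {
        StringBuilder delimiter = new StringBuilder();
        for (String code : codes.split(",")) {
            int value = Integer.parseInt(code.trim());
            if (value < 0 || value > 127) {
                throw new IllegalArgumentException("Not an ASCII code: " + value);
            }
            delimiter.append((char) value);
        }
        return delimiter.toString();
    }

    // Splits text into events wherever the delimiter pattern matches; empty events are skipped.
    static List<String> splitEvents(String text, Pattern delimiter) {
        List<String> events = new ArrayList<>();
        Matcher matcher = delimiter.matcher(text);
        int start = 0;
        while (matcher.find()) {
            if (matcher.start() == matcher.end()) {
                continue; // ignore zero-length matches so a lax pattern cannot split every character
            }
            addIfNotEmpty(events, text.substring(start, matcher.start()));
            start = matcher.end();
        }
        addIfNotEmpty(events, text.substring(start));
        return events;
    }

    private static void addIfNotEmpty(List<String> events, String event) {
        if (!event.isEmpty()) {
            events.add(event);
        }
    }

    public static void main(String[] args) {
        // Assumed log layout: each event ends with a line holding only "---",
        // so the newline inside the stack trace stays within its event.
        String log = "ERROR boom\n  at Foo.bar(Foo.java:1)\n---\nINFO ok\n---\n";
        System.out.println(splitEvents(log, Pattern.compile("\\n---\\n")));

        // Character-sequence style: "13,10" becomes CR LF, quoted so it is matched literally.
        String crlf = delimiterFromAsciiCodes("13,10");
        System.out.println(splitEvents("a\r\nb\r\n", Pattern.compile(Pattern.quote(crlf))));
    }
}
```

A real deserializer would additionally need to read incrementally from the spooled file and support mark/reset so that partially consumed events can be replayed; the sketch only shows how an event boundary could be decided.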



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
