Posted to dev@flume.apache.org by "Edward Sargisson (JIRA)" <ji...@apache.org> on 2013/06/18 18:16:20 UTC

[jira] [Commented] (FLUME-2089) ElasticSearchSink raises YAMLException when event body has unexpected encoding.

    [ https://issues.apache.org/jira/browse/FLUME-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686878#comment-13686878 ] 

Edward Sargisson commented on FLUME-2089:
-----------------------------------------

Hi Allan,
Thank you for the patch - I'm really glad we've got it.

A question for you: is this not mostly a character set encoding issue? You know your data, which is why I'm asking.
That is, ContentBuilderUtil.addSimpleField() uses the platform's default character set, which may not be appropriate for all data.
I'm wondering whether we need a parameter on the ElasticSearchSink to allow the character set to be configured.
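
To illustrate the concern, here is a minimal standalone sketch. It is not Flume code: the class and helper names are hypothetical, and the sample body is just "foo" followed by the 0xFC byte from the stack trace. It assumes Java 7+ for StandardCharsets. The point is that the same bytes are rejected by a strict UTF-8 decoder but decode cleanly as ISO-8859-1, while new String(byte[]) quietly depends on whatever the platform default happens to be.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Illustration only; not part of Flume. All names here are hypothetical.
final class BodyCharsetSketch {

    // Decode strictly: malformed bytes raise an error instead of being
    // silently replaced with U+FFFD.
    static String decodeStrict(byte[] body, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(body))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(
                    "body is not valid " + charset.name(), e);
        }
    }

    // True if the body decodes without error in the given charset.
    static boolean isValid(byte[] body, Charset charset) {
        try {
            decodeStrict(body, charset);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 0xFC is the byte the stack trace rejects as a UTF-8 start byte.
        byte[] body = {'f', 'o', 'o', (byte) 0xFC};

        System.out.println("valid UTF-8:      "
                + isValid(body, StandardCharsets.UTF_8));
        System.out.println("valid ISO-8859-1: "
                + isValid(body, StandardCharsets.ISO_8859_1));

        // No charset named: the result varies with the JVM's default.
        System.out.println("platform default: " + Charset.defaultCharset());
        System.out.println("same as ISO-8859-1 decode: "
                + new String(body).equals(
                        decodeStrict(body, StandardCharsets.ISO_8859_1)));
    }
}
```

If the sink took a charset name as a parameter, something like decodeStrict() with Charset.forName(configuredName) would be one way to apply it, though where that belongs in the serializer is an open question.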

Your thoughts?

Secondly, a quibble: the upgrade to Elasticsearch 0.90.1 is tracked in FLUME-2049.

Lastly, once you're happy, perhaps you could post your patch to Review Board at reviews.apache.org and set the Group to Flume. That will move it along.

Thanks again!
                
> ElasticSearchSink raises YAMLException when event body has unexpected encoding.
> -------------------------------------------------------------------------------
>
>                 Key: FLUME-2089
>                 URL: https://issues.apache.org/jira/browse/FLUME-2089
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.4.0, v1.3.1
>            Reporter: Edward Sargisson
>
> Detected by Allan Feid and documented on the user list http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%3CCAN94UWe6UvcOKT1S%2BXANC-sy0qFsxet3RJY9PVkj-eSfO5fk6Q%40mail.gmail.com%3E
> Steps:
> Send an event with the body as follows:
> foo¤data¤1371126476.436¤0.005¤555¤10.1.1.1¤HTTP/1.1¤GET¤http¤vhost¤/path/url¤¤-¤200¤
> referrer.com/search/?query=\x8D\x91\x89\xEF\x8Bc\x8E\x96\x93\xB0¤-¤-¤-
> Expected Results:
> The event is stored in elasticsearch.
> Actual Results:
> >> 10 Jun 2013 09:52:34,360 ERROR
> >> [SinkRunner-PollingRunner-DefaultSinkProcessor]
> >> (org.apache.flume.SinkRunner$PollingRunner.run:160)  - Unable to deliver
> >> event. Exception follows.
> >> org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.error.YAMLException:
> >> java.io.CharConversionException: Invalid UTF-8 start byte 0xfc (at char
> >> #81, byte #-1)
> >>  at
> >> org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.update(StreamReader.java:198)
> >> at
> >> org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.<init>(StreamReader.java:62)
> >>  at
> >> org.elasticsearch.common.jackson.dataformat.yaml.YAMLParser.<init>(YAMLParser.java:147)
> >> at
> >> org.elasticsearch.common.jackson.dataformat.yaml.YAMLFactory._createParser(YAMLFactory.java:530)
> >>  at
> >> org.elasticsearch.common.jackson.dataformat.yaml.YAMLFactory.createJsonParser(YAMLFactory.java:420)
> >> at
> >> org.elasticsearch.common.xcontent.yaml.YamlXContent.createParser(YamlXContent.java:83)
> >>  at
> >> org.apache.flume.sink.elasticsearch.ContentBuilderUtil.addComplexField(ContentBuilderUtil.java:61)
> >> at
> >> org.apache.flume.sink.elasticsearch.ContentBuilderUtil.appendField(ContentBuilderUtil.java:47)
> >>  at
> >> org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.appendBody(ElasticSearchLogStashEventSerializer.java:87)
> >> at
> >> org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.getContentBuilder(ElasticSearchLogStashEventSerializer.java:79)
> >>  at
> >> org.apache.flume.sink.elasticsearch.ElasticSearchSink.process(ElasticSearchSink.java:178)
> >> at
> >> org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> >>  at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> >> at java.lang.Thread.run(Thread.java:662)
> >> Caused by: java.io.CharConversionException: Invalid UTF-8 start byte 0xfc
> >> (at char #81, byte #-1)
> >> at
> >> org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.reportInvalidInitial(UTF8Reader.java:395)
> >>  at
> >> org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.read(UTF8Reader.java:247)
> >> at
> >> org.elasticsearch.common.jackson.dataformat.yaml.UTF8Reader.read(UTF8Reader.java:157)
> >>  at
> >> org.elasticsearch.common.jackson.dataformat.yaml.snakeyaml.reader.StreamReader.update(StreamReader.java:182)
> >> ... 13 more
