Posted to dev@flume.apache.org by "Pavel Zalunin (JIRA)" <ji...@apache.org> on 2014/09/24 23:14:35 UTC

[jira] [Created] (FLUME-2476) ContentBuilderUtil.appendField incorrectly manages json-like data

Pavel Zalunin created FLUME-2476:
------------------------------------

             Summary: ContentBuilderUtil.appendField incorrectly manages json-like data
                 Key: FLUME-2476
                 URL: https://issues.apache.org/jira/browse/FLUME-2476
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.5.0.1
         Environment: elasticsearch 1.1.0, elasticsearch 0.90.13
            Reporter: Pavel Zalunin
            Priority: Blocker


org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer.getContentBuilder returns an incorrect value when the Event body contains well-formed JSON (possibly as a substring).

Example:
{code}
ElasticSearchDynamicSerializer ser = new ElasticSearchDynamicSerializer();
Event event = new SimpleEvent();
event.setBody("{\"true\":\"false\"}".getBytes());
System.out.println(ser.getContentBuilder(event).string());
//prints:
//{"body":"org.elasticsearch.common.xcontent.XContentBuilder@31a5fdb9"}
{code} 

I tried to find the origin of the problem and found this chunk of code (flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java):
{code}
public static void addComplexField(XContentBuilder builder, String fieldName,
      XContentType contentType, byte[] data) throws IOException {
    XContentParser parser = null;
    try {
      XContentBuilder tmp = jsonBuilder();
      parser = XContentFactory.xContent(contentType).createParser(data);
      parser.nextToken();
      tmp.copyCurrentStructure(parser);
      // There is no field(String, XContentBuilder) overload, so this call does not
      // nest tmp's content; judging by the output above, tmp.toString() is written.
      // Maybe tmp.string() should be passed here instead?
      builder.field(fieldName, tmp);
{code}

This makes it impossible to send any string that contains JSON to the Elasticsearch sink using this serializer.
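
To illustrate the suspected mechanism without the Flume or Elasticsearch dependencies, here is a minimal sketch in plain Java. MiniBuilder and OverloadDemo are hypothetical stand-ins written for this report, not the real XContentBuilder API; the only assumption is that the value ends up in a generic Object overload that serializes it via toString().

```java
// Hypothetical stand-in for a JSON builder; NOT the Elasticsearch XContentBuilder API.
final class MiniBuilder {
    private final StringBuilder out = new StringBuilder();

    // Generic overload: a value with no dedicated overload lands here and is
    // written through String.valueOf, i.e. Object.toString() for a builder.
    MiniBuilder field(String name, Object value) {
        out.append("{\"").append(name).append("\":\"")
           .append(escape(String.valueOf(value))).append("\"}");
        return this;
    }

    // Appends already-serialized JSON unchanged.
    MiniBuilder raw(String json) {
        out.append(json);
        return this;
    }

    // Returns the accumulated JSON text (toString() is deliberately not overridden).
    String string() {
        return out.toString();
    }

    // Escapes only backslashes and double quotes; enough for this sketch.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}

final class OverloadDemo {
    // Mirrors the problematic call: the builder object itself is passed as the value.
    static String nestViaToString(String json) {
        MiniBuilder tmp = new MiniBuilder().raw(json);
        return new MiniBuilder().field("body", tmp).string();
    }

    // Mirrors the suggested change: the builder's serialized text is passed instead.
    static String nestViaString(String json) {
        MiniBuilder tmp = new MiniBuilder().raw(json);
        return new MiniBuilder().field("body", tmp.string()).string();
    }

    public static void main(String[] args) {
        String json = "{\"true\":\"false\"}";
        System.out.println(nestViaToString(json)); // object identity string, not the JSON
        System.out.println(nestViaString(json));   // the JSON text as an escaped string value
    }
}
```

Note that in this sketch passing string() stores the JSON as an escaped string value rather than as a nested object, so it keeps the data but not the structure.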

Maybe I'm wrong, but what is the benefit of decoding and then re-encoding a chunk of JSON data?

If it is really needed, maybe an option could be added to ElasticSearchDynamicSerializer that forces the use of ContentBuilderUtil.addSimpleField instead of ContentBuilderUtil.appendField?
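
A rough sketch of what such an option could look like, again in plain Java with hypothetical names (FieldAppender, forceSimple, looksLikeJson). It assumes the current behavior chooses between a structured path and a plain-string path depending on the body, and it does not reflect the actual Flume code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a "force simple field" option; not Flume's ContentBuilderUtil.
final class FieldAppender {

    // Builds a one-field JSON object. With forceSimple set, the body is always
    // written as an escaped string, even if it looks like JSON.
    static String append(String name, byte[] data, boolean forceSimple) {
        String body = new String(data, StandardCharsets.UTF_8);
        if (!forceSimple && looksLikeJson(body)) {
            // Structured path: embed the body as a nested object.
            return "{\"" + name + "\":" + body + "}";
        }
        // Simple path: treat the body as an opaque string.
        return "{\"" + name + "\":\"" + escape(body) + "\"}";
    }

    // Crude heuristic used only for this sketch; a real check would parse the body.
    private static boolean looksLikeJson(String s) {
        String t = s.trim();
        return t.startsWith("{") && t.endsWith("}");
    }

    // Escapes only backslashes and double quotes; control characters are not handled.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

With forceSimple set to true, a body containing JSON would reach the sink as an ordinary string field; with it set to false, the sketch keeps the structured path.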

I can volunteer to create a patch once it is decided what the best way to avoid this issue is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)