Posted to dev@druid.apache.org by Gian Merlino <gi...@apache.org> on 2019/10/22 15:44:36 UTC

Custom Parser

Hey Tony,

I accidentally rejected your message to the Druid dev list about writing a
custom parser, by fat-fingering the item in the moderation queue. Sorry
about that!

You had asked about being pointed in a useful direction in terms of writing
a custom parser for a proprietary data format. You might find
https://github.com/implydata/druid-example-extension a useful example. It
is written against an older version of Druid but the same basic structure
should work on the latest version.

If that doesn't help enough, please feel free to write back! By the way, if
you join the dev list (email dev-subscribe@druid.apache.org) then your
messages will go through without the need for moderation.

Gian

Re: Custom Parser

Posted by Gian Merlino <gi...@apache.org>.
That JSON spec and module code look OK to me. Some things to verify:

1) Make sure your extension is built against the same version of Druid that
you are deploying (in the pom.xml). Using different Druid versions can
cause compatibility problems.
2) Double check that MyTypeDruidParser has a @JsonProperty("parseSpec")
ParseSpec in its constructor.
3) Maybe try walking back to a known good state for an extension. If you
use https://github.com/implydata/druid-example-extension (with the pom.xml
modified for the version of Druid that you are deploying), does the
example in the README work?
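
For point 1, a minimal sketch of the relevant pom.xml pieces might look like
the following. The druid.version property name, the druid-core artifact, and
the version value shown are assumptions for illustration only; substitute
whatever your extension actually depends on and the exact Druid version
running on your cluster.

```xml
<!-- Sketch only: property name, artifact, and version are illustrative assumptions. -->
<properties>
  <!-- Placeholder value: set this to exactly the Druid version you are deploying. -->
  <druid.version>0.16.0-incubating</druid.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.druid</groupId>
    <artifactId>druid-core</artifactId>
    <version>${druid.version}</version>
    <!-- provided: Druid supplies these classes at runtime, so they are
         compiled against but not bundled into the extension jar. -->
    <scope>provided</scope>
  </dependency>
</dependencies>
```

Keeping the version in a single property means that only one line has to
change when the cluster is upgraded.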


Re: Custom Parser

Posted by Tony Schwartz <to...@gmail.com>.
Thank you.  I've tried a lot of different things.  I'm just fighting it.
Right now, I'm stuck getting this error:
Error: HTML Error: java.lang.IllegalArgumentException: Unexpected token
(END_OBJECT), expected FIELD_NAME: missing property 'format' that is to
contain type id (for class org.apache.druid.data.input.impl.ParseSpec) at
[Source: N/A; line: -1, column: -1]

My JSON spec looks like:
{
  "type": "kafka",
  "ioConfig": {
    "type": "kafka",
    "consumerProperties": {
      "bootstrap.servers": "kafkaserver:9092"
    },
    "topic": "event"
  },
  "tuningConfig": {
    "type": "kafka"
  },
  "dataSchema": {
    "dataSource": "event",
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "HOUR"
    },
    "parser": {
      "type": "mytype",
      "parseSpec": {
        "format": "mytype",
        "timestampSpec": {
          "column": "timestamp",
          "format": "auto"
        },
        "dimensionsSpec": {
          "dimensions": []
        }
      }
    }
  }
}


My module looks like:

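import com.fasterxml.jackson.databind.Module;
import com.fasterxml.jackson.databind.jsontype.NamedType;
import com.fasterxml.jackson.databind.module.SimpleModule;
import com.google.common.collect.ImmutableList;
import com.google.inject.Binder;
import org.apache.druid.initialization.DruidModule;

import java.util.List;

public class MyTypeEventDruidParserModule implements DruidModule {
   @Override
   public List<? extends Module> getJacksonModules() {
      // Register both the parser and its parse spec under the type name "mytype".
      return ImmutableList.of(
         new SimpleModule("LighthouseEventDruidParserModule").registerSubtypes(
            new NamedType(MyTypeDruidParser.class, "mytype"),
            new NamedType(MyTypeDruidParseSpec.class, "mytype")
         )
      );
   }

   @Override
   public void configure(Binder binder) {
      // No Guice bindings are made here.
   }
}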

