Posted to user@flume.apache.org by Rich Midwinter <ri...@gmail.com> on 2015/07/23 20:09:16 UTC

Setting namespace for Avro schema before importing Twitter data

Hi

I'm using Flume to import Twitter data. If I generate an Avro schema from
the data, it looks like this:

{
"type" : "record",
"name" : "Doc",
"doc" : "adoc",
"fields" : [ {
 "name" : "id",
 "type" : "string"
}, {
 "name" : "user_friends_count",
 "type" : [ "int", "null" ]
}, {
 "name" : "user_location",
 "type" : [ "string", "null" ]
}, {
 "name" : "user_description",
 "type" : [ "string", "null" ]
}, {
 "name" : "user_statuses_count",
 "type" : [ "int", "null" ]
}, {
 "name" : "user_followers_count",
 "type" : [ "int", "null" ]
}, {
 "name" : "user_name",
 "type" : [ "string", "null" ]
}, {
 "name" : "user_screen_name",
 "type" : [ "string", "null" ]
}, {
 "name" : "created_at",
 "type" : [ "string", "null" ]
}, {
 "name" : "text",
 "type" : [ "string", "null" ]
}, {
 "name" : "retweet_count",
 "type" : [ "long", "null" ]
}, {
 "name" : "retweeted",
 "type" : [ "boolean", "null" ]
}, {
 "name" : "in_reply_to_user_id",
 "type" : [ "long", "null" ]
}, {
 "name" : "source",
 "type" : [ "string", "null" ]
}, {
 "name" : "in_reply_to_status_id",
 "type" : [ "long", "null" ]
}, {
 "name" : "media_url_https",
 "type" : [ "string", "null" ]
}, {
 "name" : "expanded_url",
 "type" : [ "string", "null" ]
} ]
}

Unfortunately, this schema has no namespace attribute, so the generated Java
code ends up in the default package and I can't use it as a dependency. I'd
also like to change the name value from Doc to something more relevant, like
Tweet.
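
To make the desired result concrete, here is a minimal sketch of patching the
schema by hand before running the Avro code generator. It only uses Python's
standard json module. The helper name with_namespace and the namespace
com.example.twitter are hypothetical, and the schema is trimmed to two of the
fields above. This is a manual workaround under those assumptions, not a Flume
setting.

```python
import json


def with_namespace(schema_json, namespace, name=None):
    """Return the schema text with a namespace and, optionally, a new record name."""
    schema = json.loads(schema_json)
    schema["namespace"] = namespace  # Avro codegen uses this as the Java package
    if name is not None:
        schema["name"] = name  # e.g. rename Doc to Tweet
    return json.dumps(schema, indent=2)


# Shortened copy of the generated schema (only two fields kept for brevity).
generated = """{
  "type": "record",
  "name": "Doc",
  "doc": "adoc",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "text", "type": ["string", "null"]}
  ]
}"""

# "com.example.twitter" is a placeholder namespace, not something Flume provides.
patched = with_namespace(generated, "com.example.twitter", name="Tweet")
print(patched)
```

Classes generated from the patched schema would land in the package named by
the namespace rather than the default package. Data already written with the
original schema still carries the old record name, so compatibility with it
would need to be checked separately.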

Does anyone know how I can set a namespace for Flume to use?

Thanks
Rich