Posted to common-user@hadoop.apache.org by AnilKumar B <ak...@gmail.com> on 2014/02/03 05:04:38 UTC

Re: shifting sequenceFileOutput format to Avro format

Hi Yong,

I followed your 2nd suggestion. My data format is nested (a list of maps),
so I created the .avsc below.

{"namespace": "test.avro",
 "type": "record",
 "name": "Session",
 "fields": [
   {"name":"VisitCommon", "type": {
           "type": "map", "values":"string"},
   {"name":"events",
    "type": {
    "type": "array",
    "items":{
    "name":"Event",
    "type":"map",
    "values":"string"}
    }
    }
 ]
}
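
For orientation, the shape this schema is meant to describe (one string-to-string map plus a list of such maps) can be sketched with plain JDK collections. This is only an illustration under stated assumptions: the class SessionShape, its addEvent helper, and the sample keys are hypothetical, and no Avro library is involved. The code Avro generates uses CharSequence rather than String for these values, as the error message in this thread shows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain-JDK stand-in for the record the schema describes.
// It is NOT the class that the Avro compiler generates.
final class SessionShape {
    // Corresponds to the "VisitCommon" field: a map with string values.
    final Map<String, String> visitCommon = new LinkedHashMap<>();
    // Corresponds to the "events" field: an array whose items are maps.
    final List<Map<String, String>> events = new ArrayList<>();

    // Adds one event built from alternating key/value arguments.
    void addEvent(String... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected key/value pairs");
        }
        Map<String, String> event = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            event.put(keyValues[i], keyValues[i + 1]);
        }
        events.add(event);
    }

    public static void main(String[] args) {
        SessionShape session = new SessionShape();
        session.visitCommon.put("browser", "firefox"); // sample data, made up
        session.addEvent("type", "click", "target", "button");
        System.out.println(session.visitCommon + " " + session.events);
    }
}
```

Because both fields are maps with string values, arbitrary key/value pairs fit without changing the schema, which is what makes this layout suitable for data with unpredictable keys.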

And I tried creating the corresponding classes using both the Avro tool and the
plugin, but there are a few errors in the generated Java code. What could be the
issue?

1) Error: The method deepCopy(Schema, List<Map<CharSequence,CharSequence>>)
is undefined for the type GenericData
2) I also observed some deprecated code:
 @Deprecated public java.util.Map<java.lang.CharSequence,java.lang.CharSequence> VisitCommon;

I used the Eclipse plugin as mentioned below.
http://avro.apache.org/docs/1.7.6/mr.html




Thanks & Regards,
B Anil Kumar.


On Fri, Jan 31, 2014 at 8:27 AM, AnilKumar B <ak...@gmail.com> wrote:

> Thanks Yong.
>
> Thanks & Regards,
> B Anil Kumar.
>
>
> On Fri, Jan 31, 2014 at 12:44 AM, java8964 <ja...@hotmail.com> wrote:
>
>> In Avro, you need to think about a schema that matches your data. Avro's
>> schema is very flexible and should be able to store all kinds of data.
>>
>> If you have a JSON string, you have 2 options to generate the Avro schema
>> for it:
>>
>> 1) Use "type: string" to store the whole JSON string in Avro. This will
>> be easiest, but you have to parse the data later when you use it.
>> 2) Use an Avro schema that matches your JSON data, using the matching
>> structures from Avro, like 'record, array, map' etc.
>>
>> Yong
>>
>> ------------------------------
>> Date: Fri, 31 Jan 2014 00:13:59 +0530
>> Subject: shifting sequenceFileOutput format to Avro format
>> From: akumarb2010@gmail.com
>> To: user@hadoop.apache.org
>>
>>
>> Hi,
>>
>> As of now in my jobs, I am using SequenceFileOutputFormat and I am
>> emitting custom Java objects as MR output.
>>
>> Now I am planning to emit it in Avro format. I went through a few blogs
>> but still have the following doubts.
>>
>> 1) My current custom Writable objects have a nested JSON format as
>> toString(). So when I shift to Avro format, should I just emit a JSON string
>> in Avro format, instead of the custom Writable object?
>>
>> 2) If so, how can I create the schema? My JSON string is nested and will have
>> random key/value pairs.
>>
>> 3) Or can I still emit custom objects?
>>
>>
>>
>> Thanks & Regards,
>> B Anil Kumar.
>>
>
>

RE: shifting sequenceFileOutput format to Avro format

Posted by java8964 <ja...@hotmail.com>.
Hi, Kumar:
I suggest you seek Avro help on the Avro mailing list in the future; you can register here:
http://avro.apache.org/mailing_lists.html
About your schema, you missed one "}"  in your file.
yzhang$ more test.avsc
{"namespace": "test.avro",
 "type": "record",
 "name": "Session",
 "fields": [
   {"name":"VisitCommon", "type": {"type": "map", "values":"string"}},
   {"name":"events", "type": {
        "type": "array",
        "items":{
        "name":"Event",
        "type":"map",
        "values":"string"}
        }
    }
 ]
}
yzhang$ java -jar ~/lib/avro-tools-1.7.6.jar compile schema test.avsc output/
Input files to compile:
  test.avsc
yzhang$ ls -ls output/test/avro/Session.java
16 -rw-r--r--  1 yzhang  staff  7371 Feb  4 14:05 output/test/avro/Session.java
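
A missing "}" like this one can be hard to spot by eye. As a minimal sketch, assuming only the plain JDK, a small scanner can report where the braces and brackets of a JSON text stop pairing up. The class BraceCheck and the method firstProblem are hypothetical names, not part of Avro or avro-tools, and the check covers bracket pairing only, not full JSON or Avro schema validity.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of Avro: checks that the braces and
// brackets in a JSON text (such as an .avsc file) pair up correctly.
final class BraceCheck {

    // Returns -1 if every '{' and '[' is closed by the matching '}' or ']'.
    // Otherwise returns the index at which the mismatch is detected: the
    // offending closer, or text.length() if something is still open at the end.
    static int firstProblem(String text) {
        Deque<Character> open = new ArrayDeque<>();
        boolean inString = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                 // skip the escaped character
                } else if (c == '"') {
                    inString = false;    // end of the string literal
                }
                continue;                // brackets inside strings do not count
            }
            switch (c) {
                case '"':
                    inString = true;
                    break;
                case '{':
                case '[':
                    open.push(c);
                    break;
                case '}':
                case ']':
                    if (open.isEmpty()) {
                        return i;        // closer with nothing open
                    }
                    char expected = (open.pop() == '{') ? '}' : ']';
                    if (c != expected) {
                        return i;        // wrong kind of closer
                    }
                    break;
                default:
                    break;
            }
        }
        return open.isEmpty() ? -1 : text.length();
    }

    public static void main(String[] args) {
        // Shortened stand-in for a schema whose first field lacks one "}".
        String sample =
            "{\"fields\": [ {\"type\": {\"type\": \"map\"}, {\"name\": \"events\"} ]}";
        System.out.println("first problem at index " + firstProblem(sample));
    }
}
```

Note that for a schema with this kind of mistake the mismatch is only detected at the closing "]" of the fields list, well after the field that actually lacks its brace, which is why the error is easy to overlook.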
Date: Tue, 4 Feb 2014 22:22:53 +0530
Subject: Re: shifting sequenceFileOutput format to Avro format
From: akumarb2010@gmail.com
To: user@hadoop.apache.org

I tried with different versions of avro-maven-plugin, with 1.7.5, 1.7.6 and with jdk1.7.0_45 version.
I am unable to resolve it.
Error message is as below:

[ERROR] symbol:   method deepCopy(org.apache.avro.Schema,java.util.Map<java.lang.CharSequence,java.lang.CharSequence>)
[ERROR] location: class org.apache.avro.generic.GenericData
Thanks & Regards,
B Anil Kumar.




Re: shifting sequenceFileOutput format to Avro format

Posted by AnilKumar B <ak...@gmail.com>.
I tried different versions of avro-maven-plugin (1.7.5 and 1.7.6) with
JDK 1.7.0_45.

I am unable to resolve it.

The error message is as below:
[ERROR] symbol:   method deepCopy(org.apache.avro.Schema,java.util.Map<java.lang.CharSequence,java.lang.CharSequence>)
[ERROR] location: class org.apache.avro.generic.GenericData

Thanks & Regards,
B Anil Kumar.



Re: shifting sequenceFileOutput format to Avro format

Posted by AnilKumar B <ak...@gmail.com>.
I tried with different versions of avro-maven-plugin, with 1.7.5, 1.7.6 and
with jdk1.7.0_45 version.

I am unable to resolve it.

Error message is as below:
[ERROR] symbol:   method
deepCopy(org.apache.avro.Schema,java.util.Map<java.lang.CharSequence,java.lang.CharSequence>)
[ERROR] location: class org.apache.avro.generic.GenericData

Thanks & Regards,
B Anil Kumar.


On Tue, Feb 4, 2014 at 12:06 AM, AnilKumar B <ak...@gmail.com> wrote:

> Can anyone please suggest on how to resolve this issue?
>
> Thanks & Regards,
> B Anil Kumar.
>
>
> On Mon, Feb 3, 2014 at 9:34 AM, AnilKumar B <ak...@gmail.com> wrote:
>
>> Hi Yong,
>>
>> I followed your 2nd  suggestion. My data format is is nested(list of
>> map), So I created .avsc as below.
>>
>> {"namespace": "test.avro",
>>  "type": "record",
>>  "name": "Session",
>>  "fields": [
>>    {"name":"VisitCommon", "type": {
>>            "type": "map", "values":"string"},
>>    {"name":"events",
>>     "type": {
>>     "type": "array",
>>     "items":{
>>     "name":"Event",
>>     "type":"map",
>>     "values":"string"}
>>     }
>>     }
>>  ]
>> }
>>
>> And I tried creating corresponding classes by using avro tool and with
>> plugin, but there are few errors on generated java code. What could be the
>> issue?
>>
>> 1) Error: The method deepCopy(Schema,
>> List<Map<CharSequence,CharSequence>>) is undefined for the type GenericData
>> 2) And also observed there is some deprecated code.
>>  @Deprecated public
>> java.util.Map<java.lang.CharSequence,java.lang.CharSequence> VisitCommon;
>>
>> I used eclipse plugin as mentioned below.
>> http://avro.apache.org/docs/1.7.6/mr.html
>>
>>
>>
>>
>> Thanks & Regards,
>> B Anil Kumar.
>>
>>
>> On Fri, Jan 31, 2014 at 8:27 AM, AnilKumar B <ak...@gmail.com>wrote:
>>
>>> Thanks Yong.
>>>
>>> Thanks & Regards,
>>> B Anil Kumar.
>>>
>>>
>>> On Fri, Jan 31, 2014 at 12:44 AM, java8964 <ja...@hotmail.com> wrote:
>>>
>>>> In avro, you need to think about a schema to match your data. Avor's
>>>> schema is very flexible and should be able to store all kinds of data.
>>>>
>>>> If you have a Json string, you have 2 options to generate the Avro
>>>> schema for it:
>>>>
>>>> 1) Use "type: string" to store the whole Json string into Avro. This
>>>> will be easiest, but you have to parse the data later when you use it.
>>>> 2) Use Avro schema to match your json data, using matching structure
>>>> from avro for your data, like 'record, array, map' etc.
>>>>
>>>> Yong
>>>>
>>>> ------------------------------
>>>> Date: Fri, 31 Jan 2014 00:13:59 +0530
>>>> Subject: shifting sequenceFileOutput format to Avro format
>>>> From: akumarb2010@gmail.com
>>>> To: user@hadoop.apache.org
>>>>
>>>>
>>>> Hi,
>>>>
>>>> As of now in my jobs, I am using SequenceFileOutputFormat and I am
>>>> emitting custom java objects as MR output.
>>>>
>>>> Now I am planning to emit it in avro format, I went through  few blogs
>>>> but still have following doubts.
>>>>
>>>> 1) My current custom Writable objects has nested json format as
>>>> toString(), So when I shift to avro format, should I just emit json string
>>>> in avro format, instead of writable custom object?
>>>>
>>>> 2) If so, how can I create schema? My json string is nested and will
>>>> have random key/value pairs.
>>>>
>>>> 3) Or can I still emit as custom objects?
>>>>
>>>>
>>>>
>>>> Thanks & Regards,
>>>> B Anil Kumar.
>>>>
>>>
>>>
>>
>

Re: shifting sequenceFileOutput format to Avro format

Posted by AnilKumar B <ak...@gmail.com>.
I tried with different versions of avro-maven-plugin, with 1.7.5, 1.7.6 and
with jdk1.7.0_45 version.

I am unable to resolve it.

Error message is as below:
[ERROR] symbol:   method
deepCopy(org.apache.avro.Schema,java.util.Map<java.lang.CharSequence,java.lang.CharSequence>)
[ERROR] location: class org.apache.avro.generic.GenericData

Thanks & Regards,
B Anil Kumar.


On Tue, Feb 4, 2014 at 12:06 AM, AnilKumar B <ak...@gmail.com> wrote:

> Can anyone please suggest on how to resolve this issue?
>
> Thanks & Regards,
> B Anil Kumar.
>
>
> On Mon, Feb 3, 2014 at 9:34 AM, AnilKumar B <ak...@gmail.com> wrote:
>
>> Hi Yong,
>>
>> I followed your 2nd  suggestion. My data format is is nested(list of
>> map), So I created .avsc as below.
>>
>> {"namespace": "test.avro",
>>  "type": "record",
>>  "name": "Session",
>>  "fields": [
>>    {"name":"VisitCommon", "type": {
>>            "type": "map", "values":"string"},
>>    {"name":"events",
>>     "type": {
>>     "type": "array",
>>     "items":{
>>     "name":"Event",
>>     "type":"map",
>>     "values":"string"}
>>     }
>>     }
>>  ]
>> }
>>
>> And I tried creating corresponding classes by using avro tool and with
>> plugin, but there are few errors on generated java code. What could be the
>> issue?
>>
>> 1) Error: The method deepCopy(Schema,
>> List<Map<CharSequence,CharSequence>>) is undefined for the type GenericData
>> 2) And also observed there is some deprecated code.
>>  @Deprecated public
>> java.util.Map<java.lang.CharSequence,java.lang.CharSequence> VisitCommon;
>>
>> I used eclipse plugin as mentioned below.
>> http://avro.apache.org/docs/1.7.6/mr.html
>>
>>
>>
>>
>> Thanks & Regards,
>> B Anil Kumar.
>>
>>
>> On Fri, Jan 31, 2014 at 8:27 AM, AnilKumar B <ak...@gmail.com>wrote:
>>
>>> Thanks Yong.
>>>
>>> Thanks & Regards,
>>> B Anil Kumar.
>>>
>>>
>>> On Fri, Jan 31, 2014 at 12:44 AM, java8964 <ja...@hotmail.com> wrote:
>>>
>>>> In avro, you need to think about a schema to match your data. Avor's
>>>> schema is very flexible and should be able to store all kinds of data.
>>>>
>>>> If you have a Json string, you have 2 options to generate the Avro
>>>> schema for it:
>>>>
>>>> 1) Use "type: string" to store the whole Json string into Avro. This
>>>> will be easiest, but you have to parse the data later when you use it.
>>>> 2) Use Avro schema to match your json data, using matching structure
>>>> from avro for your data, like 'record, array, map' etc.
>>>>
>>>> Yong
>>>>
>>>> ------------------------------
>>>> Date: Fri, 31 Jan 2014 00:13:59 +0530
>>>> Subject: shifting sequenceFileOutput format to Avro format
>>>> From: akumarb2010@gmail.com
>>>> To: user@hadoop.apache.org
>>>>
>>>>
>>>> Hi,
>>>>
>>>> As of now in my jobs, I am using SequenceFileOutputFormat and I am
>>>> emitting custom java objects as MR output.
>>>>
>>>> Now I am planning to emit it in avro format, I went through  few blogs
>>>> but still have following doubts.
>>>>
>>>> 1) My current custom Writable objects has nested json format as
>>>> toString(), So when I shift to avro format, should I just emit json string
>>>> in avro format, instead of writable custom object?
>>>>
>>>> 2) If so, how can I create schema? My json string is nested and will
>>>> have random key/value pairs.
>>>>
>>>> 3) Or can I still emit as custom objects?
>>>>
>>>>
>>>>
>>>> Thanks & Regards,
>>>> B Anil Kumar.
>>>>
>>>
>>>
>>
>

Re: shifting sequenceFileOutput format to Avro format

Posted by AnilKumar B <ak...@gmail.com>.
I tried with different versions of avro-maven-plugin, with 1.7.5, 1.7.6 and
with jdk1.7.0_45 version.

I am unable to resolve it.

Error message is as below:
[ERROR] symbol:   method
deepCopy(org.apache.avro.Schema,java.util.Map<java.lang.CharSequence,java.lang.CharSequence>)
[ERROR] location: class org.apache.avro.generic.GenericData

Thanks & Regards,
B Anil Kumar.


On Tue, Feb 4, 2014 at 12:06 AM, AnilKumar B <ak...@gmail.com> wrote:

> Can anyone please suggest on how to resolve this issue?
>
> Thanks & Regards,
> B Anil Kumar.
>
>
> On Mon, Feb 3, 2014 at 9:34 AM, AnilKumar B <ak...@gmail.com> wrote:
>
>> Hi Yong,
>>
>> I followed your 2nd  suggestion. My data format is is nested(list of
>> map), So I created .avsc as below.
>>
>> {"namespace": "test.avro",
>>  "type": "record",
>>  "name": "Session",
>>  "fields": [
>>    {"name":"VisitCommon", "type": {
>>            "type": "map", "values":"string"},
>>    {"name":"events",
>>     "type": {
>>     "type": "array",
>>     "items":{
>>     "name":"Event",
>>     "type":"map",
>>     "values":"string"}
>>     }
>>     }
>>  ]
>> }
>>
>> And I tried creating corresponding classes by using avro tool and with
>> plugin, but there are few errors on generated java code. What could be the
>> issue?
>>
>> 1) Error: The method deepCopy(Schema,
>> List<Map<CharSequence,CharSequence>>) is undefined for the type GenericData
>> 2) And also observed there is some deprecated code.
>>  @Deprecated public
>> java.util.Map<java.lang.CharSequence,java.lang.CharSequence> VisitCommon;
>>
>> I used eclipse plugin as mentioned below.
>> http://avro.apache.org/docs/1.7.6/mr.html
>>
>>
>>
>>
>> Thanks & Regards,
>> B Anil Kumar.
>>
>>
>> On Fri, Jan 31, 2014 at 8:27 AM, AnilKumar B <ak...@gmail.com>wrote:
>>
>>> Thanks Yong.
>>>
>>> Thanks & Regards,
>>> B Anil Kumar.
>>>
>>>
>>> On Fri, Jan 31, 2014 at 12:44 AM, java8964 <ja...@hotmail.com> wrote:
>>>
>>>> In avro, you need to think about a schema to match your data. Avor's
>>>> schema is very flexible and should be able to store all kinds of data.
>>>>
>>>> If you have a JSON string, you have 2 options to generate the Avro
>>>> schema for it:
>>>>
>>>> 1) Use "type: string" to store the whole JSON string in Avro. This
>>>> will be easiest, but you have to parse the data later when you use it.
>>>> 2) Define an Avro schema that matches your JSON data, using the matching
>>>> Avro structures for your data, like 'record, array, map' etc.
>>>>
>>>> Yong
>>>>
>>>> ------------------------------
>>>> Date: Fri, 31 Jan 2014 00:13:59 +0530
>>>> Subject: shifting sequenceFileOutput format to Avro format
>>>> From: akumarb2010@gmail.com
>>>> To: user@hadoop.apache.org
>>>>
>>>>
>>>> Hi,
>>>>
>>>> As of now in my jobs, I am using SequenceFileOutputFormat and I am
>>>> emitting custom java objects as MR output.
>>>>
>>>> Now I am planning to emit it in Avro format. I went through a few blogs
>>>> but still have the following doubts.
>>>>
>>>> 1) My current custom Writable objects have a nested JSON format as
>>>> toString(), so when I shift to Avro format, should I just emit a JSON string
>>>> in Avro format instead of the custom Writable object?
>>>>
>>>> 2) If so, how can I create the schema? My JSON string is nested and will
>>>> have random key/value pairs.
>>>>
>>>> 3) Or can I still emit as custom objects?
>>>>
>>>>
>>>>
>>>> Thanks & Regards,
>>>> B Anil Kumar.
>>>>
>>>
>>>
>>
>
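
For reference, Yong's option 1 (quoted above) could be sketched as a record
with a single string field. This is only an illustration: the record name
"SessionJson" and the field name "json" are made up here and do not come from
the thread.

```json
{"namespace": "test.avro",
 "type": "record",
 "name": "SessionJson",
 "fields": [
   {"name": "json", "type": "string"}
 ]
}
```

With this shape the writer side stays trivial, but every reader has to parse
the JSON text itself, which is the trade-off Yong describes.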

Re: shifting sequenceFileOutput format to Avro format

Posted by AnilKumar B <ak...@gmail.com>.
Can anyone please suggest how to resolve this issue?

Thanks & Regards,
B Anil Kumar.


On Mon, Feb 3, 2014 at 9:34 AM, AnilKumar B <ak...@gmail.com> wrote:

> Hi Yong,
>
> I followed your 2nd suggestion. My data format is nested (list of maps),
> so I created the .avsc below.
>
> {"namespace": "test.avro",
>  "type": "record",
>  "name": "Session",
>  "fields": [
>    {"name":"VisitCommon", "type": {
>            "type": "map", "values":"string"}},
>    {"name":"events",
>     "type": {
>     "type": "array",
>     "items":{
>     "name":"Event",
>     "type":"map",
>     "values":"string"}
>     }
>     }
>  ]
> }
>
> I tried creating the corresponding classes using both the avro tool and the
> plugin, but there are a few errors in the generated Java code. What could be
> the issue?
>
> 1) Error: The method deepCopy(Schema,
> List<Map<CharSequence,CharSequence>>) is undefined for the type GenericData
> 2) I also observed that there is some deprecated code.
>  @Deprecated public
> java.util.Map<java.lang.CharSequence,java.lang.CharSequence> VisitCommon;
>
> I used the Eclipse plugin as mentioned below.
> http://avro.apache.org/docs/1.7.6/mr.html
>
>
>
>
> Thanks & Regards,
> B Anil Kumar.
>
>
> On Fri, Jan 31, 2014 at 8:27 AM, AnilKumar B <ak...@gmail.com> wrote:
>
>> Thanks Yong.
>>
>> Thanks & Regards,
>> B Anil Kumar.
>>
>>
>> On Fri, Jan 31, 2014 at 12:44 AM, java8964 <ja...@hotmail.com> wrote:
>>
>>> In Avro, you need to think about a schema to match your data. Avro's
>>> schema is very flexible and should be able to store all kinds of data.
>>>
>>> If you have a JSON string, you have 2 options to generate the Avro
>>> schema for it:
>>>
>>> 1) Use "type: string" to store the whole JSON string in Avro. This
>>> will be easiest, but you have to parse the data later when you use it.
>>> 2) Define an Avro schema that matches your JSON data, using the matching
>>> Avro structures for your data, like 'record, array, map' etc.
>>>
>>> Yong
>>>
>>> ------------------------------
>>> Date: Fri, 31 Jan 2014 00:13:59 +0530
>>> Subject: shifting sequenceFileOutput format to Avro format
>>> From: akumarb2010@gmail.com
>>> To: user@hadoop.apache.org
>>>
>>>
>>> Hi,
>>>
>>> As of now in my jobs, I am using SequenceFileOutputFormat and I am
>>> emitting custom java objects as MR output.
>>>
>>> Now I am planning to emit it in Avro format. I went through a few blogs
>>> but still have the following doubts.
>>>
>>> 1) My current custom Writable objects have a nested JSON format as
>>> toString(), so when I shift to Avro format, should I just emit a JSON string
>>> in Avro format instead of the custom Writable object?
>>>
>>> 2) If so, how can I create the schema? My JSON string is nested and will
>>> have random key/value pairs.
>>>
>>> 3) Or can I still emit as custom objects?
>>>
>>>
>>>
>>> Thanks & Regards,
>>> B Anil Kumar.
>>>
>>
>>
>
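
A note on the schema quoted above: as originally posted, the object for the
"VisitCommon" field is missing a closing brace, which would make the text
invalid JSON. Since code generation evidently ran, this is probably just a
slip made while pasting into the mail. A minimal sketch of a well-formed
version follows, assuming the intended structure is one map of strings plus an
array of maps of strings. The "name":"Event" attribute is left out here
because map is an unnamed type in Avro.

```json
{"namespace": "test.avro",
 "type": "record",
 "name": "Session",
 "fields": [
   {"name": "VisitCommon",
    "type": {"type": "map", "values": "string"}},
   {"name": "events",
    "type": {"type": "array",
             "items": {"type": "map", "values": "string"}}}
 ]
}
```

Separately, "method undefined" errors against generated classes are often a
sign that the Avro jar on the compile classpath is a different (older) version
than the Avro compiler that generated the code. Whether that applies to this
setup is not confirmed anywhere in the thread, so treat it as something to
check rather than a diagnosis.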
