Posted to user@pig.apache.org by Russell Jurney <ru...@gmail.com> on 2012/01/01 04:50:53 UTC

Re: Pig and MongoDB

I fixed MongoStorage.java to work with bags of tuples.
https://gist.github.com/1546174

I'll open a pull request on GitHub tomorrow and try to get it into the next
release. In the meantime, replacing MongoStorage.java with that gist and
rebuilding should work.
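
For anyone wondering what "work with bags of tuples" involves, here is a
minimal sketch of the general idea, not the code in the gist. The BSON encoder
in the trace below fails on a Pig tuple nested inside a bag, so one plausible
approach is to recursively turn bags and tuples into plain Java lists before
handing them to the writer. SketchTuple, SketchBag, and BagFlattener are
hypothetical stand-ins made up for illustration; they are not Pig or MongoDB
classes, and a real store function would likely use field names from the Pig
schema rather than positional lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Pig tuple: an ordered list of fields.
final class SketchTuple {
    final List<Object> fields;

    SketchTuple(Object... fields) {
        this.fields = Arrays.asList(fields);
    }
}

// Hypothetical stand-in for a Pig bag: a collection of tuples.
final class SketchBag {
    final List<SketchTuple> tuples;

    SketchBag(SketchTuple... tuples) {
        this.tuples = Arrays.asList(tuples);
    }
}

final class BagFlattener {
    // Recursively replace bags and tuples with plain lists, so that only
    // strings, numbers, and lists remain for a serializer to handle.
    static Object toPlain(Object value) {
        if (value instanceof SketchBag) {
            List<Object> out = new ArrayList<>();
            for (SketchTuple t : ((SketchBag) value).tuples) {
                out.add(toPlain(t));
            }
            return out;
        }
        if (value instanceof SketchTuple) {
            List<Object> out = new ArrayList<>();
            for (Object field : ((SketchTuple) value).fields) {
                out.add(toPlain(field));
            }
            return out;
        }
        // Scalars pass through unchanged.
        return value;
    }

    public static void main(String[] args) {
        // Same shape as the tuple in the log below.
        SketchTuple row = new SketchTuple("hipster", "foobar",
                new SketchBag(new SketchTuple("halloween", 21),
                              new SketchTuple("oprah", 17),
                              new SketchTuple("motorcycles", 15)));
        System.out.println(toPlain(row));
    }
}
```

The recursion is the point here: the inner tuples need the same treatment as
the outer one, otherwise the serializer still meets a type it does not know.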

High five!
Russ

On Thu, Dec 29, 2011 at 5:58 PM, Russell Jurney <ru...@gmail.com> wrote:

> For reference:
> http://groups.google.com/group/mongodb-user/browse_thread/thread/f137fe0a1a81a3cd
>
> Looks like it is not yet supported.  Ack, shabby work on my part I'm
> afraid :(
>
>
> On Thursday, December 29, 2011, Ayon Sinha <ay...@yahoo.com> wrote:
> > I'm getting IllegalArgumentException while storing a bag of tuples.
> >
> > 2011-12-29 13:28:32,576 [Thread-42] INFO  com.mongodb.hadoop.pig.MongoStorage - I: 2 tuple:(hipster,foobar,{(halloween,21),(oprah,17),(motorcycles,15)})
> > 2011-12-29 13:28:32,579 [Thread-42] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local_0003
> > java.lang.IllegalArgumentException: can't serialize class org.apache.pig.data.BinSedesTuple
> > at org.bson.BSONEncoder._putObjectField(BSONEncoder.java:205)
> > at org.bson.BSONEncoder.putIterable(BSONEncoder.java:230)
> > at org.bson.BSONEncoder._putObjectField(BSONEncoder.java:179)
> > at org.bson.BSONEncoder.putObject(BSONEncoder.java:121)
> > at org.bson.BSONEncoder.putObject(BSONEncoder.java:67)
> > at com.mongodb.DBApiLayer$MyCollection.insert(DBApiLayer.java:231)
> > at com.mongodb.DBApiLayer$MyCollection.insert(DBApiLayer.java:197)
> > at com.mongodb.DBCollection.insert(DBCollection.java:73)
> > at com.mongodb.DBCollection.save(DBCollection.java:524)
> > at com.mongodb.DBCollection.save(DBCollection.java:504)
> > at com.mongodb.hadoop.output.MongoRecordWriter.write(MongoRecordWriter.java:98)
> > at com.mongodb.hadoop.pig.MongoStorage.putNext(MongoStorage.java:77)
> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
> > at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:498)
> > at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239)
> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
> > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > 2011-12-29 13:28:37,338 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0003 has failed! Stop running all dependent jobs
> >
> > -Ayon
> >
> >
> >
> > ________________________________
> >  From: Russell Jurney <ru...@gmail.com>
> > To: user@pig.apache.org
> > Sent: Thursday, December 22, 2011 1:36 PM
> > Subject: Pig and MongoDB
> >
> > I was impressed by MongoDB's Pig integration enough to write about it.
> > Thought it might be of interest to the list:
> >
> http://datasyndrome.com/post/14631249157/mongodb-is-web-scale-hadoop-mongodb
> >
> > --
> > Russell Jurney
> > twitter.com/rjurney
> > russell.jurney@gmail.com
> > datasyndrome.com
>
> --
> Russell Jurney
> twitter.com/rjurney
> russell.jurney@gmail.com
> datasyndrome.com
>



-- 
Russell Jurney
twitter.com/rjurney
russell.jurney@gmail.com
datasyndrome.com

Re: Pig and MongoDB

Posted by Russell Jurney <ru...@gmail.com>.
This was merged into mongo-hadoop today. It isn't well tested, so buyer beware.

Russ


Re: Pig and MongoDB

Posted by Russell Jurney <ru...@gmail.com>.
Submitted a pull request: https://github.com/mongodb/mongo-hadoop/pull/29
