Posted to user@pig.apache.org by Berin Loritsch <be...@d-haven.org> on 2015/01/04 22:50:55 UTC

Schema detection problems with Pig 0.14.0

I'm trying to run through some examples from the Agile Data Science book,
but I'm running into some pretty fundamental roadblocks.  The book was
written against Pig 0.11.1, but because I'm hard-headed I'd rather start
with a more modern stack.

Whether I read from an Avro collection or from MongoDB, the only exception
I get is that Pig doesn't know what the schema is.

I've attached the Pig Latin script for reference, but it's a pretty simple
count of times one person emails another.  I can run the equivalent
map-reduce directly in MongoDB, but the goal here is to get the
infrastructure set up so I can build on the simple foundation I have and
experiment beyond the simple examples in the book.

I also have Hadoop 2.6.0 installed, and I've had to fix a number of things
in the DOS scripts just so that it could find and execute Java from the
default install path.  It's painfully obvious to me now that Pig 0.14.0 was
not built against Hadoop 2.6.0, but I have no idea what it was built
against.

BTW, there are a number of things you'll have to change in the DOS scripts to
even find hadoop-config.cmd.