Posted to user@pig.apache.org by Zhu Wayne <zh...@gmail.com> on 2013/05/06 19:04:21 UTC

Pig 0.10 XmlLoader can't handle XML shorthand

Greetings! Has anyone encountered the same issue?

XML with explicit open and close tags, <Sellers></Sellers>, loads fine:

grunt> register /usr/lib/pig/piggybank.jar;

grunt> a = load 'sample.xml' using org.apache.pig.piggybank.storage.XMLLoader('Sellers') as (doc:chararray);

grunt> dump a;

(<Sellers>

            <Seller SellerName="Leebay-Brothers" SellerRating="3.9"
SellerPrice="3,499.99" ContactInfo="" ContactPhoneInfo=""/>

          </Sellers>)

Shorthand (self-closing) XML for <Seller/> is NOT good:

grunt> a = load 'sample.xml' using org.apache.pig.piggybank.storage.XMLLoader('Seller') as (doc:chararray);

grunt> dump a;

I got nothing here.

Re: Pig 0.10 XmlLoader can't handle XML shorthand

Posted by Zhu Wayne <zh...@gmail.com>.
Johnny,
Is there any way to get the attributes with XMLLoader()? And yes, <Seller/>
itself has no data between tags; everything is in its attributes:

<Seller SellerName="Leebay-Brothers" SellerRating="3.9"
SellerPrice="3,499.99" ContactInfo="" ContactPhoneInfo=""/>
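
One possible workaround, sketched here rather than taken from the thread: load
with the parent tag 'Sellers' (which does return the record) and pull the
attributes out of the resulting chararray with a regular expression. The Python
below only illustrates that extraction step on the sample element. The function
name seller_attributes and both patterns are hypothetical, not part of
piggybank, and wiring the logic into Pig as a UDF is left out. A regex is not a
full XML parser, so this assumes double-quoted attribute values that contain no
'>' character.

```python
import re

# name="value" pairs inside a tag (assumes double-quoted values).
ATTR_RE = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"')
# Each <Seller ...> or <Seller .../> start tag; \b keeps <Sellers> from matching.
SELLER_RE = re.compile(r'<Seller\b([^>]*?)/?>')


def seller_attributes(doc):
    """Return one dict of attributes per <Seller> start tag found in doc."""
    return [dict(ATTR_RE.findall(m.group(1))) for m in SELLER_RE.finditer(doc)]


# The record as returned by XMLLoader('Sellers') in the first message.
doc = '''<Sellers>
  <Seller SellerName="Leebay-Brothers" SellerRating="3.9"
SellerPrice="3,499.99" ContactInfo="" ContactPhoneInfo=""/>
</Sellers>'''

sellers = seller_attributes(doc)
print(sellers[0]["SellerName"])  # prints Leebay-Brothers
```

Because the pattern matches on the start tag alone, it treats <Seller .../> and
<Seller ...></Seller> the same way.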




On Tue, May 7, 2013 at 12:44 AM, Johnny Zhang <xi...@cloudera.com> wrote:

> Hi, Zhu:
> I just want to clarify your requirement. The shorthand <Seller/> means the
> element has no data, so I guess getting nothing back is normal. What would
> you expect when there is no data in the XML?
>
> Thanks,
> Johnny




Re: Pig 0.10 XmlLoader can't handle XML shorthand

Posted by Johnny Zhang <xi...@cloudera.com>.
Hi, Zhu:
I just want to clarify your requirement. The shorthand <Seller/> means the
element has no data, so I guess getting nothing back is normal. What would
you expect when there is no data in the XML?

Thanks,
Johnny

