Posted to user@pig.apache.org by Amit <am...@yahoo.com.INVALID> on 2015/02/02 01:50:39 UTC

> and < Comparison Operators not working

Hello,

I am trying to run an ad-hoc Pig script on the IBM Bluemix platform that has an arithmetic comparison. Suppose the data is:

----
f1
-----
10
20
30
40
..

Let us say I would like to select the records where f1 > 20. It is a pretty easy operation, however I am not sure why I cannot see the expected results. The data is initially loaded from a CSV file. Here is my Pig script:

********************************************************************************************
A = << Load from CSV file >>
B = FOREACH A generate f1;
C  = FILTER B by f1 > 20
DUMP C;
********************************************************************************************

I would appreciate it if someone could point out what I am doing wrong here.
I also tried to run this in local mode just to make sure I am doing this right.

Regards,
Amit

Re: > and < Comparison Operators not working

Posted by Amit <am...@yahoo.com.INVALID>.
Thanks for the inputs. For now I have just got rid of the quotes using REGEX and then cast the value to int.

Regards,
Amit
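
The workaround described above could look roughly like the following sketch. It assumes that the built-in REPLACE function is what was meant by REGEX, reuses the file path and field names quoted elsewhere in this thread, and loads both fields as chararray first. The relation names are illustrative only.

```pig
-- Load every field as chararray so the quoted values survive the load.
A = LOAD '/local/amit/temp/data.csv' USING PigStorage(',')
    AS (f1:chararray, name:chararray);

-- Strip the double quotes, then cast the cleaned string to int.
B = FOREACH A GENERATE (int)REPLACE(f1, '\\"', '') AS f1,
                       REPLACE(name, '\\"', '') AS name;

-- The comparison now runs on real integers.
C = FILTER B BY f1 > 20;
DUMP C;
```

This only removes the quote characters; a quoted field that itself contains a comma would still be split by PigStorage.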

     On Tuesday, February 3, 2015 12:41 AM, Arvind S <ar...@gmail.com> wrote:
   

 If you have quoted CSV .. Try using  CSVExcelStorage() loader

Cheers !!!
Arvind

   

Re: > and < Comparison Operators not working

Posted by Arvind S <ar...@gmail.com>.
If you have a quoted CSV, try using the CSVExcelStorage() loader.

Cheers !!!
Arvind
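
A minimal sketch of this suggestion, assuming the loader meant here is Piggybank's org.apache.pig.piggybank.storage.CSVExcelStorage and that piggybank.jar sits at the hypothetical path shown. The data path and schema are borrowed from the script quoted elsewhere in this thread.

```pig
-- Hypothetical jar location; adjust to wherever piggybank.jar lives.
REGISTER /path/to/piggybank.jar;

-- CSVExcelStorage parses quoted fields, so the quotes are not part of the value.
A = LOAD '/local/amit/temp/data.csv'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage(',')
    AS (f1:int, name:chararray);

C = FILTER A BY f1 > 20;
DUMP C;
```

Compared with stripping quotes by hand, this keeps the quote handling in the loader, which also copes with delimiters inside quoted fields.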
On 03-Feb-2015 2:18 am, "Amit" <am...@yahoo.com.invalid> wrote:

> Hello,
>
> It looks like this is expected behavior. I had presumed that it makes no
> difference whether or not the data comes in double quotes.
> Please refer to the Stack Overflow question "Convert "3" to 3 with PigLatin":
>
>     I read in a csv-file that contains fields with numbers like that: "3".
>     Can I convert this fields from "3" to 3 with PigLatin? I need it to
>     use the SUM() - Function....
>
>    Regards,
> Amit

Re: > and < Comparison Operators not working

Posted by Amit <am...@yahoo.com.INVALID>.
Hello,

It looks like this is expected behavior. I had presumed that it makes no difference whether or not the data comes in double quotes.
Please refer to the Stack Overflow question "Convert "3" to 3 with PigLatin":

    I read in a csv-file that contains fields with numbers like that: "3". Can I convert this fields from "3" to 3 with PigLatin? I need it to use the SUM() - Function....

   Regards,
Amit 

     On Monday, February 2, 2015 2:18 PM, Amit <am...@yahoo.com.INVALID> wrote:
   

 Thanks again for the responses. I have indeed tried explicit casting and using a schema.
I am now thinking that it has something to do with the integer value coming within double quotes in the CSV format ("10","Amit" ...).
With double quotes: 1) I tried the script below, and PigStorage could not load f1.

A = LOAD '/local/amit/temp/data.csv' using PigStorage(',') AS (f1:int,name:chararray);
DUMP A;

Without double quotes, i.e. a CSV with data like (10,Amit etc.), the above Pig script works as expected and also filters out the required rows as expected.
Do you know of any existing issues with PigStorage loading an int value that arrives as a double-quoted string in a CSV file?
I appreciate your time in responding to these questions.

Regards,
Amit



   

Re: > and < Comparison Operators not working

Posted by Amit <am...@yahoo.com.INVALID>.
Thanks again for the responses. I have indeed tried explicit casting and using a schema.
I am now thinking that it has something to do with the integer value coming within double quotes in the CSV format ("10","Amit" ...).
With double quotes: 1) I tried the script below, and PigStorage could not load f1.

A = LOAD '/local/amit/temp/data.csv' using PigStorage(',') AS (f1:int,name:chararray);
DUMP A;

Without double quotes, i.e. a CSV with data like (10,Amit etc.), the above Pig script works as expected and also filters out the required rows as expected.
Do you know of any existing issues with PigStorage loading an int value that arrives as a double-quoted string in a CSV file?
I appreciate your time in responding to these questions.

Regards,
Amit
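
One way to check this diagnosis is to look for rows where the declared int field came back null. The sketch below assumes the same file and schema as the script above; the relation name bad is illustrative. PigStorage splits on the delimiter and does not strip quotes, so a field whose text is "10" with the quotes included cannot be converted to int and is loaded as null, and a comparison against null never passes a FILTER.

```pig
A = LOAD '/local/amit/temp/data.csv' USING PigStorage(',')
    AS (f1:int, name:chararray);

-- Rows whose f1 could not be converted to int show up here with a null f1.
bad = FILTER A BY f1 IS NULL;
DUMP bad;
```

If every row lands in bad for the quoted file and none for the unquoted one, the quotes are the cause rather than the comparison operator.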

     On Monday, February 2, 2015 11:28 AM, Pradeep Gollakota <pr...@gmail.com> wrote:
   

 Explicit casting will work, though you shouldn't need to use it. You should
specify an input schema using the AS keyword. This will ensure that
PigStorage will load your data using the appropriate types.


   

Re: > and < Comparison Operators not working

Posted by Pradeep Gollakota <pr...@gmail.com>.
Explicit casting will work, though you shouldn't need to use it. You should
specify an input schema using the AS keyword. This will ensure that
PigStorage will load your data using the appropriate types.
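
As a rough illustration of this advice, assuming a two-column, unquoted CSV and the field names used elsewhere in this thread:

```pig
-- Declare the types at load time with AS; no cast is needed later.
A = LOAD 'data' USING PigStorage(',') AS (f1:int, name:chararray);

-- DESCRIBE prints the schema Pig has attached to the relation.
DESCRIBE A;

C = FILTER A BY f1 > 20;
DUMP C;
```

Note that the schema only tells Pig how to interpret each field; it does not remove quote characters from the input.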

On Mon, Feb 2, 2015 at 7:22 AM, Arvind S <ar...@gmail.com> wrote:

> Use explicit casting during comparison
>
> Cheers !!!
> Arvind

Re: > and < Comparison Operators not working

Posted by Arvind S <ar...@gmail.com>.
Use explicit casting during comparison

Cheers !!!
Arvind
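
A short sketch of what an explicit cast in the comparison might look like, assuming the data is loaded without a schema (so fields are addressed by position) and that the first column holds the number:

```pig
A = LOAD 'data' USING PigStorage(',');

-- No schema was declared, so fields are bytearray and addressed by position.
B = FOREACH A GENERATE $0 AS f1;

-- Cast explicitly so the comparison is numeric.
C = FILTER B BY (int)f1 > 20;
DUMP C;
```

If the field text still contains double quotes, the cast yields null and the row is filtered out, so the cast alone does not help with a quoted CSV.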
On 02-Feb-2015 8:39 pm, "Amit" <am...@yahoo.com.invalid> wrote:

> Thanks for the response. The Pig script as such does not fail; it runs
> successfully (trying in local mode). However, when the run is finished it
> does not dump any tuples. Does it have something to do with the CSV, where
> f1 is stored as a string? The CSV data looks like this:
>
> *********************************************
> "10","abc"
> "20","xyz"
> "30","lmn"
> ... etc
> ***********************************************
>
> Thanks,
> Amit

Re: > and < Comparison Operators not working

Posted by Amit <am...@yahoo.com.INVALID>.
Thanks for the response. The Pig script as such does not fail; it runs successfully (trying in local mode). However, when the run is finished it does not dump any tuples. Does it have something to do with the CSV, where f1 is stored as a string? The CSV data looks like this:

*********************************************
"10","abc"
"20","xyz"
"30","lmn"
... etc
***********************************************

Thanks,
Amit

     On Monday, February 2, 2015 3:37 AM, Pradeep Gollakota <pr...@gmail.com> wrote:
   

 Just to clarify, do you have a semicolon after f1 > 20?

A = LOAD 'data' USING PigStorage(',');
B = FOREACH A GENERATE f1;
C = FILTER B BY f1 > 20;
DUMP C;

This should be correct.

   

Re: > and < Comparison Operators not working

Posted by Pradeep Gollakota <pr...@gmail.com>.
Just to clarify, do you have a semicolon after f1 > 20?

A = LOAD 'data' USING PigStorage(',');
B = FOREACH A GENERATE f1;
C = FILTER B BY f1 > 20;
DUMP C;

This should be correct.

On Sun, Feb 1, 2015 at 4:50 PM, Amit <am...@yahoo.com.invalid> wrote:

> Hello,
>
> I am trying to run an ad-hoc Pig script on the IBM Bluemix platform that
> has an arithmetic comparison. Suppose the data is:
>
> ----
> f1
> -----
> 10
> 20
> 30
> 40
> ..
>
> Let us say I would like to select the records where f1 > 20. It is a pretty
> easy operation, however I am not sure why I cannot see the expected results.
> The data is initially loaded from a CSV file. Here is my Pig script:
>
> ********************************************************************************************
> A = << Load from CSV file >>
> B = FOREACH A generate f1;
> C  = FILTER B by f1 > 20
> DUMP C;
> ********************************************************************************************
>
> I would appreciate it if someone could point out what I am doing wrong here.
> I also tried to run this in local mode just to make sure I am doing this
> right.
>
> Regards,
> Amit