You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Lin Guo <gu...@yahoo.com> on 2010/12/09 10:26:33 UTC

Review Request: Update pig parser to allow function arguments to contain multiple lines

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/154/
-----------------------------------------------------------

Review request for pig.


Summary
-------

Summary:

We want to extend pig parser to allow function arguments to contain multiple lines as well as string list, like

STORE data INTO 'testOut' 
USING storage.avro.AvroStorage (
'{"debug": 5,
  "data": "/user/lguo/testOut/ComponentActTracking4/part-m-00000.avro",
  "field0": "int",
  "field1": "def:browser_id",
  "field3": "def:act_content" } '
);


This addresses bug PIG-1749.
    https://issues.apache.org/jira/browse/PIG-1749


Diffs
-----

  trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt 1040872 
  trunk/test/org/apache/pig/test/TestParamSubPreproc.java 1040872 

Diff: https://reviews.apache.org/r/154/diff


Testing
-------

     [exec] 
     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec] 
     [exec] 


Thanks,

Lin


Re: Review Request: Update pig parser to allow function arguments to contain multiple lines

Posted by Lin Guo <gu...@yahoo.com>.

> On 2010-12-10 10:40:46, Daniel Dai wrote:
> > trunk/test/org/apache/pig/test/TestParamSubPreproc.java, line 1387
> > <https://reviews.apache.org/r/154/diff/1/?file=1105#file1105line1387>
> >
> >     Can you add a test case to test QueryParser itself?

I will add more test cases to TestQueryParser. 


> On 2010-12-10 10:40:46, Daniel Dai wrote:
> > trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt, line 1417
> > <https://reviews.apache.org/r/154/diff/1/?file=1104#file1104line1417>
> >
> >     I prefer to allow multiple line support to all strings. Only allow function arguments taking multiple lines seems confusing to the user.

Is it okay to have all other stuffs (e.g. paths, filenames) contain multiple lines? It doesn't seem very intuitive to me. 

In the parser definition, only "define clause" and "function argument" use StringList and I have changed "function arguments" to use MultiLinedStringList. 
I am not sure whether there is any side effect of updating "define clause" to take MultiLinedStringList. Any insights here? 


> On 2010-12-10 10:40:46, Daniel Dai wrote:
> > trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt, line 969
> > <https://reviews.apache.org/r/154/diff/1/?file=1104#file1104line969>
> >
> >     The QueryParser will go away in 0.9 release. We will switch the parser (https://issues.apache.org/jira/browse/PIG-1618) in couple of months. So shall we wait for the new parser?

Thanks a lot for your comments. 

Will the new QueryParser allow multi-lines in function arguments? If not, can we add this as a requirement? 


- Lin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/154/#review66
-----------------------------------------------------------


On 2010-12-09 01:26:33, Lin Guo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/154/
> -----------------------------------------------------------
> 
> (Updated 2010-12-09 01:26:33)
> 
> 
> Review request for pig.
> 
> 
> Summary
> -------
> 
> Summary:
> 
> We want to extend pig parser to allow function arguments to contain multiple lines as well as string list, like
> 
> STORE data INTO 'testOut' 
> USING storage.avro.AvroStorage (
> '{"debug": 5,
>   "data": "/user/lguo/testOut/ComponentActTracking4/part-m-00000.avro",
>   "field0": "int",
>   "field1": "def:browser_id",
>   "field3": "def:act_content" } '
> );
> 
> 
> This addresses bug PIG-1749.
>     https://issues.apache.org/jira/browse/PIG-1749
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt 1040872 
>   trunk/test/org/apache/pig/test/TestParamSubPreproc.java 1040872 
> 
> Diff: https://reviews.apache.org/r/154/diff
> 
> 
> Testing
> -------
> 
>      [exec] 
>      [exec] +1 overall.  
>      [exec] 
>      [exec]     +1 @author.  The patch does not contain any @author tags.
>      [exec] 
>      [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
>      [exec] 
>      [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
>      [exec] 
>      [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
>      [exec] 
>      [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
>      [exec] 
>      [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
>      [exec] 
>      [exec] 
> 
> 
> Thanks,
> 
> Lin
> 
>


Re: Review Request: Update pig parser to allow function arguments to contain multiple lines

Posted by Daniel Dai <ji...@yahoo-inc.com>.
I am fine commit to the trunk for now, but notice:
1. We will switch the parser before 0.9 release, so the patch will go 
away in 0.9 anyway
2. No document for now, use it "secretly"

Daniel

Dmitriy Ryaboy wrote:
>   
>> On 2010-12-10 10:40:46, Daniel Dai wrote:
>>     
>>> trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt, line 969
>>> <https://reviews.apache.org/r/154/diff/1/?file=1104#file1104line969>
>>>
>>>     The QueryParser will go away in 0.9 release. We will switch the parser (https://issues.apache.org/jira/browse/PIG-1618) in couple of months. So shall we wait for the new parser?
>>>       
>> Lin Guo wrote:
>>     Thanks a lot for your comments. 
>>     
>>     Will the new QueryParser allow multi-lines in function arguments? If not, can we add this as a requirement?
>>
>> Daniel Dai wrote:
>>     Yes, feel free to comment on that Jira
>>     
>
> I think it's worth having this change committed in the meantime, since it's something that can be easily backported into builds of 0.8 that are going to be used for a while. We'll use it, and I expect Cloudera will be happy to roll something like this in as well.
>
>
> - Dmitriy
>
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/154/#review66
> -----------------------------------------------------------
>
>
> On 2010-12-09 01:26:33, Lin Guo wrote:
>   
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/154/
>> -----------------------------------------------------------
>>
>> (Updated 2010-12-09 01:26:33)
>>
>>
>> Review request for pig.
>>
>>
>> Summary
>> -------
>>
>> Summary:
>>
>> We want to extend pig parser to allow function arguments to contain multiple lines as well as string list, like
>>
>> STORE data INTO 'testOut' 
>> USING storage.avro.AvroStorage (
>> '{"debug": 5,
>>   "data": "/user/lguo/testOut/ComponentActTracking4/part-m-00000.avro",
>>   "field0": "int",
>>   "field1": "def:browser_id",
>>   "field3": "def:act_content" } '
>> );
>>
>>
>> This addresses bug PIG-1749.
>>     https://issues.apache.org/jira/browse/PIG-1749
>>
>>
>> Diffs
>> -----
>>
>>   trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt 1040872 
>>   trunk/test/org/apache/pig/test/TestParamSubPreproc.java 1040872 
>>
>> Diff: https://reviews.apache.org/r/154/diff
>>
>>
>> Testing
>> -------
>>
>>      [exec] 
>>      [exec] +1 overall.  
>>      [exec] 
>>      [exec]     +1 @author.  The patch does not contain any @author tags.
>>      [exec] 
>>      [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
>>      [exec] 
>>      [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
>>      [exec] 
>>      [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
>>      [exec] 
>>      [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
>>      [exec] 
>>      [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
>>      [exec] 
>>      [exec] 
>>
>>
>> Thanks,
>>
>> Lin
>>
>>
>>     
>
>   


Re: Review Request: Update pig parser to allow function arguments to contain multiple lines

Posted by Dmitriy Ryaboy <dv...@gmail.com>.

> On 2010-12-10 10:40:46, Daniel Dai wrote:
> > trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt, line 969
> > <https://reviews.apache.org/r/154/diff/1/?file=1104#file1104line969>
> >
> >     The QueryParser will go away in 0.9 release. We will switch the parser (https://issues.apache.org/jira/browse/PIG-1618) in couple of months. So shall we wait for the new parser?
> 
> Lin Guo wrote:
>     Thanks a lot for your comments. 
>     
>     Will the new QueryParser allow multi-lines in function arguments? If not, can we add this as a requirement?
> 
> Daniel Dai wrote:
>     Yes, feel free to comment on that Jira

I think it's worth having this change committed in the meantime, since it's something that can be easily backported into builds of 0.8 that are going to be used for a while. We'll use it, and I expect Cloudera will be happy to roll something like this in as well.


- Dmitriy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/154/#review66
-----------------------------------------------------------


On 2010-12-09 01:26:33, Lin Guo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/154/
> -----------------------------------------------------------
> 
> (Updated 2010-12-09 01:26:33)
> 
> 
> Review request for pig.
> 
> 
> Summary
> -------
> 
> Summary:
> 
> We want to extend pig parser to allow function arguments to contain multiple lines as well as string list, like
> 
> STORE data INTO 'testOut' 
> USING storage.avro.AvroStorage (
> '{"debug": 5,
>   "data": "/user/lguo/testOut/ComponentActTracking4/part-m-00000.avro",
>   "field0": "int",
>   "field1": "def:browser_id",
>   "field3": "def:act_content" } '
> );
> 
> 
> This addresses bug PIG-1749.
>     https://issues.apache.org/jira/browse/PIG-1749
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt 1040872 
>   trunk/test/org/apache/pig/test/TestParamSubPreproc.java 1040872 
> 
> Diff: https://reviews.apache.org/r/154/diff
> 
> 
> Testing
> -------
> 
>      [exec] 
>      [exec] +1 overall.  
>      [exec] 
>      [exec]     +1 @author.  The patch does not contain any @author tags.
>      [exec] 
>      [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
>      [exec] 
>      [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
>      [exec] 
>      [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
>      [exec] 
>      [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
>      [exec] 
>      [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
>      [exec] 
>      [exec] 
> 
> 
> Thanks,
> 
> Lin
> 
>


Re: Review Request: Update pig parser to allow function arguments to contain multiple lines

Posted by Daniel Dai <da...@gmail.com>.

> On 2010-12-10 10:40:46, Daniel Dai wrote:
> > trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt, line 969
> > <https://reviews.apache.org/r/154/diff/1/?file=1104#file1104line969>
> >
> >     The QueryParser will go away in 0.9 release. We will switch the parser (https://issues.apache.org/jira/browse/PIG-1618) in couple of months. So shall we wait for the new parser?
> 
> Lin Guo wrote:
>     Thanks a lot for your comments. 
>     
>     Will the new QueryParser allow multi-lines in function arguments? If not, can we add this as a requirement?

Yes, feel free to comment on that Jira


> On 2010-12-10 10:40:46, Daniel Dai wrote:
> > trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt, line 1417
> > <https://reviews.apache.org/r/154/diff/1/?file=1104#file1104line1417>
> >
> >     I prefer to allow multiple line support to all strings. Only allow function arguments taking multiple lines seems confusing to the user.
> 
> Lin Guo wrote:
>     Is it okay to have all other stuffs (e.g. paths, filenames) contain multiple lines? It doesn't seem very intuitive to me. 
>     
>     In the parser definition, only "define clause" and "function argument" use StringList and I have changed "function arguments" to use MultiLinedStringList. 
>     I am not sure whether there is any side effect of updating "define clause" to take MultiLinedStringList. Any insights here?

Yes, but some places requires long strings might make sense. For example, string constant, path list, etc. The main issue is it may confuse user which string can be multiline. So IMHO it might be easier to make all string multilinable.


- Daniel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/154/#review66
-----------------------------------------------------------


On 2010-12-09 01:26:33, Lin Guo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/154/
> -----------------------------------------------------------
> 
> (Updated 2010-12-09 01:26:33)
> 
> 
> Review request for pig.
> 
> 
> Summary
> -------
> 
> Summary:
> 
> We want to extend pig parser to allow function arguments to contain multiple lines as well as string list, like
> 
> STORE data INTO 'testOut' 
> USING storage.avro.AvroStorage (
> '{"debug": 5,
>   "data": "/user/lguo/testOut/ComponentActTracking4/part-m-00000.avro",
>   "field0": "int",
>   "field1": "def:browser_id",
>   "field3": "def:act_content" } '
> );
> 
> 
> This addresses bug PIG-1749.
>     https://issues.apache.org/jira/browse/PIG-1749
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt 1040872 
>   trunk/test/org/apache/pig/test/TestParamSubPreproc.java 1040872 
> 
> Diff: https://reviews.apache.org/r/154/diff
> 
> 
> Testing
> -------
> 
>      [exec] 
>      [exec] +1 overall.  
>      [exec] 
>      [exec]     +1 @author.  The patch does not contain any @author tags.
>      [exec] 
>      [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
>      [exec] 
>      [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
>      [exec] 
>      [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
>      [exec] 
>      [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
>      [exec] 
>      [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
>      [exec] 
>      [exec] 
> 
> 
> Thanks,
> 
> Lin
> 
>


Re: Review Request: Update pig parser to allow function arguments to contain multiple lines

Posted by Daniel Dai <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/154/#review66
-----------------------------------------------------------



trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt
<https://reviews.apache.org/r/154/#comment54>

    The QueryParser will go away in 0.9 release. We will switch the parser (https://issues.apache.org/jira/browse/PIG-1618) in couple of months. So shall we wait for the new parser?



trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt
<https://reviews.apache.org/r/154/#comment53>

    I prefer to allow multiple line support to all strings. Only allow function arguments taking multiple lines seems confusing to the user.



trunk/test/org/apache/pig/test/TestParamSubPreproc.java
<https://reviews.apache.org/r/154/#comment52>

    Can you add a test case to test QueryParser itself?


- Daniel


On 2010-12-09 01:26:33, Lin Guo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/154/
> -----------------------------------------------------------
> 
> (Updated 2010-12-09 01:26:33)
> 
> 
> Review request for pig.
> 
> 
> Summary
> -------
> 
> Summary:
> 
> We want to extend pig parser to allow function arguments to contain multiple lines as well as string list, like
> 
> STORE data INTO 'testOut' 
> USING storage.avro.AvroStorage (
> '{"debug": 5,
>   "data": "/user/lguo/testOut/ComponentActTracking4/part-m-00000.avro",
>   "field0": "int",
>   "field1": "def:browser_id",
>   "field3": "def:act_content" } '
> );
> 
> 
> This addresses bug PIG-1749.
>     https://issues.apache.org/jira/browse/PIG-1749
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt 1040872 
>   trunk/test/org/apache/pig/test/TestParamSubPreproc.java 1040872 
> 
> Diff: https://reviews.apache.org/r/154/diff
> 
> 
> Testing
> -------
> 
>      [exec] 
>      [exec] +1 overall.  
>      [exec] 
>      [exec]     +1 @author.  The patch does not contain any @author tags.
>      [exec] 
>      [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
>      [exec] 
>      [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
>      [exec] 
>      [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
>      [exec] 
>      [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
>      [exec] 
>      [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
>      [exec] 
>      [exec] 
> 
> 
> Thanks,
> 
> Lin
> 
>


Re: Review Request: Update pig parser to allow function arguments to contain multiple lines

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/154/#review72
-----------------------------------------------------------


So, it looks like the ANTLR parser is already in trunk. This might have to go into 0.8 after it's released, left as an integration option for those who roll their own.


trunk/test/org/apache/pig/test/TestParamSubPreproc.java
<https://reviews.apache.org/r/154/#comment65>

    Reliance on AvroStorage seems unnecessary. Use one of the engines already in trunk?


- Dmitriy


On 2010-12-09 01:26:33, Lin Guo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/154/
> -----------------------------------------------------------
> 
> (Updated 2010-12-09 01:26:33)
> 
> 
> Review request for pig.
> 
> 
> Summary
> -------
> 
> Summary:
> 
> We want to extend pig parser to allow function arguments to contain multiple lines as well as string list, like
> 
> STORE data INTO 'testOut' 
> USING storage.avro.AvroStorage (
> '{"debug": 5,
>   "data": "/user/lguo/testOut/ComponentActTracking4/part-m-00000.avro",
>   "field0": "int",
>   "field1": "def:browser_id",
>   "field3": "def:act_content" } '
> );
> 
> 
> This addresses bug PIG-1749.
>     https://issues.apache.org/jira/browse/PIG-1749
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/impl/logicalLayer/parser/QueryParser.jjt 1040872 
>   trunk/test/org/apache/pig/test/TestParamSubPreproc.java 1040872 
> 
> Diff: https://reviews.apache.org/r/154/diff
> 
> 
> Testing
> -------
> 
>      [exec] 
>      [exec] +1 overall.  
>      [exec] 
>      [exec]     +1 @author.  The patch does not contain any @author tags.
>      [exec] 
>      [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
>      [exec] 
>      [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
>      [exec] 
>      [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
>      [exec] 
>      [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
>      [exec] 
>      [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
>      [exec] 
>      [exec] 
> 
> 
> Thanks,
> 
> Lin
> 
>