Posted to dev@pig.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2013/08/02 21:29:49 UTC

[jira] [Commented] (PIG-3359) Register Statements and Param Substitution in Macros

    [ https://issues.apache.org/jira/browse/PIG-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727988#comment-13727988 ] 

Cheolsoo Park commented on PIG-3359:
------------------------------------

All the unit tests pass. I also tested with some production scripts and verified that "pig -dryrun" generates the same output. Awesome!

1. The only difference that I see is that a lot more warnings are printed when there are many macro files. For example,
{code}
2013-08-02 19:05:46,158 [main] WARN  org.apache.pig.tools.parameters.PreprocessorContext - Warning : Multiple values found for GCI_SOURCE_NETFLIX_SEASONS. Using value 1
2013-08-02 19:05:46,158 [main] WARN  org.apache.pig.tools.parameters.PreprocessorContext - Warning : Multiple values found for GCI_ATTRIBUTE_ORIGINAL_COUNTRY. Using value 1
2013-08-02 19:05:46,158 [main] WARN  org.apache.pig.tools.parameters.PreprocessorContext - Warning : Multiple values found for GCI_ATTRIBUTE_SEASON_SEQUENCE_NBR. Using value 5
2013-08-02 19:05:46,158 [main] WARN  org.apache.pig.tools.parameters.PreprocessorContext - Warning : Multiple values found for GCI_ATTRIBUTE_RELEASE_YEAR. Using value 6
2013-08-02 19:05:46,158 [main] WARN  org.apache.pig.tools.parameters.PreprocessorContext - Warning : Multiple values found for GCI_ATTRIBUTE_EPISODE_COUNT. Using value 8
2013-08-02 19:05:46,158 [main] WARN  org.apache.pig.tools.parameters.PreprocessorContext - Warning : Multiple values found for GCI_ATTRIBUTE_TITLE_TYPE. Using value 20
{code}
This makes sense because the params are loaded again every time a macro file is imported. Can you please lower the log level to debug for these messages? At the warn level they may unnecessarily scare the user.
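To illustrate the request, here is a minimal sketch of the idea, not the actual patch. It uses java.util.logging and a hypothetical ParamTable class instead of Pig's PreprocessorContext and its logger, and it assumes that the most recently loaded value wins. A repeated parameter is still recorded, but it is reported at FINE (the debug-level equivalent) rather than as a warning:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a parameter store; this is NOT Pig's PreprocessorContext.
class ParamTable {
    private static final Logger LOG = Logger.getLogger(ParamTable.class.getName());

    private final Map<String, String> params = new LinkedHashMap<>();
    private int duplicates = 0;

    // Stores the value. A name that was already defined is reported at FINE
    // (debug) instead of WARNING. Assumption: the latest value replaces the old one.
    void put(String name, String value) {
        if (params.containsKey(name)) {
            duplicates++;
            LOG.log(Level.FINE, "Multiple values found for {0}. Using value {1}",
                    new Object[] {name, value});
        }
        params.put(name, value);
    }

    // Returns the current value for the name, or null if it was never defined.
    String get(String name) {
        return params.get(name);
    }

    // Number of times an already-defined name was loaded again.
    int duplicateCount() {
        return duplicates;
    }
}
```

With the default logging configuration, loading the same param file once per imported macro file would then print nothing, while the duplicates stay visible to anyone who turns on debug logging.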

2. Can you update the doc? I think you can simply remove the following lines from the [macro section|http://pig.apache.org/docs/r0.11.0/cont.html#macros]:
{code}
- Macros can only contain Pig Latin statements. The REGISTER statement is not supported. The shell commands (used with Grunt) are not supported.
- Parameter substitution cannot be used inside of macros. Parameters should be explicitly passed to macros and parameter substitution used only at the top level.
{code}
You can add your examples too, but I will let you decide on that. :-)

I will commit this as soon as you update your patch. Thanks a lot! 
                
> Register Statements and Param Substitution in Macros
> ----------------------------------------------------
>
>                 Key: PIG-3359
>                 URL: https://issues.apache.org/jira/browse/PIG-3359
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>            Reporter: Jonathan Packer
>            Assignee: Jonathan Packer
>         Attachments: PIG-3359_test.tar.gz, PIG-3359-v1.diff, PIG-3359-v2.diff, PIG-3359-v3.diff, PIG-3359-v3-test-failures.txt, PIG-3359-v4.diff, PIG-3359-v5.diff, PIG-3359-v6.diff
>
>
> There are some gaps in the functionality of macros that I've made a patch to address. The goal is to provide everything you'd need to make reusable algorithm libraries.
> 1. You can't register UDFs inside a macro
> 2. Parameter substitutions aren't done inside macros
> 3. Resources (including macros) should not be redundantly acquired if they are already present.
> Rohini's patch https://issues.apache.org/jira/browse/PIG-3204 should address problem 3 where Pig reparses everything every time it reads a line, but there would still be a problem if two separate files import the same macro / UDF file.
> To get this working, I moved the methods for registering jars/UDFs and for param substitution from PigServer to PigContext so they can be accessed in QueryParserDriver, which processes macros (QPD was already passed a PigContext reference). Is that OK?
