Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2009/07/14 00:22:15 UTC

[jira] Created: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)
------------------------------------------------------------------------------------------------

                 Key: PIG-885
                 URL: https://issues.apache.org/jira/browse/PIG-885
             Project: Pig
          Issue Type: New Feature
            Reporter: Daniel Dai
            Priority: Minor


A set of new UDFs:
1. Bin -- Converts a continuous value into discrete values
2. Decode -- Converts a given attribute or expression into another string value, based on the value of the source attribute
3. LookupInFiles -- Checks for the existence of an expression in a series of text files
4. RegexExtract and RegexMatch -- Similar to Perl regexes
5. HashFNV -- An implementation of the FNV hash
6. DiffDate -- Calculates the number of days between two dates
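
For readers unfamiliar with FNV, the hashing step behind item 5 can be sketched in plain Java. This is only an illustration under assumptions: the issue does not say which FNV variant or width the patch implements, so the 32-bit FNV-1a variant is used here, and the class and method names (Fnv1aSketch, fnv1a32, toUnsigned) are hypothetical and not taken from the patch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the piggybank HashFNV source.
class Fnv1aSketch {
    // Standard 32-bit FNV parameters.
    private static final int OFFSET_BASIS_32 = 0x811C9DC5;
    private static final int PRIME_32 = 0x01000193;

    // 32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
    // Java int arithmetic wraps on overflow, which gives the modulo-2^32 behavior.
    static int fnv1a32(String s) {
        int hash = OFFSET_BASIS_32;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);   // mask so the byte is treated as unsigned
            hash *= PRIME_32;
        }
        return hash;
    }

    // Java ints are signed; widen to long to see the unsigned 32-bit value.
    static long toUnsigned(int hash) {
        return hash & 0xFFFFFFFFL;
    }
}
```

Calling Fnv1aSketch.fnv1a32("") returns the offset basis unchanged, since no bytes are mixed in. The original FNV-1 variant differs only in doing the multiply before the XOR.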

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Status: In Progress  (was: Patch Available)


[jira] Commented: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733376#action_12733376 ] 

Hadoop QA commented on PIG-885:
-------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12414031/PIG-885-3.patch
  against trunk revision 795931.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 19 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/136/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/136/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/136/console

This message is automatically generated.


[jira] Commented: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737315#action_12737315 ] 

Olga Natkovich commented on PIG-885:
------------------------------------

The latest patch looks good. A couple of comments:

(1) RegexExtract - input.get(1).equals(mExpression) - need to check for a null return from get(1). The same applies to get(2).
(2) RegexMatch - the same

Once these are addressed, please commit the patch.
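
The null check requested in (1) could look roughly like the sketch below. It is not the patch's code: Pig's Tuple is replaced by a plain java.util.List so the example runs without Pig on the classpath, and the class name RegexExtractSketch, the method name extract, and the field names are hypothetical. The assumed argument order (source string, regular expression, group index) is also an assumption.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a regex-extract UDF; a List replaces Pig's Tuple.
class RegexExtractSketch {
    private String cachedExpression;  // last expression seen
    private Pattern cachedPattern;    // compiled form of cachedExpression

    // Assumed layout: field 0 = source string, field 1 = regex, field 2 = group index.
    String extract(List<?> input) {
        if (input == null || input.size() < 3) {
            throw new IllegalArgumentException("expected 3 fields: source, regex, group index");
        }
        Object source = input.get(0);
        Object expression = input.get(1);
        Object index = input.get(2);
        // Check for nulls before dereferencing, so a null field yields null
        // instead of a NullPointerException.
        if (source == null || expression == null || index == null) {
            return null;
        }
        String regex = expression.toString();
        // Safe now: regex is known to be non-null.
        if (!regex.equals(cachedExpression)) {
            cachedPattern = Pattern.compile(regex);
            cachedExpression = regex;
        }
        int group = ((Number) index).intValue();
        Matcher matcher = cachedPattern.matcher(source.toString());
        if (group >= 0 && matcher.find() && group <= matcher.groupCount()) {
            return matcher.group(group);
        }
        return null;  // no match, or no such group
    }
}
```

Returning null for a null field is one possible policy; throwing with a descriptive message would be another.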






[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Status: Patch Available  (was: In Progress)


[jira] Commented: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Amr Awadallah (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730585#action_12730585 ] 

Amr Awadallah commented on PIG-885:
-----------------------------------

very nice collection, reminds me of myna :)

-- amr


[jira] Commented: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730622#action_12730622 ] 

Hadoop QA commented on PIG-885:
-------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12413351/PIG-885.patch
  against trunk revision 793660.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 19 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 release audit.  The applied patch generated 176 release audit warnings (more than the trunk's current 163 warnings).

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/126/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/126/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/126/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/126/console


[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Attachment: PIG-885-2.patch

Fixed some misspelled function names.


[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Status: Patch Available  (was: In Progress)


[jira] Commented: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733798#action_12733798 ] 

Olga Natkovich commented on PIG-885:
------------------------------------

+1, please commit.


[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Attachment: PIG-885-5.patch

Reworked the error handling.


[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Attachment: PIG-885-7.patch

Add null checking to all applicable UDFs


[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Attachment: PIG-885-8.patch

Add NullPointerException check


[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Attachment: PIG-885.patch


[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

        Fix Version/s: 0.4.0
    Affects Version/s: 0.3.0
               Status: Patch Available  (was: Open)


[jira] Commented: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733694#action_12733694 ] 

Olga Natkovich commented on PIG-885:
------------------------------------

The code looks good.

Comments:

(1) LookupInFiles - I think it would make sense to require that the files are provided in the constructor (via define) rather than checked on every exec.
(2) LookupInFiles.exec - you get the first element of the tuple without checking that it exists. I think you need to check for that and give an error.
(3) LookupInFiles.init - there are also some comments there that seem unrelated to the code; please remove them.
(4) RegexExtract.exec, RegexMatch.exec - you refer to elements in the tuple without checking that they exist. We should give meaningful errors when we don't get all expected parameters.
(5) HashFNV.exec - needs to check the size of the tuple.
(6) HashFNV - needs the mapping function so that Pig inserts implicit casts
(7) DiffDate.exec - needs to check the input tuple size before getting fields out
(8) DiffDate - needs the mapping function so that Pig inserts casts
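
The tuple-size checks asked for in (2), (4), (5), and (7) might be sketched as follows, using DiffDate as the example. This is an assumption-laden illustration, not the patch: a java.util.List stands in for Pig's Tuple, the names DiffDateSketch, requireFields, and exec are hypothetical, and ISO-8601 date strings (yyyy-MM-dd) are assumed, which may differ from the format the real UDF accepts. The sign convention (days from the first date to the second) is likewise this sketch's own choice.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;

// Hypothetical sketch of input validation for a two-argument date-difference UDF.
class DiffDateSketch {
    // Fail with a meaningful message when fewer fields arrive than expected.
    static void requireFields(String udfName, List<?> input, int expected) {
        int actual = (input == null) ? 0 : input.size();
        if (actual < expected) {
            throw new IllegalArgumentException(
                udfName + ": expected " + expected + " fields, got " + actual);
        }
    }

    // Returns the number of days from the first date to the second,
    // or null if either field is null. Dates are assumed to be yyyy-MM-dd.
    static Long exec(List<?> input) {
        requireFields("DiffDate", input, 2);  // check size before reading fields
        Object first = input.get(0);
        Object second = input.get(1);
        if (first == null || second == null) {
            return null;
        }
        LocalDate from = LocalDate.parse(first.toString());
        LocalDate to = LocalDate.parse(second.toString());
        return ChronoUnit.DAYS.between(from, to);
    }
}
```

An unchecked exception is used here to keep the sketch short; inside an actual Pig EvalFunc, whose exec method is declared to throw IOException, an IOException with the same message would be the more natural choice.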



[jira] Commented: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735734#action_12735734 ] 

Hadoop QA commented on PIG-885:
-------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12414123/PIG-885-6.patch
  against trunk revision 797290.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 19 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/140/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/140/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/140/console


[jira] Assigned: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai reassigned PIG-885:
------------------------------

    Assignee: Daniel Dai

> New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-885
>                 URL: https://issues.apache.org/jira/browse/PIG-885
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Minor
>         Attachments: PIG-885.patch



[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Attachment: PIG-885-4.patch

> New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-885
>                 URL: https://issues.apache.org/jira/browse/PIG-885
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: PIG-885-2.patch, PIG-885-3.patch, PIG-885-4.patch, PIG-885.patch



[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Description: 
Bunch of UDFs:
1. Bin -- Converts a continuous value into discrete values
2. Decode -- Converts a given attribute or expression into another string value, based on the value of the source attribute
3. LookupInFiles -- Check for the existence of an expression in a series of text files
4. RegexExtract and RegexMatch -- Similar to Perl regexes
5. HashFNV -- An implementation of FNV hash
6. DiffDate -- Calculate the number of days between two dates

  was:
Bunch of UDFs:
1. Bin -- Converts a continuous value into discrete values
2. Decode -- Converts a given attribute or expression into another string value, based on the value of the source attribute
3. LookupInFiles -- Check for the existence of an expression in a series of text files
4. RegexExtract and RegexMatch -- Similar to Perl regexes
5. HashFVN -- An implementation of FNV hash
6. DiffDate -- Calculate the number of days between two dates


> New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-885
>                 URL: https://issues.apache.org/jira/browse/PIG-885
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: PIG-885.patch



[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Patch committed.

> New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-885
>                 URL: https://issues.apache.org/jira/browse/PIG-885
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: PIG-885-2.patch, PIG-885-3.patch, PIG-885-4.patch, PIG-885-5.patch, PIG-885-6.patch, PIG-885-7.patch, PIG-885-8.patch, PIG-885.patch



[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Status: In Progress  (was: Patch Available)

> New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-885
>                 URL: https://issues.apache.org/jira/browse/PIG-885
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: PIG-885-2.patch, PIG-885-3.patch, PIG-885.patch



[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Attachment: PIG-885-3.patch

Attaching the patch again to resolve the release audit warnings.

> New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-885
>                 URL: https://issues.apache.org/jira/browse/PIG-885
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: PIG-885-2.patch, PIG-885-3.patch, PIG-885.patch



[jira] Updated: (PIG-885) New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-885:
---------------------------

    Attachment: PIG-885-6.patch

The new patch addresses most of the problems raised in the comments, except for these two:
(1) LookupInFiles takes an arbitrary number of input files, so the file list cannot be moved into a define statement. A single-file version called INSETFROMFILE already exists in the internal piggybank; it receives its file through the constructor via define.
(6) The second input parameter of HashFNV is optional, so we cannot specify the input schema using the existing mechanism.
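
To illustrate point (6), here is a minimal Python sketch of a hash function whose second argument is optional. It is not the piggybank Java code. It assumes the 32-bit FNV-1a variant and assumes the optional argument is a modulus applied to the result; neither detail is confirmed in this thread, and the name hash_fnv is made up for illustration.

```python
from typing import Optional

# Standard 32-bit FNV parameters.
FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193


def hash_fnv(text: str, mod: Optional[int] = None) -> int:
    """Return the 32-bit FNV-1a hash of text (hypothetical helper).

    If mod is given (an assumed meaning of the optional second
    parameter), the hash is reduced to the range [0, mod).
    """
    if mod is not None and mod <= 0:
        raise ValueError("mod must be a positive integer")

    h = FNV32_OFFSET_BASIS
    for byte in text.encode("utf-8"):
        h ^= byte                           # FNV-1a: XOR the byte first
        h = (h * FNV32_PRIME) & 0xFFFFFFFF  # then multiply, keeping 32 bits
    return h if mod is None else h % mod


if __name__ == "__main__":
    print(hash_fnv("piggybank"))       # one-argument form
    print(hash_fnv("piggybank", 100))  # two-argument form
```

Because the function can be called with either one or two arguments, a single fixed list of argument types cannot describe every valid call, which is the limitation noted in (6).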


> New UDFs for piggybank (Bin, Decode, LookupInFiles, RegexExtract, RegexMatch, HashFVN, DiffDate)
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-885
>                 URL: https://issues.apache.org/jira/browse/PIG-885
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: PIG-885-2.patch, PIG-885-3.patch, PIG-885-4.patch, PIG-885-5.patch, PIG-885-6.patch, PIG-885.patch
