Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2008/10/07 18:32:44 UTC

[jira] Reopened: (PIG-472) load files based on user provided regular expressions

     [ https://issues.apache.org/jira/browse/PIG-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates reopened PIG-472:
----------------------------


In general the patch looks good.  A couple of comments and a question:

1) You need to add the Apache License comment to the header of some of the files.  You put it in some files, but not others.

2) When you submit a patch, mark the JIRA as patch available.  The committer will mark it as resolved when it's checked in.  I'm reopening all three and setting them to patch available.

The question: in RegExLoader.getNext(), you construct a new Matcher for every line.  Would it be faster to construct one Matcher and call reset() on it for each line?
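
To make the question concrete, here is a minimal sketch of the reuse idea, not the code from the patch.  The class name MatcherReuseSketch, the method groupsFor(), and the pattern are all hypothetical; only Pattern.matcher(), Matcher.reset(CharSequence), find(), groupCount() and group(int) are real java.util.regex calls.  Whether this is actually faster would need to be measured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherReuseSketch {
    // Hypothetical pattern; a real loader would take the pattern from the user.
    private static final Pattern PATTERN = Pattern.compile("(\\w+)=(\\d+)");

    // One Matcher, created once against an empty input.  reset(CharSequence)
    // points it at a new line instead of allocating a new Matcher each time.
    private final Matcher matcher = PATTERN.matcher("");

    // Returns the captured groups for one line, or an empty list if the
    // pattern does not match anywhere in the line.
    public List<String> groupsFor(String line) {
        matcher.reset(line);
        List<String> groups = new ArrayList<>();
        if (matcher.find()) {
            // Group 0 is the whole match, so start at 1 for the
            // parenthesized groups.
            for (int i = 1; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        MatcherReuseSketch sketch = new MatcherReuseSketch();
        for (String line : new String[] {"hits=42", "no match here", "errors=7"}) {
            System.out.println(sketch.groupsFor(line));
        }
    }
}
```

One caveat with this approach: a Matcher is not safe for use by multiple threads, so the shared instance only works if each loader instance is used from a single thread.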

> load files based on user provided regular expressions
> -----------------------------------------------------
>
>                 Key: PIG-472
>                 URL: https://issues.apache.org/jira/browse/PIG-472
>             Project: Pig
>          Issue Type: New Feature
>          Components: data, grunt
>    Affects Versions: 0.1.0
>            Reporter: Earl Cahill
>             Fix For: 0.1.0
>
>         Attachments: RegExLoader-PIG-472
>
>
> Want to be able to load files based on regular expressions.  Each group specified in parentheses should end up as a DataAtom, and the list of DataAtoms should end up in a Tuple.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.