Posted to dev@pig.apache.org by "Michael Brauwerman (JIRA)" <ji...@apache.org> on 2011/06/06 06:37:47 UTC

[jira] [Created] (PIG-2110) NullPointerException in piggybank.evaluation.util.apachelogparser.SearchTermExtractor

NullPointerException in piggybank.evaluation.util.apachelogparser.SearchTermExtractor
-------------------------------------------------------------------------------------

                 Key: PIG-2110
                 URL: https://issues.apache.org/jira/browse/PIG-2110
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.8.0
            Reporter: Michael Brauwerman


When processing a large log file, I get an exception in SearchTermExtractor.exec.

I don't have a specific log line that reproduces it yet, but I assume the error occurs when the input URL is null, or maybe just has no query string.

I think a fix would be to add a guard after creating queryString:

        String queryString = urlObject.getQuery();
        if (queryString == null) { return null; }
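
For illustration only, here is a minimal, self-contained sketch of the suspected failure and of the guard proposed above. It is not the piggybank code: the class name QueryGuardSketch, the helper queryOf, and the regex for a "q" parameter are all hypothetical, and the URL is built with URI.create(...).toURL() only to keep the sketch free of deprecation warnings on recent JDKs. It relies on two documented behaviors: java.net.URL.getQuery() returns null when the URL has no query part, and Pattern.matcher(null) throws a NullPointerException.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QueryGuardSketch {
    // Hypothetical pattern for a "q" parameter; the real UDF uses its own patterns.
    private static final Pattern QUERY_PARAM = Pattern.compile("(?:^|&)q=([^&]*)");

    // Returns the query part of the URL, or null when the URL has no '?' section.
    static String queryOf(String url) {
        try {
            return URI.create(url).toURL().getQuery();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("not a usable URL: " + url, e);
        }
    }

    // Mirrors the suspected bug: the query string is handed to the regex unchecked.
    static String extractUnguarded(String url) {
        String queryString = queryOf(url);
        Matcher m = QUERY_PARAM.matcher(queryString); // NullPointerException if null
        return m.find() ? m.group(1) : null;
    }

    // Same logic with the guard proposed in this report.
    static String extractGuarded(String url) {
        String queryString = queryOf(url);
        if (queryString == null) { return null; }
        Matcher m = QUERY_PARAM.matcher(queryString);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractGuarded("http://example.com/search?q=pig"));
        System.out.println(extractGuarded("http://example.com/index.html"));
        try {
            extractUnguarded("http://example.com/index.html");
        } catch (NullPointerException e) {
            System.out.println("unguarded call threw NullPointerException");
        }
    }
}
```

If the cause is as assumed, the unguarded variant fails inside java.util.regex.Matcher, which would be consistent with the trace below, while the guarded variant returns null for URLs without a query string.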

Stack Trace:
<code>
Caused by: java.io.IOException: Caught exception processing input row
        at org.apache.pig.piggybank.evaluation.util.apachelogparser.SearchTermExtractor.exec(SearchTermExtractor.java:195)
        at org.apache.pig.piggybank.evaluation.util.apachelogparser.SearchTermExtractor.exec(SearchTermExtractor.java:64)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
Caused by: java.lang.NullPointerException
        at java.util.regex.Matcher.getTextLength(Matcher.java:1140)
        at java.util.regex.Matcher.reset(Matcher.java:291)
        at java.util.regex.Matcher.reset(Matcher.java:311)
        at org.apache.pig.piggybank.evaluation.util.apachelogparser.SearchTermExtractor.exec(SearchTermExtractor.java:170)
</code>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (PIG-2110) NullPointerException in piggybank.evaluation.util.apachelogparser.SearchTermExtractor

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2110.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.10
         Assignee: Dale Jin
     Hadoop Flags: [Reviewed]

> NullPointerException in piggybank.evaluation.util.apachelogparser.SearchTermExtractor
> -------------------------------------------------------------------------------------
>
>                 Key: PIG-2110
>                 URL: https://issues.apache.org/jira/browse/PIG-2110
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Michael Brauwerman
>            Assignee: Dale Jin
>             Fix For: 0.10
>
>         Attachments: SearchTermExtractor.diff
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h

[jira] [Commented] (PIG-2110) NullPointerException in piggybank.evaluation.util.apachelogparser.SearchTermExtractor

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060978#comment-13060978 ] 

Daniel Dai commented on PIG-2110:
---------------------------------

Patch committed to trunk. Thanks Dale for contributing!


[jira] [Commented] (PIG-2110) NullPointerException in piggybank.evaluation.util.apachelogparser.SearchTermExtractor

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059322#comment-13059322 ] 

Daniel Dai commented on PIG-2110:
---------------------------------

Patch looks good. Will commit if tests pass.


[jira] [Updated] (PIG-2110) NullPointerException in piggybank.evaluation.util.apachelogparser.SearchTermExtractor

Posted by "Dale Jin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dale Jin updated PIG-2110:
--------------------------

    Attachment: SearchTermExtractor.diff
