Posted to dev@hive.apache.org by "Nicolas Lalevée (JIRA)" <ji...@apache.org> on 2012/07/11 11:58:35 UTC

[jira] [Created] (HIVE-3250) ArrayIndexOutOfBoundsException in ColumnPrunerProcFactory$ColumnPrunerSelectProc

Nicolas Lalevée created HIVE-3250:
-------------------------------------

             Summary: ArrayIndexOutOfBoundsException in ColumnPrunerProcFactory$ColumnPrunerSelectProc
                 Key: HIVE-3250
                 URL: https://issues.apache.org/jira/browse/HIVE-3250
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.8.0
            Reporter: Nicolas Lalevée


I have a query that does not select all the fields of its subquery, and the optimizer fails while pruning the unselected ones, with the following stack trace:
{noformat}
FAILED: Hive Internal Error: java.lang.ArrayIndexOutOfBoundsException(-1)
java.lang.ArrayIndexOutOfBoundsException: -1
    at java.util.ArrayList.get(ArrayList.java:324)
    at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerSelectProc.process(ColumnPrunerProcFactory.java:397)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
    at org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:143)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
    at org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:106)
    at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7306)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
    at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
{noformat}
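
For readers unfamiliar with this kind of failure: an index of -1 passed to ArrayList.get usually comes from a failed lookup such as List.indexOf, which returns -1 when the element is absent. The sketch below only illustrates that general pattern. It is not Hive's ColumnPrunerSelectProc code, the thread does not confirm that this is the actual cause, and the class and method names (ColumnLookupSketch, unsafeLookup, safeLookup) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT Hive's ColumnPrunerSelectProc code.
class ColumnLookupSketch {

    // Unsafe: indexOf returns -1 when the name is absent, and get(-1) then throws.
    static String unsafeLookup(List<String> names, List<String> exprs, String wanted) {
        int idx = names.indexOf(wanted);
        return exprs.get(idx);
    }

    // Guarded: returns null instead of indexing with -1.
    static String safeLookup(List<String> names, List<String> exprs, String wanted) {
        int idx = names.indexOf(wanted);
        return idx < 0 ? null : exprs.get(idx);
    }

    public static void main(String[] args) {
        // Column names taken from the query in this report; the expression strings are placeholders.
        List<String> names = new ArrayList<>(Arrays.asList("userid", "urls", "user_lid"));
        List<String> exprs = new ArrayList<>(Arrays.asList("expr0", "expr1", "expr2"));

        System.out.println(safeLookup(names, exprs, "urls"));
        System.out.println(safeLookup(names, exprs, "explodedUrls"));
        try {
            unsafeLookup(names, exprs, "explodedUrls");
        } catch (IndexOutOfBoundsException e) {
            // The concrete exception subclass depends on the JDK version.
            System.out.println("lookup failed: " + e.getClass().getSimpleName());
        }
    }
}
```

Older JDKs raise ArrayIndexOutOfBoundsException for a negative index, as in the trace above, while newer ones raise its parent class IndexOutOfBoundsException, so the sketch catches the parent.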

The failing query, reduced to the part that fails:
{noformat}
SELECT explodedUrls FROM
  (
    SELECT userid, array(named_struct('date', count(*))) AS urls, count(*) AS user_lid FROM
      (
        SELECT * FROM NicoPageViewEvent WHERE day > '20130801'
      ) pve
    GROUP BY userid
  ) userViewData
LATERAL VIEW s_explode_pageflow(userViewData.urls) userViewDataLateralView AS explodedUrls
{noformat}

Adding the other fields to the outer SELECT makes it work:
{noformat}
SELECT userid, explodedUrls, user_lid FROM
  (
    SELECT userid, array(named_struct('date', count(*))) AS urls, count(*) AS user_lid FROM
      (
        SELECT * FROM NicoPageViewEvent WHERE day > '20130801'
      ) pve
    GROUP BY userid
  ) userViewData
LATERAL VIEW s_explode_pageflow(userViewData.urls) userViewDataLateralView AS explodedUrls
{noformat}

s_explode_pageflow is a custom function which takes an array of structs and splits it into arrays of structs.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HIVE-3250) ArrayIndexOutOfBoundsException in ColumnPrunerProcFactory$ColumnPrunerSelectProc

Posted by "Navis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413638#comment-13413638 ] 

Navis commented on HIVE-3250:
-----------------------------

Sorry, I've attached the patch.
                

[jira] [Commented] (HIVE-3250) ArrayIndexOutOfBoundsException in ColumnPrunerProcFactory$ColumnPrunerSelectProc

Posted by "Nicolas Lalevée (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412830#comment-13412830 ] 

Nicolas Lalevée commented on HIVE-3250:
---------------------------------------

Sorry, I cannot find a proper link to download the patch.
                

[jira] [Commented] (HIVE-3250) ArrayIndexOutOfBoundsException in ColumnPrunerProcFactory$ColumnPrunerSelectProc

Posted by "Navis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412410#comment-13412410 ] 

Navis commented on HIVE-3250:
-----------------------------

I think this is related to HIVE-3226. Could you try again with the patch applied?
                