Posted to issues@hive.apache.org by "Hari Sankar Sivarama Subramaniyan (JIRA)" <ji...@apache.org> on 2015/06/29 21:30:04 UTC

[jira] [Updated] (HIVE-11141) Improve RuleRegExp when the Expression node stack gets huge

     [ https://issues.apache.org/jira/browse/HIVE-11141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11141:
-----------------------------------------------------
    Description: 
More and more complex workloads are being migrated to Hive from SQL Server, Teradata, etc.
Occasionally Hive becomes bottlenecked on generating plans for large queries; in the majority of cases, the time is spent fetching metadata and partitions and applying optimizer transformation rules.

I have attached the query for the test case, which should be run after setting up the database as shown below.
{code}
create database dataset_3;
use dataset_3;
{code}

createtable.rtf - create table command
SQLQuery10.sql.mssql - explain query

The most problematic part of the code, as the stack grows arbitrarily long, appears to be the following method in RuleRegExp.java:
{code}
  @Override
  public int cost(Stack<Node> stack) throws SemanticException {
    int numElems = (stack != null ? stack.size() : 0);
    String name = "";
    for (int pos = numElems - 1; pos >= 0; pos--) {
      name = stack.get(pos).getName() + "%" + name;
      Matcher m = pattern.matcher(name);
      if (m.matches()) {
        return m.group().length();
      }
    }
    return -1;
  }
{code}
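
To make the cost concrete: each iteration prepends to name, which copies the whole string, and then runs the regex over the full string again, so a stack of depth n can cost on the order of n^2 character operations when the rule never matches. The sketch below is one possible mitigation, not necessarily the fix adopted for this issue. It assumes the rule's pattern contains no regex metacharacters (TS%FIL% is used purely as an illustrative pattern), in which case the pattern can be compared against the top of the stack segment by segment, stopping at the first mismatch or overshoot. The names RuleCostSketch, regexCost and literalCost are hypothetical, and node names are modeled as plain strings rather than Hive Node objects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the Hive RuleRegExp class.
final class RuleCostSketch {

  // The approach from the issue, restated over plain strings:
  // the name is rebuilt and re-matched in full on every iteration.
  static int regexCost(Pattern pattern, List<String> names) {
    String name = "";
    for (int pos = names.size() - 1; pos >= 0; pos--) {
      name = names.get(pos) + "%" + name;
      Matcher m = pattern.matcher(name);
      if (m.matches()) {
        return m.group().length();
      }
    }
    return -1;
  }

  // Assumed fast path: 'literal' must contain no regex metacharacters,
  // so a match means the concatenated name equals 'literal' exactly.
  // Work is bounded by the pattern length, not by the stack depth.
  static int literalCost(String literal, List<String> names) {
    int end = literal.length(); // suffix of 'literal' still to be matched
    for (int pos = names.size() - 1; pos >= 0; pos--) {
      String segment = names.get(pos) + "%";
      int start = end - segment.length();
      if (start < 0 || !literal.startsWith(segment, start)) {
        return -1; // overshoot or mismatch: no longer name can match
      }
      if (start == 0) {
        return literal.length(); // whole pattern consumed
      }
      end = start;
    }
    return -1; // stack exhausted before the pattern was consumed
  }

  public static void main(String[] args) {
    // Bottom of the stack first, top of the stack last.
    List<String> names = Arrays.asList("SEL", "TS", "FIL");
    System.out.println(regexCost(Pattern.compile("TS%FIL%"), names)); // prints 7
    System.out.println(literalCost("TS%FIL%", names));                // prints 7
    System.out.println(literalCost("RS%FIL%", names));                // prints -1
  }
}
```

Patterns that do contain wildcards are not covered by this fast path and would still need the regex loop, or a different strategy.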


  was:
More and more complex workloads are being migrated to Hive from SQL Server, Teradata, etc.
Occasionally Hive becomes bottlenecked on generating plans for large queries; in the majority of cases, the time is spent fetching metadata and partitions and applying optimizer transformation rules.

I have attached the query for the test case, which should be run after setting up the database as shown below.
{code}
create database dataset_3;
use dataset_3;
{code}

The most problematic part of the code, as the stack grows arbitrarily long, appears to be the following method in RuleRegExp.java:
{code}
  @Override
  public int cost(Stack<Node> stack) throws SemanticException {
    int numElems = (stack != null ? stack.size() : 0);
    String name = "";
    for (int pos = numElems - 1; pos >= 0; pos--) {
      name = stack.get(pos).getName() + "%" + name;
      Matcher m = pattern.matcher(name);
      if (m.matches()) {
        return m.group().length();
      }
    }
    return -1;
  }
{code}



> Improve RuleRegExp when the Expression node stack gets huge
> -----------------------------------------------------------
>
>                 Key: HIVE-11141
>                 URL: https://issues.apache.org/jira/browse/HIVE-11141
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: SQLQuery10.sql.mssql, createtable.rtf
>
>
> More and more complex workloads are being migrated to Hive from SQL Server, Teradata, etc.
> Occasionally Hive becomes bottlenecked on generating plans for large queries; in the majority of cases, the time is spent fetching metadata and partitions and applying optimizer transformation rules.
> I have attached the query for the test case, which should be run after setting up the database as shown below.
> {code}
> create database dataset_3;
> use dataset_3;
> {code}
> createtable.rtf - create table command
> SQLQuery10.sql.mssql - explain query
> The most problematic part of the code, as the stack grows arbitrarily long, appears to be the following method in RuleRegExp.java:
> {code}
>   @Override
>   public int cost(Stack<Node> stack) throws SemanticException {
>     int numElems = (stack != null ? stack.size() : 0);
>     String name = "";
>     for (int pos = numElems - 1; pos >= 0; pos--) {
>       name = stack.get(pos).getName() + "%" + name;
>       Matcher m = pattern.matcher(name);
>       if (m.matches()) {
>         return m.group().length();
>       }
>     }
>     return -1;
>   }
> {code}


