Posted to issues@hive.apache.org by "Naresh P R (Jira)" <ji...@apache.org> on 2022/03/18 03:42:00 UTC

[jira] [Updated] (HIVE-26047) Vectorized LIKE UDF should use Re2J regex to address JDK-8203458

     [ https://issues.apache.org/jira/browse/HIVE-26047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-26047:
------------------------------
    Description: 
The pattern below takes a very long time to validate against the regex on Java 8, with the same stack trace as shown in the Java bug:

[JDK-8203458|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458]

 
{code:java}
import java.util.regex.Pattern;
public class Test {
  public static void main(String[] args) {
    String pattern = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b"; 
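    // Regex that the LIKE pattern is validated against; its nested quantifiers cause heavy backtracking on Java 8.
    Pattern CHAIN_PATTERN = Pattern.compile("(%?[^%_\\\\]+%?)+");
    CHAIN_PATTERN.matcher(pattern).matches();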
  }
}
{code}
The same behavior is reproducible with the following SQL:
{code:sql}
create table table1(name string);
insert into table1 (name) values ('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b');
select * from table1 where name like "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b";{code}
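
The regex in the Java reproducer above nests one quantifier inside another: the group contains [^%_\\]+ and is itself repeated with +. In a backtracking engine such as java.util.regex on Java 8, a long run of literal characters can be split across group iterations in a very large number of ways, and all of them are retried once the match fails at the '_' character. The issue title proposes Re2J as the remedy. The sketch below uses a different technique, a hand-written single-pass scan, chosen only so that the example needs nothing beyond the JDK. The class ChainPatternCheck and the method isChain are hypothetical names for illustration and are not part of Hive.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Hive and not the fix proposed in this issue.
final class ChainPatternCheck {

  // Same regex as in the reproducer, kept only to compare results on short inputs.
  static final Pattern CHAIN_PATTERN = Pattern.compile("(%?[^%_\\\\]+%?)+");

  // Single left-to-right pass intended to accept the same strings as CHAIN_PATTERN.
  static boolean isChain(String s) {
    int percentRun = 0;          // length of the current run of '%'
    boolean sawLiteral = false;  // true once any non-'%' character has been seen
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '_' || c == '\\') {
        return false;            // excluded by the character class [^%_\\]
      }
      if (c == '%') {
        percentRun++;
        // Before the first literal run only one '%' fits (the leading %?).
        // Between literal runs two fit (a trailing %? and the next leading %?).
        int limit = sawLiteral ? 2 : 1;
        if (percentRun > limit) {
          return false;
        }
      } else {
        percentRun = 0;
        sawLiteral = true;
      }
    }
    // At least one literal character is required, and at most one trailing '%'.
    return sawLiteral && percentRun <= 1;
  }

  public static void main(String[] args) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 100; i++) {
      sb.append('a');
    }
    String longInput = sb.append("_b").toString();
    // The scan stops at the first '_' and never revisits earlier characters.
    System.out.println(isChain(longInput));

    // On short inputs the backtracking regex is still cheap, so both can be compared.
    for (String s : new String[] {"abc", "%abc%", "a%%b", "a%%%b", "%%a", "a_b"}) {
      System.out.println(s + " -> " + isChain(s) + " / " + CHAIN_PATTERN.matcher(s).matches());
    }
  }
}
```

The scan visits each character once, so its cost grows linearly with the length of the LIKE pattern. Whether it agrees with CHAIN_PATTERN on every input is an assumption, checked here only on a handful of short strings.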

  was:
Below pattern is taking a long time to validate regex in java8 with same trace as shown in java bug [JDK-8203458|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458]
import java.util.regex.Pattern;

public class ABCD {

  public static void main(String args[]) {
    String pattern = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b";
    Pattern CHAIN_PATTERN = Pattern.compile("(%?[^%_\\\\]+%?)+");
    CHAIN_PATTERN.matcher(pattern).matches();
  }
}
Same is reproducible with following SQL
{code:java}
create table table1(name string);
insert into table1 (name) values ('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b');
select * from table1 where name like "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b";{code}


> Vectorized LIKE UDF should use Re2J regex to address JDK-8203458
> ----------------------------------------------------------------
>
>                 Key: HIVE-26047
>                 URL: https://issues.apache.org/jira/browse/HIVE-26047
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Naresh P R
>            Priority: Major


