Posted to dev@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2016/08/18 20:28:20 UTC

[jira] [Created] (HIVE-14573) Vectorization: Implement StringExpr::find()

Gopal V created HIVE-14573:
------------------------------

             Summary: Vectorization: Implement StringExpr::find() 
                 Key: HIVE-14573
                 URL: https://issues.apache.org/jira/browse/HIVE-14573
             Project: Hive
          Issue Type: Bug
            Reporter: Gopal V


Currently, the LIKE expression implementation is a dumb StringExpr::equals() loop.

For an input of N bytes and a pattern of M bytes, this has a worst-case complexity of O((N-M)*M), which is not an issue with small patterns or small inputs.
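
To make the cost concrete, here is a minimal sketch of such a loop in Java. It is not the Hive code: the class NaiveContains and the helper regionEquals are hypothetical names introduced only for illustration. Every candidate offset triggers a fresh comparison of up to M bytes, so the work is roughly (N-M+1)*M byte comparisons in the worst case.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the actual Hive implementation.
final class NaiveContains {

  // Compares pattern against input starting at offset pos (up to M byte compares).
  static boolean regionEquals(byte[] input, int pos, byte[] pattern) {
    for (int j = 0; j < pattern.length; j++) {
      if (input[pos + j] != pattern[j]) {
        return false;
      }
    }
    return true;
  }

  // Returns the offset (relative to start) of the first occurrence of pattern
  // in input[start, start + len), or -1 if there is none.
  static int find(byte[] input, int start, int len, byte[] pattern) {
    int last = start + len - pattern.length; // last offset where pattern still fits
    for (int i = start; i <= last; i++) {     // N-M+1 candidate offsets
      if (regionEquals(input, i, pattern)) {
        return i - start;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    byte[] input = "Mozilla/5.0 Opera/9.80".getBytes(StandardCharsets.UTF_8);
    byte[] pattern = "Opera".getBytes(StandardCharsets.UTF_8);
    System.out.println(find(input, 0, input.length, pattern));
  }
}
```

A row that does not contain the pattern pays for every candidate offset before the loop can return -1.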

The pattern matching is currently optimized for rows that match, while in clickstream data most rows generally do not match.
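
One way a find() could favor the non-matching case is a skip table in the style of Boyer-Moore-Horspool. The sketch below is an assumption about a possible approach, not what this ticket specifies or what Hive ships; HorspoolFinder and its methods are hypothetical names. The pattern is preprocessed once per query, and each mismatch lets the search jump ahead by up to M bytes instead of one.

```java
import java.util.Arrays;

// Hypothetical sketch of a Horspool-style substring finder over raw bytes.
final class HorspoolFinder {
  private final byte[] pattern;
  private final int[] shift = new int[256]; // skip distance per byte value

  HorspoolFinder(byte[] pattern) {
    this.pattern = pattern.clone();
    int m = pattern.length;
    // Bytes that do not occur in the pattern allow a full skip of m.
    Arrays.fill(shift, m);
    // For bytes in the pattern (except its last position), skip so that the
    // rightmost such occurrence lines up with the byte just examined.
    for (int i = 0; i < m - 1; i++) {
      shift[pattern[i] & 0xff] = m - 1 - i;
    }
  }

  // Returns the offset (relative to start) of the first occurrence of the
  // pattern in input[start, start + len), or -1 if there is none.
  int find(byte[] input, int start, int len) {
    int m = pattern.length;
    if (m == 0) {
      return 0; // an empty pattern matches at the beginning
    }
    int end = start + len;
    int pos = start;
    while (pos + m <= end) {
      int j = m - 1;
      // Compare right to left so a mismatch is usually found on the first byte.
      while (j >= 0 && input[pos + j] == pattern[j]) {
        j--;
      }
      if (j < 0) {
        return pos - start;
      }
      // Skip based on the input byte aligned with the pattern's last position.
      pos += shift[input[pos + m - 1] & 0xff];
    }
    return -1;
  }
}
```

Since the table depends only on the pattern, a constant pattern such as "Opera" would need it built only once and could reuse it for every row. The worst case is still O(N*M), but typical non-matching inputs are rejected after far fewer comparisons than the loop above.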

From the common crawl data, the following run will go through the same

{code}
select count(1) from uservisits_orc_data where useragent like "%Opera%" and searchword LIKE "%fruit%";
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)