Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/07/02 03:28:00 UTC

[jira] [Commented] (IMPALA-9615) Make re2's max_mem option configurable via an Impala startup flag.

    [ https://issues.apache.org/jira/browse/IMPALA-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17561691#comment-17561691 ] 

ASF subversion and git services commented on IMPALA-9615:
---------------------------------------------------------

Commit a625a95dbd347d5a5e64566c77bcb27e991ce352 in impala's branch refs/heads/dependabot/pip/infra/python/deps/urllib3-1.26.5 from Omid Shahidi
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a625a95db ]

IMPALA-9615: re2's max_mem opt configurable via an Impala startup flag

Some regex patterns need more memory to compile and match when they are
used with Impala's string functions and the LIKE predicate.
For such memory-intensive patterns, this can cause the following error:
"re2/re2.cc:667: DFA out of memory:
	size xxxxx, bytemap range xx, list count xxxxx".

To avoid such errors in impalad's ERROR log, a global flag can be set
at Impala cluster startup. The re2_mem_limit flag accepts a memory
specification string and uses it to set re2's max_mem parameter, the
memory in bytes used to store regexps.
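
As a rough illustration of what turning a memory specification string
into a byte count involves, here is a minimal Python sketch. It is not
the code from this commit: the function name parse_mem_spec, the set of
accepted suffixes, and the use of 1024-based units are all assumptions
made for the example, and the real flag may accept a different syntax.

```python
import re

# Hypothetical suffix table (assumption): the suffixes the real
# re2_mem_limit flag accepts may differ from these.
_UNITS = {
    "": 1, "b": 1,
    "k": 1024, "kb": 1024,
    "m": 1024 ** 2, "mb": 1024 ** 2,
    "g": 1024 ** 3, "gb": 1024 ** 3,
}

# A non-negative integer followed by an optional alphabetic unit suffix.
_SPEC = re.compile(r"\s*(\d+)\s*([a-z]*)\s*", re.IGNORECASE)


def parse_mem_spec(spec: str) -> int:
    """Convert a string such as "8MB" or "1g" into a number of bytes."""
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid memory specification: {spec!r}")
    number, unit = match.groups()
    unit = unit.lower()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in memory specification: {spec!r}")
    return int(number) * _UNITS[unit]
```

Under these assumptions, parse_mem_spec("8MB") yields 8388608, and a
value like that is what would be handed to re2 as max_mem.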

Testing:
 - Use a long regex pattern that exhausts the available memory when
   re2_mem_limit is less than or equal to re2's default max_mem.
   With a larger value for the re2_mem_limit flag, the same regexp is
   processed with no error.

Change-Id: Idf28d2f7217b1322ab8fdfb2c02fff0608078571
Reviewed-on: http://gerrit.cloudera.org:8080/18602
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Make re2's max_mem option configurable via an Impala startup flag.
> ------------------------------------------------------------------
>
>                 Key: IMPALA-9615
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9615
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 3.4.0
>            Reporter: Attila Jeges
>            Assignee: Omid Shahidi
>            Priority: Major
>              Labels: backend, ramp-up
>
> Right now Impala always uses the default max_mem value for re2 regexp pattern matching.
> For more memory consuming patterns this can cause the following error:
> "re2/re2.cc:667: DFA out of memory: size xxxxx, bytemap range xx, list count xxxxx".
> It would be nice if re2's max_mem option were configurable via an Impala startup flag.


