Posted to reviews@impala.apache.org by "Daniel Becker (Code Review)" <ge...@cloudera.org> on 2022/04/06 14:30:15 UTC

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Daniel Becker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18387 )


Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................

IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

The huge values clause of the insert SQL statement in
TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
could cause an OutOfMemory error in the FE.

We use a SQL statement with a huge values clause (more than 40 000
elements) to insert values into a parquet table in some tests, and the
size of the SQL statement string sometimes causes an OOM error.

After this change, when creating these parquet tables, instead of using
a huge SQL statement to insert the values, we first create a text table
with the same schema, write the values to a text file, copy the file
into the HDFS directory of the text table and create a parquet table
from the text table.

Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
---
M tests/query_test/test_parquet_bloom_filter.py
1 file changed, 30 insertions(+), 16 deletions(-)
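
For illustration only, the difference in statement size between the two approaches described in the commit message can be sketched in Python. The database, table and column names, the choice of even numbers as values, and both helper functions are hypothetical and not taken from the patch; copying the file into the HDFS directory of the text table is deliberately left out.

```python
import os
import tempfile

NUM_VALUES = 40001  # assumption, based on "more than 40 000 elements"


def build_values_insert(db, tbl, values):
    """Old approach: one INSERT whose VALUES clause lists every row."""
    values_clause = ", ".join("({0})".format(v) for v in values)
    return "insert into {0}.{1} values {2}".format(db, tbl, values_clause)


def write_values_file(path, values):
    """New approach, first step: one value per line for a text table."""
    with open(path, "w") as f:
        for v in values:
            f.write("{0}\n".format(v))


values = [2 * i for i in range(NUM_VALUES)]  # hypothetical data

huge_stmt = build_values_insert("test_db", "bloom_tbl", values)

# New approach: the statements stay short no matter how many rows there are.
create_text_stmt = (
    "create table test_db.bloom_tbl_text (col BIGINT) stored as textfile")
ctas_stmt = (
    "create table test_db.bloom_tbl stored as parquet "
    "as select col from test_db.bloom_tbl_text")

tmp_dir = tempfile.mkdtemp()
data_path = os.path.join(tmp_dir, "values.txt")
write_values_file(data_path, values)

print("INSERT ... VALUES statement: {0} characters".format(len(huge_stmt)))
print("CREATE + CTAS statements: {0} characters".format(
    len(create_text_stmt) + len(ctas_stmt)))
```

The long statement grows linearly with the number of rows, while the two short statements do not depend on the row count at all.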



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/18387/1
-- 
To view, visit http://gerrit.cloudera.org:8080/18387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <da...@cloudera.com>

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18387/1/tests/query_test/test_parquet_bloom_filter.py
File tests/query_test/test_parquet_bloom_filter.py:

http://gerrit.cloudera.org:8080/#/c/18387/1/tests/query_test/test_parquet_bloom_filter.py@214
PS1, Line 214: bl_create_stmt = \
             :         'create table {db}.{tbl} ({col_name} BIGINT) stored as textfile' .forma
> optional: I think that this could be made simpler by using an existing tabl
Ah, this is a cleaner approach! We may need an ORDER BY clause as well, since LIMIT without ORDER BY does not guarantee which rows are returned.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 01:58:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10406/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 06 Apr 2022 14:50:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8032/ DRY_RUN=false


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 11 Apr 2022 00:09:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10414/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 14:07:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10418/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Fri, 08 Apr 2022 10:19:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................

IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

The huge values clause of the insert SQL statement in
TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
could cause an OutOfMemory error in the FE.

We use a SQL statement with a huge values clause (more than 40 000
elements) to insert values into a parquet table in some tests, and the
size of the SQL statement string sometimes causes an OOM error.

After this change, when creating these parquet tables, instead of using
a huge SQL statement to insert the values, we first create a text table
with the same schema, write the values to a text file, copy the file
into the HDFS directory of the text table and create a parquet table
from the text table.

Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
---
M tests/query_test/test_parquet_bloom_filter.py
1 file changed, 16 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/18387/3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18387/5/tests/query_test/test_parquet_bloom_filter.py
File tests/query_test/test_parquet_bloom_filter.py:

http://gerrit.cloudera.org:8080/#/c/18387/5/tests/query_test/test_parquet_bloom_filter.py@219
PS5, Line 219:         as (select row_number() over (order by o_orderkey) * 2 as {col} \
Please add ( ) around the row_number() expression that precedes * 2.
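
A sketch of the requested change, assuming the statement is assembled with str.format as in the quoted line. Only the row_number() expression and o_orderkey come from the quoted line; the destination names, the source table (borrowed from an earlier suggestion in this thread) and the limit are assumptions. The Python helper merely mimics what the SQL expression would compute.

```python
# Hypothetical names; only the select expression mirrors the quoted line.
CTAS_TEMPLATE = (
    "create table {db}.{tbl} stored as parquet "
    "as (select (row_number() over (order by o_orderkey)) * 2 as {col} "
    "from tpch_parquet.orders limit {limit})")

stmt = CTAS_TEMPLATE.format(
    db="test_db", tbl="bloom_tbl", col="col", limit=40001)


def expected_values(num_rows):
    """Mimic (row_number() over (...)) * 2; row numbers start at 1."""
    return [row_number * 2 for row_number in range(1, num_rows + 1)]


print(stmt)
print(expected_values(5))  # [2, 4, 6, 8, 10]
```

The extra parentheses make it explicit that the multiplication applies to the result of the window function.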



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 16:01:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................

IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

The huge values clause of the insert SQL statement in
TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
could cause an OutOfMemory error in the FE.

We use a SQL statement with a huge values clause (more than 40 000
elements) to insert values into a parquet table in some tests, and the
size of the SQL statement string sometimes causes an OOM error.

After this change, we create these parquet tables with a CTAS from an
existing table, avoiding any long SQL statements.

Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
---
M tests/query_test/test_parquet_bloom_filter.py
1 file changed, 15 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/18387/5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18387/3/tests/query_test/test_parquet_bloom_filter.py
File tests/query_test/test_parquet_bloom_filter.py:

http://gerrit.cloudera.org:8080/#/c/18387/3/tests/query_test/test_parquet_bloom_filter.py@214
PS3, Line 214: c
flake8: E122 continuation line missing indentation or outdented
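
For reference, E122 is typically reported when a continuation line is not indented relative to the line it continues. The following is a sketch with hypothetical names, not the code from the patch:

```python
db, tbl, col = "test_db", "bloom_tbl", "col"  # hypothetical names

# A layout like the following would likely be flagged with E122, because
# the continuation line is not indented:
#
#     create_stmt = \
#     'create table {db}.{tbl} ({col} BIGINT)'.format(db=db, tbl=tbl, col=col)

# Indenting the continuation line avoids the warning:
create_stmt = \
    'create table {db}.{tbl} ({col} BIGINT)'.format(db=db, tbl=tbl, col=col)

# Wrapping the expression in parentheses removes the backslash entirely:
create_stmt_paren = (
    'create table {db}.{tbl} ({col} BIGINT)'
    .format(db=db, tbl=tbl, col=col))

print(create_stmt)
```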



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 13:53:13 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................

IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

The huge values clause of the insert SQL statement in
TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
could cause an OutOfMemory error in the FE.

We use a SQL statement with a huge values clause (more than 40 000
elements) to insert values into a parquet table in some tests, and the
size of the SQL statement string sometimes causes an OOM error.

After this change, we create these parquet tables with a CTAS from an
existing table, avoiding any long SQL statements.

Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
---
M tests/query_test/test_parquet_bloom_filter.py
1 file changed, 13 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/18387/6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18387/2/tests/query_test/test_parquet_bloom_filter.py
File tests/query_test/test_parquet_bloom_filter.py:

http://gerrit.cloudera.org:8080/#/c/18387/2/tests/query_test/test_parquet_bloom_filter.py@20
PS2, Line 20: import tempfile
flake8: F401 'tempfile' imported but unused


http://gerrit.cloudera.org:8080/#/c/18387/2/tests/query_test/test_parquet_bloom_filter.py@215
PS2, Line 215: c
flake8: E122 continuation line missing indentation or outdented



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 13:10:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................

IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

The huge values clause of the insert SQL statement in
TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
could cause an OutOfMemory error in the FE.

We use a SQL statement with a huge values clause (more than 40 000
elements) to insert values into a parquet table in some tests, and the
size of the SQL statement string sometimes causes an OOM error.

After this change, when creating these parquet tables, instead of using
a huge SQL statement to insert the values, we first create a text table
with the same schema, write the values to a text file, copy the file
into the HDFS directory of the text table and create a parquet table
from the text table.

Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
---
M tests/query_test/test_parquet_bloom_filter.py
1 file changed, 17 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/18387/2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10415/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 14:12:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18387/5/tests/query_test/test_parquet_bloom_filter.py
File tests/query_test/test_parquet_bloom_filter.py:

http://gerrit.cloudera.org:8080/#/c/18387/5/tests/query_test/test_parquet_bloom_filter.py@136
PS5, Line 136: tmpdir
I think we should remove this as well.


http://gerrit.cloudera.org:8080/#/c/18387/5/tests/query_test/test_parquet_bloom_filter.py@176
PS5, Line 176: tmpdir
This needs to be removed as well.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Fri, 08 Apr 2022 01:30:22 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 6: Code-Review+2


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Fri, 08 Apr 2022 10:15:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18387/1/tests/query_test/test_parquet_bloom_filter.py
File tests/query_test/test_parquet_bloom_filter.py:

http://gerrit.cloudera.org:8080/#/c/18387/1/tests/query_test/test_parquet_bloom_filter.py@216
PS1, Line 216: d
flake8: E122 continuation line missing indentation or outdented


http://gerrit.cloudera.org:8080/#/c/18387/1/tests/query_test/test_parquet_bloom_filter.py@236
PS1, Line 236: c
flake8: E122 continuation line missing indentation or outdented



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 06 Apr 2022 14:31:03 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................

IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

The huge values clause of the insert SQL statement in
TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
could cause an OutOfMemory error in the FE.

We use a SQL statement with a huge values clause (more than 40 000
elements) to insert values into a parquet table in some tests, and the
size of the SQL statement string sometimes causes an OOM error.

After this change, we create these parquet tables with a CTAS from an
existing table, avoiding any long SQL statements.

Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
---
M tests/query_test/test_parquet_bloom_filter.py
1 file changed, 16 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/18387/4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18387/3/tests/query_test/test_parquet_bloom_filter.py
File tests/query_test/test_parquet_bloom_filter.py:

http://gerrit.cloudera.org:8080/#/c/18387/3/tests/query_test/test_parquet_bloom_filter.py@214
PS3, Line 214: c
> flake8: E122 continuation line missing indentation or outdented
Sorry, corrected in PS5.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 13:55:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10413/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 13:27:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18387/1/tests/query_test/test_parquet_bloom_filter.py
File tests/query_test/test_parquet_bloom_filter.py:

http://gerrit.cloudera.org:8080/#/c/18387/1/tests/query_test/test_parquet_bloom_filter.py@214
PS1, Line 214: bl_create_stmt = \
             :         'create table {db}.{tbl} ({col_name} BIGINT) stored as textfile' .forma
Optional: I think this could be made simpler by using an existing table as the source, e.g.:
select o_orderkey * 2 from tpch_parquet.orders limit 40001;
(all values of o_orderkey are unique)
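
As a rough illustration of the suggestion above (not the code from the patch), the sketch below contrasts the two ways of building the test table. The helper names, database, table and column names are hypothetical; only the select over tpch_parquet.orders comes from the comment. It shows that a VALUES-based insert grows with the number of rows, while a CTAS from an existing table stays a short, fixed-size statement.

```python
# Hypothetical helpers for illustration; not taken from
# tests/query_test/test_parquet_bloom_filter.py.


def build_values_insert(db, tbl, values):
    """Build one INSERT whose VALUES clause lists every row explicitly."""
    rows = ", ".join("({0})".format(v) for v in values)
    return "insert into {db}.{tbl} values {rows}".format(
        db=db, tbl=tbl, rows=rows)


def build_ctas(db, tbl, col_name, num_rows):
    """Build a CTAS that takes its rows from an existing table instead."""
    return ("create table {db}.{tbl} stored as parquet as "
            "select o_orderkey * 2 as {col} from tpch_parquet.orders "
            "limit {n}").format(db=db, tbl=tbl, col=col_name, n=num_rows)


if __name__ == "__main__":
    num_rows = 40001
    values_stmt = build_values_insert(
        "test_db", "bloom_tbl", [i * 2 for i in range(num_rows)])
    ctas_stmt = build_ctas("test_db", "bloom_tbl", "int64_col", num_rows)
    # The first length grows with num_rows; the second does not.
    print(len(values_stmt), len(ctas_stmt))
```

Because the statement text no longer carries the data, the size of the string sent to the FE is independent of how many rows the test needs.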



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 06 Apr 2022 19:32:52 +0000
Gerrit-HasComments: Yes

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10416/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Apr 2022 14:13:36 +0000
Gerrit-HasComments: No

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................

IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props

The huge values clause of the insert SQL statement in
TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
could cause an OutOfMemory error in the FE.

We use a SQL statement with a huge values clause (more than 40 000
elements) to insert values into a parquet table in some tests, and the
size of the SQL statement string sometimes causes an OOM error.

After this change, we create these parquet tables with a CTAS from an
existing table, avoiding any long SQL statements.

Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Reviewed-on: http://gerrit.cloudera.org:8080/18387
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M tests/query_test/test_parquet_bloom_filter.py
1 file changed, 13 insertions(+), 15 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 8
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 7: Verified+1


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 11 Apr 2022 04:32:29 +0000
Gerrit-HasComments: No

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18387 )

Change subject: IMPALA-11227: FE OOM in TestParquetBloomFilter.test_fallback_from_dict_if_no_bloom_tbl_props
......................................................................


Patch Set 7: Code-Review+2


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I923cc9ba4b6829a2f15e93365f2849b89248598b
Gerrit-Change-Number: 18387
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 11 Apr 2022 00:09:23 +0000
Gerrit-HasComments: No