Posted to user@drill.apache.org by Navin Bhawsar <na...@gmail.com> on 2020/04/29 16:35:17 UTC
Parquet Predicate Push down not working
Hi
We are trying to run a simple query with a WHERE clause predicate. The Parquet
files are created using Python and stored on HDFS.
The Apache Drill version used is 1.17.
The options required for predicate push-down are set to their default values:
[image: image.png]
The Drill query scans a directory containing multiple Parquet files (total size 1
GB).
We expect that if predicate push-down works, it will reduce the scan time,
which currently accounts for 97% of the query.
If predicate push-down works, the row group scan should fetch only 70,840
records instead of 14,162,187.
[image: image.png]
Minor Fragment | NUM_ROWGROUPS | ROWGROUPS_PRUNED | NUM_DICT_PAGE_LOADS | NUM_DATA_PAGE_LOADS | NUM_DATA_PAGES_DECODED | NUM_DICT_PAGES_DECOMPRESSED | NUM_DATA_PAGES_DECOMPRESSED | TOTAL_DICT_PAGE_READ_BYTES | TOTAL_DATA_PAGE_READ_BYTES | TOTAL_DICT_DECOMPRESSED_BYTES | TOTAL_DATA_DECOMPRESSED_BYTES | TIME_DICT_PAGE_LOADS | TIME_DATA_PAGE_LOADS | TIME_DATA_PAGE_DECODE | TIME_DICT_PAGE_DECODE | TIME_DICT_PAGES_DECOMPRESSED | TIME_DATA_PAGES_DECOMPRESSED | TIME_DISK_SCAN_WAIT | TIME_DISK_SCAN | TIME_FIXEDCOLUMN_READ | TIME_VARCOLUMN_READ | TIME_PROCESS
01-00-04 | 7 | 0 | 77 | 0 | 77 | 77 | 77 | 0 | 0 | 7,147,852 | 8,884,071 | 598,070 | 0 | 97,822 | 11,440,739 | 2,081,514 | 17,694,740 | 598,070 | 0 | 112,108,259 | 703,103,096 | 815,245,307
01-01-04 | 6 | 0 | 66 | 0 | 66 | 66 | 66 | 0 | 0 | 2,115,860 | 4,316,153 | 1,778,468 | 0 | 144,320 | 3,665,957 | 775,403 | 8,693,618 | 1,778,468 | 0 | 105,066,657 | 776,807,232 | 882,070,408
01-02-04 | 6 | 0 | 66 | 0 | 66 | 66 | 66 | 0 | 0 | 6,835,560 | 8,630,174 | 337,404 | 0 | 100,190 | 10,876,145 | 1,970,521 | 11,789,061 | 337,404 | 0 | 102,833,433 | 655,338,696 | 758,203,357
01-03-04 | 6 | 0 | 66 | 0 | 66 | 66 | 66 | 0 | 0 | 2,242,112 | 4,516,183 | 1,586,562 | 0 | 164,398 | 3,827,371 | 877,814 | 8,604,307 | 1,586,562 | 0 | 112,745,628 | 758,634,132 | 871,586,588
01-04-04 | 6 | 0 | 66 | 2 | 66 | 66 | 64 | 0 | 1,420 | 5,407,178 | 7,175,446 | 2,216,935 | 3,181 | 74,956 | 8,754,425 | 1,650,970 | 11,241,636 | 2,216,935 | 0 | 97,180,713 | 668,249,966 | 765,461,684
01-05-04 | 6 | 0 | 66 | 1 | 66 | 66 | 65 | 0 | 92 | 1,378,260 | 3,595,638 | 3,394,196 | 1,571 | 204,833 | 2,726,005 | 1,357,297 | 6,843,717 | 3,394,196 | 0 | 150,560,569 | 704,154,215 | 854,928,393
01-06-04 | 6 | 0 | 66 | 0 | 66 | 66 | 66 | 0 | 0 | 4,748,302 | 6,547,215 | 471,679 | 0 | 114,270 | 7,739,335 | 1,537,805 | 10,571,215 | 471,679 | 0 | 97,392,926 | 667,056,499 | 764,478,811
01-07-04 | 6 | 0 | 68 | 0 | 66 | 64 | 66 | 180 | 0 | 769,746 | 3,128,730 | 292,603 | 0 | 130,814 | 1,574,574 | 425,133 | 6,563,457 | 286,300 | 0 | 168,501,325 | 716,135,483 | 884,850,308
01-08-04 | 6 | 0 | 66 | 0 | 66 | 66 | 66 | 0 | 0 | 8,356,637 | 9,264,223 | 582,946 | 0 | 101,103 | 13,332,669 | 2,422,705 | 13,340,100 | 582,946 | 0 | 109,932,913 | 691,400,457 | 801,374,949
01-09-04 | 6 | 0 | 66 | 2 | 66 | 66 | 64 | 0 | 133 | 1,453,953 | 2,953,546 | 19,563,820 | 1,920 | 149,257 | 2,553,666 | 632,461 | 5,886,238 | 19,563,820 | 0 | 81,854,819 | 557,612,832 | 639,664,370
01-10-04 | 6 | 0 | 66 | 0 | 66 | 66 | 66 | 0 | 0 | 6,634,676 | 8,081,684 | (remaining columns truncated in the original message)
Please advise whether any specific options are required to enable predicate
push-down.
Also, we expected the Filter operator to filter out records, but the removal is
actually done later by the SELECTION_VECTOR_REMOVER operator.
The documentation site does not give enough detail on when this operation is
triggered.
Thanks,
Navin
RE: Parquet Predicate Push down not working
Posted by "Jaimes, Rafael - 0993 - MITLL" <Ra...@ll.mit.edu>.
Hi Navin,
I don’t think inline screenshots work on the mailing list so they are not showing up for me. I don’t think you have to do anything in Drill 1.17 to enable predicate pushdown for Parquet.
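If you want to double-check, the relevant settings can be inspected from SQL. A sketch (exact option names vary slightly between Drill versions, so match against whatever your sys.options table actually lists):

```sql
-- List Parquet-related options and their current values
SELECT *
FROM sys.options
WHERE name LIKE '%parquet%';
```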
1 GB total dataset is really small. If that’s spread across multiple parquet files the row group is going to be tiny and performance will be poor. How many files do you have now?
I would aim for 1-2 GB row groups for best Parquet performance. Maybe 512 MB if the computers building them have low RAM.
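Your profile numbers make this concrete. A back-of-the-envelope sketch, using the NUM_ROWGROUPS column from the table you posted (7 row groups in fragment 01-00-04, 6 in each of the ten fragments 01-01-04 through 01-10-04) and the stated ~1 GB total size:

```python
# Rough estimate of average row-group size from the posted profile.
num_rowgroups = 7 + 6 * 10          # 67 row groups in total
total_bytes = 1 * 1024**3           # ~1 GB dataset, per the original post

avg_rowgroup_mb = total_bytes / num_rowgroups / 1024**2
print(f"{num_rowgroups} row groups, ~{avg_rowgroup_mb:.0f} MB each")
```

That is roughly 15 MB per row group, far below the range suggested above, which is consistent with the scan being dominated by per-row-group overhead rather than by pruning.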
Do all the parquet files have 100% identical schema?
Can you post your query?
- Raf
From: Navin Bhawsar <na...@gmail.com>
Sent: Wednesday, April 29, 2020 12:35 PM
To: user@drill.apache.org
Cc: arun.ns@gmail.com; Navin Bhawsar <na...@gmail.com>
Subject: Parquet Predicate Push down not working
Re: Parquet Predicate Push down not working
Posted by Paul Rogers <pa...@yahoo.com.INVALID>.
Hi Navin,
You raise some good questions. I don't have a complete answer, but I can tackle some of the basics.
Rafael noted that images are blocked on Apache mail lists. I believe you can post images in the Drill Slack channel. Better, perhaps, is to open a JIRA ticket with your images and information, so that it is easier for us to track these specific questions and issues.
Drill supports two forms of Parquet predicate push-down. The first is partition pruning, which removes files based on their directory names. (Let's say you have files in the 2019 and 2020 directories, and have a WHERE clause that limits the query to just the 2020 directory). Partition pruning should work as long as you explicitly mention the directories:
... WHERE dir0 = '2020'
(Unfortunately, since Drill has no schema, Drill cannot map directories to column names the way Hive can.)
The simplest, least-fuss way to enable filter push-down is to filter based on directories: doing so requires no extra schema information be provided to Drill, nor does it require Drill to do extra work (reading files) when planning a query. Directory pruning works for Parquet and all other file types as well.
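For example, with a (hypothetical) layout such as /data/events/2019/... and /data/events/2020/..., a directory-pruned query might look like:

```sql
-- dir0 refers to the first directory level below the queried path;
-- only files under .../2020 are scanned. Paths here are hypothetical.
SELECT COUNT(*)
FROM dfs.`/data/events`
WHERE dir0 = '2020';
```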
The second form of pruning occurs at the row group level. Here I'll need some help from the folks that have worked with that code. I'm not sure if the planner will open every file at plan time to read this information. I do seem to recall that Drill does (did?) gather and cache the info. There is also a newly-added metadata feature to gather this information once to avoid per-query scans. Perhaps someone with more current knowledge can fill in the details.
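One concrete piece of this: Drill has long had a Parquet metadata cache that can be built explicitly, so the planner does not have to re-read every file footer per query. A sketch (the path is hypothetical, and newer releases also offer an ANALYZE TABLE based Metastore; check the docs for your version):

```sql
-- Build/refresh the Parquet metadata cache for a directory tree
REFRESH TABLE METADATA dfs.`/data/events`;
```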
You noted that the filter does not remove records. This is correct. The filter simply tags records as matching the filter or not. The Selection Vector Remover (SVR) does the actual removal. The SVR operator is used in other places as well. It is the combination of (Filter --> SVR) that performs the full filter operation. The (Filter --> SVR) combination will always run in the same minor fragment, so no extra network I/O occurs.
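As an illustration only (this is not Drill's actual code), the division of labor between Filter and SVR can be modeled in a few lines of Python:

```python
# Toy model of Drill's Filter -> Selection Vector Remover pipeline.
# Filter only marks surviving rows in a "selection vector"; the SVR
# is what physically compacts the batch.

def filter_op(batch, predicate):
    """Tag matching rows: return the batch plus a selection vector."""
    sv = [i for i, row in enumerate(batch) if predicate(row)]
    return batch, sv

def svr_op(batch, sv):
    """Copy only the selected rows into a new, dense batch."""
    return [batch[i] for i in sv]

batch = [10, 3, 25, 7, 42]
batch, sv = filter_op(batch, lambda x: x > 9)
print(sv)                # -> [0, 2, 4]
print(svr_op(batch, sv)) # -> [10, 25, 42]
```

Deferring the copy lets downstream operators that understand selection vectors skip the compaction entirely; the SVR is inserted only where a dense batch is actually needed.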
Another question asked about parallelism. Drill parallelizes based on HDFS file blocks, which are commonly 256 MB or 512 MB. This is classic HDFS "data locality" behavior, and is why Rafael suggests having larger Parquet files. (That said, Drill also parallelizes across files, so having many small files should also work, ignoring the classic HDFS "small file problem"; file-based parallelism was a big advantage of the MapR file system, and of S3.)
Your note does suggest another approach, which might work better on "blockless" systems such as S3 or local disk: parallelize at the row group level. Parquet is complex, though; we'd have to understand the costs and benefits of such an approach.
Thanks,
- Paul
On Wednesday, April 29, 2020, 9:35:40 AM PDT, Navin Bhawsar <na...@gmail.com> wrote: