You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@drill.apache.org by Peter Edike <pe...@interswitchgroup.com> on 2018/08/01 10:24:10 UTC
Apache Drill Queries Take so Long after which Webconsole dies and I
don't see results
Hello everyone,
I am currently using apache drill to query a directory of parquet files (total size 64gb with each file averaging 6.4Gb per size). I am running the following query
select * from dfs.rj.vw_reports_201805 where unique_key in
(
select unique_key from dfs.rj.vw_reports_201805 group by unique_key having count(unique_key) > 1
)
limit 40
Now this query takes about 42 minutes after which it fails because one of the drillbits in the drillbit cluster crashes. I check the logs and I see several errors summarized as follows
20:35:18.006 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (172.35.15.129:5181,172.35.15.151:5181,172.35.15.152:5181) and timeout (5000) / elapsed (13347)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197) [curator-client-2.7.1.jar:na]
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87) [curator-client-2.7.1.jar:na]
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115) [curator-client-2.7.1.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806) [curator-framework-2.7.1.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792) [curator-framework-2.7.1.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62) [curator-framework-2.7.1.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257) [curator-framework-2.7.1.jar:na]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
AND
Aug 1, 2018 7:54:18 AM WARNING: org.apache.parquet.CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb)
org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb) using format: (.+) version ((.*) )?\(build ?(.*)\)
at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:66)
at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:264)
at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:568)
at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:545)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:455)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:404)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:390)
at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:117)
at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:53)
at org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:146)
at org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:142)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1633)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:142)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:182)
at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:110)
at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:87)
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:206)
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
The Schema for the data I am querying in question is as follows
index_no
ANY
YES
bank_code
ANY
YES
trxn_category
ANY
YES
Debit_Account_Type
ANY
YES
Credit_Account_Type
ANY
YES
trxn_amount
ANY
YES
trxn_fee
ANY
YES
trxn_date
ANY
YES
Currency
ANY
YES
Late_reversal
ANY
YES
card_type
ANY
YES
terminal_type
ANY
YES
Acquirer
ANY
YES
Issuer
ANY
YES
Value_SettleAmount
ANY
YES
Value_RequestedAmount
ANY
YES
Volume
ANY
YES
unique_key
ANY
YES
source_node_name
ANY
YES
business_date
ANY
YES
PT_datetime_tran_local
ANY
YES
PTC_pan
ANY
YES
PTC_terminal_id
ANY
YES
PTC_card_acceptor_id_code
ANY
YES
PTC_card_acceptor_name_loc
ANY
YES
PT_system_trace_audit_nr
ANY
YES
PT_message_type
ANY
YES
PT_from_account_id
ANY
YES
PT_to_account_id
ANY
YES
PT_tran_type
ANY
YES
PT_rsp_code_req
ANY
YES
PT_rsp_code_rsp
ANY
YES
PT_settle_amount_req
ANY
YES
PT_settle_amount_rsp
ANY
YES
PTC_merchant_type
ANY
YES
PT_settle_amount_impact
ANY
YES
PT_settle_tran_fee_req
ANY
YES
PT_settle_tran_fee_rsp
ANY
YES
PT_auth_id_rsp
ANY
YES
PT_retrieval_reference_nr
ANY
YES
PTC_totals_group
ANY
YES
PTC_card_product
ANY
YES
PT_tran_currency_code
ANY
YES
PT_payee
ANY
YES
PT_tran_amount_req
ANY
YES
PT_tran_amount_rsp
ANY
YES
PT_extended_tran_type
ANY
YES
PT_tran_postilion_originated
ANY
YES
PT_tran_completed
ANY
YES
PT_tran_nr
ANY
YES
PT_retention_data
ANY
YES
PT_acquiring_inst_id_code
ANY
YES
PT_message_reason_code
ANY
YES
PT_sponsor_bank
ANY
YES
PT_datetime_tran_gmt
ANY
YES
PT_datetime_req
ANY
YES
PT_datetime_rsp
ANY
YES
PT_realtime_business_date
ANY
YES
PT_recon_business_date
ANY
YES
PT_from_account_type
ANY
YES
PT_to_account_type
ANY
YES
PT_tran_cash_req
ANY
YES
PT_tran_cash_rsp
ANY
YES
PT_tran_tran_fee_req
ANY
YES
PT_tran_tran_fee_rsp
ANY
YES
PT_tran_tran_fee_currency_code
ANY
YES
PT_tran_proc_fee_req
ANY
YES
PT_tran_proc_fee_rsp
ANY
YES
PT_tran_proc_fee_currency_code
ANY
YES
PT_settle_currency_code
ANY
YES
PT_pos_entry_mode
ANY
YES
PT_pos_condition_code
ANY
YES
PT_tran_reversed
ANY
YES
PT_card_verification_result
ANY
YES
PT_online_system_id
ANY
YES
PT_participant_id
ANY
YES
PT_receiving_inst_id_code
ANY
YES
PT_routing_type
ANY
YES
PT_pt_pos_card_input_mode
ANY
YES
PT_source_node_key
ANY
YES
PT_proc_online_system_id
ANY
YES
PTC_post_tran_cust_id
ANY
YES
PTC_source_node_name
ANY
YES
PTC_card_seq_nr
ANY
YES
PTC_expiry_date
ANY
YES
PTC_service_restriction_code
ANY
YES
PTC_terminal_owner
ANY
YES
PTC_mapped_card_acceptor_id_code
ANY
YES
PTC_pos_terminal_type
ANY
YES
PTC_pan_encrypted
ANY
YES
PTSP_Account_Nr
ANY
YES
PTSP_Code
ANY
YES
account_PTSP_Code
ANY
YES
PTSP_Name
ANY
YES
Sort_Code
ANY
YES
rdm_amt
ANY
YES
Reward_Code
ANY
YES
Reward_discount
ANY
YES
ptsp_terminal_id
ANY
YES
reward_terminal_id
ANY
YES
txn_id
ANY
YES
web_category_code
ANY
YES
web_category_name
ANY
YES
web_fee_type
ANY
YES
web_merchant_disc
ANY
YES
web_fee_cap
ANY
YES
Account_Name
ANY
YES
account_nr
ANY
YES
Acquiring_bank
ANY
YES
journal_amount
ANY
YES
xls_amount
ANY
YES
merch_cat_amount_cap
ANY
YES
merch_cat_visa_amount_cap
ANY
YES
reward_amount_cap
ANY
YES
Amount_config_state
ANY
YES
Amount_description
ANY
YES
Authorized_Person
ANY
YES
ACC_BANK_CODE
ANY
YES
merch_cat_bearer
ANY
YES
merch_cat_visa_bearer
ANY
YES
merch_cat_visa_category_code
ANY
YES
merch_cat_category_code
ANY
YES
merch_cat_visa_category_name
ANY
YES
merch_cat_category_name
ANY
YES
credit_acc_id
ANY
YES
credit_acc_nr_id
ANY
YES
credit_cardholder_acc_id
ANY
YES
credit_cardholder_acc_type
ANY
YES
CreditAccNr_acc_id
ANY
YES
CreditAccNr_acc_nr
ANY
YES
CreditAccNr_acc_nr_id
ANY
YES
CreditAccNr_se_id
ANY
YES
CreditAccNr_state
ANY
YES
Date_Modified
ANY
YES
debit_acc_id
ANY
YES
debit_acc_nr_id
ANY
YES
debit_cardholder_acc_id
ANY
YES
debit_cardholder_acc_type
ANY
YES
DebitAccNr_acc_id
ANY
YES
DebitAccNr_acc_nr
ANY
YES
DebitAccNr_acc_nr_id
ANY
YES
extended_trans_type
ANY
YES
fee
ANY
YES
Fee_amount_id
ANY
YES
merch_cat_fee_cap
ANY
YES
merch_cat_visa_fee_cap
ANY
YES
reward_fee_cap
ANY
YES
Fee_description
ANY
YES
Fee_Discount
ANY
YES
Fee_fee_id
ANY
YES
fee_id
ANY
YES
Fee_name
ANY
YES
Fee_se_id
ANY
YES
merch_cat_category_fee_type
ANY
YES
merch_cat_category_visa_fee_type
ANY
YES
merch_cat_category_merch_discount
ANY
YES
merch_cat_category_visa_merch_discount
ANY
YES
PT_sink_node_name
ANY
YES
post_tran_id
ANY
YES
Miscellaneous
ANY
YES
terminal_owner
ANY
YES
terminal_owner_name
ANY
YES
REF_rpt_account_type
ANY
YES
REF_IsPoolAccount
ANY
YES
From_Account_Type
ANY
YES
To_Account_Type
ANY
YES
Tran_type_description
ANY
YES
Region
ANY
YES
Transaction_Status
ANY
YES
Transaction_type_Impact
ANY
YES
Message_Type_Desc
ANY
YES
Response_Code_description
ANY
YES
REF_BatchId
ANY
YES
Beneficiary_Account
ANY
YES
Rate
ANY
YES
final_fee
ANY
YES
final_fee_type
ANY
YES
settlement_date
ANY
YES
Please what can I do to improve performance of this query
Re: Apache Drill Queries Take so Long after which Webconsole dies and
I don't see results
Posted by Khurram Faraaz <kf...@mapr.com>.
Hi,
1. What version of Drill are you running your query on ?
2. How were your parquet files created ? (were they created on Drill or
another system, like Hive etc.)
3. Can you please share the parquet schema of the parquet file being
queried ?
Thanks,
Khurram
On Wed, Aug 1, 2018 at 3:24 AM, Peter Edike <
peter.edike@interswitchgroup.com> wrote:
> Hello everyone,
>
> I am currently using apache drill to query a directory of parquet files
> (total size 64gb with each file averaging 6.4Gb per size). I am running
> the following query
>
> select * from dfs.rj.vw_reports_201805 where unique_key in
> (
> select unique_key from dfs.rj.vw_reports_201805 group by
> unique_key having count(unique_key) > 1
> )
> limit 40
>
> Now this query takes about 42 minutes after which it fails because one of
> the drillbits in the drillbit cluster crashes. I check the logs and I see
> several errors summarized as follows
>
> 20:35:18.006 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState
> - Connection timed out for connection string (172.35.15.129:5181,
> 172.35.15.151:5181,172.35.15.152:5181) and timeout (5000) / elapsed
> (13347)
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
> ConnectionLoss
> at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197)
> [curator-client-2.7.1.jar:na]
> at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87)
> [curator-client-2.7.1.jar:na]
> at org.apache.curator.CuratorZookeeperClient.getZooKeeper(
> CuratorZookeeperClient.java:115) [curator-client-2.7.1.jar:na]
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.
> performBackgroundOperation(CuratorFrameworkImpl.java:806)
> [curator-framework-2.7.1.jar:na]
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.
> backgroundOperationsLoop(CuratorFrameworkImpl.java:792)
> [curator-framework-2.7.1.jar:na]
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.
> access$300(CuratorFrameworkImpl.java:62) [curator-framework-2.7.1.jar:na]
> at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.
> call(CuratorFrameworkImpl.java:257) [curator-framework-2.7.1.jar:na]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [na:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [na:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [na:1.8.0_171]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
> AND
>
>
> Aug 1, 2018 7:54:18 AM WARNING: org.apache.parquet.CorruptStatistics:
> Ignoring statistics because created_by could not be parsed (see
> PARQUET-251): parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb)
>
> org.apache.parquet.VersionParser$VersionParseException: Could not parse
> created_by: parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb)
> using format: (.+) version ((.*) )?\(build ?(.*)\)
>
> at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
>
> at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(
> CorruptStatistics.java:66)
>
> at org.apache.parquet.format.converter.ParquetMetadataConverter.
> fromParquetStatistics(ParquetMetadataConverter.java:264)
>
> at org.apache.parquet.format.converter.ParquetMetadataConverter.
> fromParquetMetadata(ParquetMetadataConverter.java:568)
>
> at org.apache.parquet.format.converter.ParquetMetadataConverter.
> readParquetMetadata(ParquetMetadataConverter.java:545)
>
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(
> ParquetFileReader.java:455)
>
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(
> ParquetFileReader.java:404)
>
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(
> ParquetFileReader.java:390)
>
> at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.
> getBatch(ParquetScanBatchCreator.java:117)
>
> at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.
> getBatch(ParquetScanBatchCreator.java:53)
>
> at org.apache.drill.exec.physical.impl.ImplCreator$2.
> run(ImplCreator.java:146)
>
> at org.apache.drill.exec.physical.impl.ImplCreator$2.
> run(ImplCreator.java:142)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
>
> at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGroupInformation.java:1633)
>
> at org.apache.drill.exec.physical.impl.ImplCreator.
> getRecordBatch(ImplCreator.java:142)
>
> at org.apache.drill.exec.physical.impl.ImplCreator.
> getChildren(ImplCreator.java:182)
>
> at org.apache.drill.exec.physical.impl.ImplCreator.
> getRootExec(ImplCreator.java:110)
>
> at org.apache.drill.exec.physical.impl.ImplCreator.
> getExec(ImplCreator.java:87)
>
> at org.apache.drill.exec.work.fragment.FragmentExecutor.run(
> FragmentExecutor.java:206)
>
> at org.apache.drill.common.SelfCleaningRunnable.run(
> SelfCleaningRunnable.java:38)
>
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1149)
>
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
> The Schema for the data I am querying in question is as follows
>
> index_no
>
> ANY
>
> YES
>
> bank_code
>
> ANY
>
> YES
>
> trxn_category
>
> ANY
>
> YES
>
> Debit_Account_Type
>
> ANY
>
> YES
>
> Credit_Account_Type
>
> ANY
>
> YES
>
> trxn_amount
>
> ANY
>
> YES
>
> trxn_fee
>
> ANY
>
> YES
>
> trxn_date
>
> ANY
>
> YES
>
> Currency
>
> ANY
>
> YES
>
> Late_reversal
>
> ANY
>
> YES
>
> card_type
>
> ANY
>
> YES
>
> terminal_type
>
> ANY
>
> YES
>
> Acquirer
>
> ANY
>
> YES
>
> Issuer
>
> ANY
>
> YES
>
> Value_SettleAmount
>
> ANY
>
> YES
>
> Value_RequestedAmount
>
> ANY
>
> YES
>
> Volume
>
> ANY
>
> YES
>
> unique_key
>
> ANY
>
> YES
>
> source_node_name
>
> ANY
>
> YES
>
> business_date
>
> ANY
>
> YES
>
> PT_datetime_tran_local
>
> ANY
>
> YES
>
> PTC_pan
>
> ANY
>
> YES
>
> PTC_terminal_id
>
> ANY
>
> YES
>
> PTC_card_acceptor_id_code
>
> ANY
>
> YES
>
> PTC_card_acceptor_name_loc
>
> ANY
>
> YES
>
> PT_system_trace_audit_nr
>
> ANY
>
> YES
>
> PT_message_type
>
> ANY
>
> YES
>
> PT_from_account_id
>
> ANY
>
> YES
>
> PT_to_account_id
>
> ANY
>
> YES
>
> PT_tran_type
>
> ANY
>
> YES
>
> PT_rsp_code_req
>
> ANY
>
> YES
>
> PT_rsp_code_rsp
>
> ANY
>
> YES
>
> PT_settle_amount_req
>
> ANY
>
> YES
>
> PT_settle_amount_rsp
>
> ANY
>
> YES
>
> PTC_merchant_type
>
> ANY
>
> YES
>
> PT_settle_amount_impact
>
> ANY
>
> YES
>
> PT_settle_tran_fee_req
>
> ANY
>
> YES
>
> PT_settle_tran_fee_rsp
>
> ANY
>
> YES
>
> PT_auth_id_rsp
>
> ANY
>
> YES
>
> PT_retrieval_reference_nr
>
> ANY
>
> YES
>
> PTC_totals_group
>
> ANY
>
> YES
>
> PTC_card_product
>
> ANY
>
> YES
>
> PT_tran_currency_code
>
> ANY
>
> YES
>
> PT_payee
>
> ANY
>
> YES
>
> PT_tran_amount_req
>
> ANY
>
> YES
>
> PT_tran_amount_rsp
>
> ANY
>
> YES
>
> PT_extended_tran_type
>
> ANY
>
> YES
>
> PT_tran_postilion_originated
>
> ANY
>
> YES
>
> PT_tran_completed
>
> ANY
>
> YES
>
> PT_tran_nr
>
> ANY
>
> YES
>
> PT_retention_data
>
> ANY
>
> YES
>
> PT_acquiring_inst_id_code
>
> ANY
>
> YES
>
> PT_message_reason_code
>
> ANY
>
> YES
>
> PT_sponsor_bank
>
> ANY
>
> YES
>
> PT_datetime_tran_gmt
>
> ANY
>
> YES
>
> PT_datetime_req
>
> ANY
>
> YES
>
> PT_datetime_rsp
>
> ANY
>
> YES
>
> PT_realtime_business_date
>
> ANY
>
> YES
>
> PT_recon_business_date
>
> ANY
>
> YES
>
> PT_from_account_type
>
> ANY
>
> YES
>
> PT_to_account_type
>
> ANY
>
> YES
>
> PT_tran_cash_req
>
> ANY
>
> YES
>
> PT_tran_cash_rsp
>
> ANY
>
> YES
>
> PT_tran_tran_fee_req
>
> ANY
>
> YES
>
> PT_tran_tran_fee_rsp
>
> ANY
>
> YES
>
> PT_tran_tran_fee_currency_code
>
> ANY
>
> YES
>
> PT_tran_proc_fee_req
>
> ANY
>
> YES
>
> PT_tran_proc_fee_rsp
>
> ANY
>
> YES
>
> PT_tran_proc_fee_currency_code
>
> ANY
>
> YES
>
> PT_settle_currency_code
>
> ANY
>
> YES
>
> PT_pos_entry_mode
>
> ANY
>
> YES
>
> PT_pos_condition_code
>
> ANY
>
> YES
>
> PT_tran_reversed
>
> ANY
>
> YES
>
> PT_card_verification_result
>
> ANY
>
> YES
>
> PT_online_system_id
>
> ANY
>
> YES
>
> PT_participant_id
>
> ANY
>
> YES
>
> PT_receiving_inst_id_code
>
> ANY
>
> YES
>
> PT_routing_type
>
> ANY
>
> YES
>
> PT_pt_pos_card_input_mode
>
> ANY
>
> YES
>
> PT_source_node_key
>
> ANY
>
> YES
>
> PT_proc_online_system_id
>
> ANY
>
> YES
>
> PTC_post_tran_cust_id
>
> ANY
>
> YES
>
> PTC_source_node_name
>
> ANY
>
> YES
>
> PTC_card_seq_nr
>
> ANY
>
> YES
>
> PTC_expiry_date
>
> ANY
>
> YES
>
> PTC_service_restriction_code
>
> ANY
>
> YES
>
> PTC_terminal_owner
>
> ANY
>
> YES
>
> PTC_mapped_card_acceptor_id_code
>
> ANY
>
> YES
>
> PTC_pos_terminal_type
>
> ANY
>
> YES
>
> PTC_pan_encrypted
>
> ANY
>
> YES
>
> PTSP_Account_Nr
>
> ANY
>
> YES
>
> PTSP_Code
>
> ANY
>
> YES
>
> account_PTSP_Code
>
> ANY
>
> YES
>
> PTSP_Name
>
> ANY
>
> YES
>
> Sort_Code
>
> ANY
>
> YES
>
> rdm_amt
>
> ANY
>
> YES
>
> Reward_Code
>
> ANY
>
> YES
>
> Reward_discount
>
> ANY
>
> YES
>
> ptsp_terminal_id
>
> ANY
>
> YES
>
> reward_terminal_id
>
> ANY
>
> YES
>
>
> txn_id
>
> ANY
>
> YES
>
> web_category_code
>
> ANY
>
> YES
>
> web_category_name
>
> ANY
>
> YES
>
> web_fee_type
>
> ANY
>
> YES
>
> web_merchant_disc
>
> ANY
>
> YES
>
> web_fee_cap
>
> ANY
>
> YES
>
> Account_Name
>
> ANY
>
> YES
>
> account_nr
>
> ANY
>
> YES
>
> Acquiring_bank
>
> ANY
>
> YES
>
> journal_amount
>
> ANY
>
> YES
>
> xls_amount
>
> ANY
>
> YES
>
> merch_cat_amount_cap
>
> ANY
>
> YES
>
> merch_cat_visa_amount_cap
>
> ANY
>
> YES
>
> reward_amount_cap
>
> ANY
>
> YES
>
> Amount_config_state
>
> ANY
>
> YES
>
> Amount_description
>
> ANY
>
> YES
>
> Authorized_Person
>
> ANY
>
> YES
>
> ACC_BANK_CODE
>
> ANY
>
> YES
>
> merch_cat_bearer
>
> ANY
>
> YES
>
> merch_cat_visa_bearer
>
> ANY
>
> YES
>
> merch_cat_visa_category_code
>
> ANY
>
> YES
>
> merch_cat_category_code
>
> ANY
>
> YES
>
> merch_cat_visa_category_name
>
> ANY
>
> YES
>
> merch_cat_category_name
>
> ANY
>
> YES
>
> credit_acc_id
>
> ANY
>
> YES
>
> credit_acc_nr_id
>
> ANY
>
> YES
>
> credit_cardholder_acc_id
>
> ANY
>
> YES
>
> credit_cardholder_acc_type
>
> ANY
>
> YES
>
> CreditAccNr_acc_id
>
> ANY
>
> YES
>
> CreditAccNr_acc_nr
>
> ANY
>
> YES
>
> CreditAccNr_acc_nr_id
>
> ANY
>
> YES
>
> CreditAccNr_se_id
>
> ANY
>
> YES
>
> CreditAccNr_state
>
> ANY
>
> YES
>
> Date_Modified
>
> ANY
>
> YES
>
> debit_acc_id
>
> ANY
>
> YES
>
> debit_acc_nr_id
>
> ANY
>
> YES
>
> debit_cardholder_acc_id
>
> ANY
>
> YES
>
> debit_cardholder_acc_type
>
> ANY
>
> YES
>
> DebitAccNr_acc_id
>
> ANY
>
> YES
>
> DebitAccNr_acc_nr
>
> ANY
>
> YES
>
> DebitAccNr_acc_nr_id
>
> ANY
>
> YES
>
> extended_trans_type
>
> ANY
>
> YES
>
> fee
>
> ANY
>
> YES
>
> Fee_amount_id
>
> ANY
>
> YES
>
> merch_cat_fee_cap
>
> ANY
>
> YES
>
> merch_cat_visa_fee_cap
>
> ANY
>
> YES
>
> reward_fee_cap
>
> ANY
>
> YES
>
> Fee_description
>
> ANY
>
> YES
>
> Fee_Discount
>
> ANY
>
> YES
>
> Fee_fee_id
>
> ANY
>
> YES
>
> fee_id
>
> ANY
>
> YES
>
> Fee_name
>
> ANY
>
> YES
>
> Fee_se_id
>
> ANY
>
> YES
>
> merch_cat_category_fee_type
>
> ANY
>
> YES
>
> merch_cat_category_visa_fee_type
>
> ANY
>
> YES
>
> merch_cat_category_merch_discount
>
> ANY
>
> YES
>
> merch_cat_category_visa_merch_discount
>
> ANY
>
> YES
>
> PT_sink_node_name
>
> ANY
>
> YES
>
> post_tran_id
>
> ANY
>
> YES
>
> Miscellaneous
>
> ANY
>
> YES
>
> terminal_owner
>
> ANY
>
> YES
>
> terminal_owner_name
>
> ANY
>
> YES
>
> REF_rpt_account_type
>
> ANY
>
> YES
>
> REF_IsPoolAccount
>
> ANY
>
> YES
>
> From_Account_Type
>
> ANY
>
> YES
>
> To_Account_Type
>
> ANY
>
> YES
>
> Tran_type_description
>
> ANY
>
> YES
>
> Region
>
> ANY
>
> YES
>
> Transaction_Status
>
> ANY
>
> YES
>
> Transaction_type_Impact
>
> ANY
>
> YES
>
> Message_Type_Desc
>
> ANY
>
> YES
>
> Response_Code_description
>
> ANY
>
> YES
>
> REF_BatchId
>
> ANY
>
> YES
>
> Beneficiary_Account
>
> ANY
>
> YES
>
> Rate
>
> ANY
>
> YES
>
> final_fee
>
> ANY
>
> YES
>
> final_fee_type
>
> ANY
>
> YES
>
> settlement_date
>
> ANY
>
> YES
>
>
> Please what can I do to improve performance of this query
>
>
Re: Apache Drill Queries Take so Long after which Webconsole dies and
I don't see results
Posted by Kunal Khatua <ku...@apache.org>.
Hi Peter
From the logs you shared, the Parquet-related messages are a warning and has nothing to do with the unresponsiveness and subsequent crash of the Drillbit.
The Zookeeper timeout indicates that either the ZK or your DBit was possibly overloaded. Since th Drillbit is reporting the timeout, the lack of a response is because of the Zookeeper. That, however, should not crash that Drillbit.
Do you have GC logs of the Drillbit? You'll need to edit the conf/drill-env.sh file to start the Drillbit JVM with additional parameters:
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -verbose:gc -Xloggc:/var/log/drill/gc.log
The fact that your Drillbit's webconsole is unresponsive makes me believe that it most likely ran out of (heap) memory. The GC log would indicate that.
Now, if that is the case, heap memory would need to be increased. However, this is assuming you are NOT running your query via the WebConsole. If you are, I'd recommend you use a JDBC tool like DBeaver or SQuirreL. With the WebConsole, the Drillbit uses the heap to collect the query's result set and would also involve the scarce webserver threads to be busy building this into a single JSON response to your HTTP based query submission. This results in those threads being unavailable to address other relatively quick HTTP requests (like your other general interactions with the WebConsole).
If you MUST use the WebConsole to submit, it would make sense to view the progress from another Drillbit's WebConsole.
In the meanwhile, you can mitigate the ZK timeout with these params:
drill.exec.zk.timeout
drill.exec.zk.retry.delay
drill.exec.zk.retry.count
Hope this helps! Let us know how things turn out.
Kunal
On 8/1/2018 3:25:40 AM, Peter Edike <pe...@interswitchgroup.com> wrote:
Hello everyone,
I am currently using apache drill to query a directory of parquet files (total size 64gb with each file averaging 6.4Gb per size). I am running the following query
select * from dfs.rj.vw_reports_201805 where unique_key in
(
select unique_key from dfs.rj.vw_reports_201805 group by unique_key having count(unique_key) > 1
)
limit 40
Now this query takes about 42 minutes after which it fails because one of the drillbits in the drillbit cluster crashes. I check the logs and I see several errors summarized as follows
20:35:18.006 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (172.35.15.129:5181,172.35.15.151:5181,172.35.15.152:5181) and timeout (5000) / elapsed (13347)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197) [curator-client-2.7.1.jar:na]
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87) [curator-client-2.7.1.jar:na]
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115) [curator-client-2.7.1.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806) [curator-framework-2.7.1.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792) [curator-framework-2.7.1.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62) [curator-framework-2.7.1.jar:na]
at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257) [curator-framework-2.7.1.jar:na]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
AND
Aug 1, 2018 7:54:18 AM WARNING: org.apache.parquet.CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb)
org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb) using format: (.+) version ((.*) )?\(build ?(.*)\)
at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:66)
at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:264)
at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:568)
at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:545)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:455)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:404)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:390)
at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:117)
at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:53)
at org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:146)
at org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:142)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1633)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:142)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:182)
at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:110)
at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:87)
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:206)
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
The Schema for the data I am querying in question is as follows
index_no
ANY
YES
bank_code
ANY
YES
trxn_category
ANY
YES
Debit_Account_Type
ANY
YES
Credit_Account_Type
ANY
YES
trxn_amount
ANY
YES
trxn_fee
ANY
YES
trxn_date
ANY
YES
Currency
ANY
YES
Late_reversal
ANY
YES
card_type
ANY
YES
terminal_type
ANY
YES
Acquirer
ANY
YES
Issuer
ANY
YES
Value_SettleAmount
ANY
YES
Value_RequestedAmount
ANY
YES
Volume
ANY
YES
unique_key
ANY
YES
source_node_name
ANY
YES
business_date
ANY
YES
PT_datetime_tran_local
ANY
YES
PTC_pan
ANY
YES
PTC_terminal_id
ANY
YES
PTC_card_acceptor_id_code
ANY
YES
PTC_card_acceptor_name_loc
ANY
YES
PT_system_trace_audit_nr
ANY
YES
PT_message_type
ANY
YES
PT_from_account_id
ANY
YES
PT_to_account_id
ANY
YES
PT_tran_type
ANY
YES
PT_rsp_code_req
ANY
YES
PT_rsp_code_rsp
ANY
YES
PT_settle_amount_req
ANY
YES
PT_settle_amount_rsp
ANY
YES
PTC_merchant_type
ANY
YES
PT_settle_amount_impact
ANY
YES
PT_settle_tran_fee_req
ANY
YES
PT_settle_tran_fee_rsp
ANY
YES
PT_auth_id_rsp
ANY
YES
PT_retrieval_reference_nr
ANY
YES
PTC_totals_group
ANY
YES
PTC_card_product
ANY
YES
PT_tran_currency_code
ANY
YES
PT_payee
ANY
YES
PT_tran_amount_req
ANY
YES
PT_tran_amount_rsp
ANY
YES
PT_extended_tran_type
ANY
YES
PT_tran_postilion_originated
ANY
YES
PT_tran_completed
ANY
YES
PT_tran_nr
ANY
YES
PT_retention_data
ANY
YES
PT_acquiring_inst_id_code
ANY
YES
PT_message_reason_code
ANY
YES
PT_sponsor_bank
ANY
YES
PT_datetime_tran_gmt
ANY
YES
PT_datetime_req
ANY
YES
PT_datetime_rsp
ANY
YES
PT_realtime_business_date
ANY
YES
PT_recon_business_date
ANY
YES
PT_from_account_type
ANY
YES
PT_to_account_type
ANY
YES
PT_tran_cash_req
ANY
YES
PT_tran_cash_rsp
ANY
YES
PT_tran_tran_fee_req
ANY
YES
PT_tran_tran_fee_rsp
ANY
YES
PT_tran_tran_fee_currency_code
ANY
YES
PT_tran_proc_fee_req
ANY
YES
PT_tran_proc_fee_rsp
ANY
YES
PT_tran_proc_fee_currency_code
ANY
YES
PT_settle_currency_code
ANY
YES
PT_pos_entry_mode
ANY
YES
PT_pos_condition_code
ANY
YES
PT_tran_reversed
ANY
YES
PT_card_verification_result
ANY
YES
PT_online_system_id
ANY
YES
PT_participant_id
ANY
YES
PT_receiving_inst_id_code
ANY
YES
PT_routing_type
ANY
YES
PT_pt_pos_card_input_mode
ANY
YES
PT_source_node_key
ANY
YES
PT_proc_online_system_id
ANY
YES
PTC_post_tran_cust_id
ANY
YES
PTC_source_node_name
ANY
YES
PTC_card_seq_nr
ANY
YES
PTC_expiry_date
ANY
YES
PTC_service_restriction_code
ANY
YES
PTC_terminal_owner
ANY
YES
PTC_mapped_card_acceptor_id_code
ANY
YES
PTC_pos_terminal_type
ANY
YES
PTC_pan_encrypted
ANY
YES
PTSP_Account_Nr
ANY
YES
PTSP_Code
ANY
YES
account_PTSP_Code
ANY
YES
PTSP_Name
ANY
YES
Sort_Code
ANY
YES
rdm_amt
ANY
YES
Reward_Code
ANY
YES
Reward_discount
ANY
YES
ptsp_terminal_id
ANY
YES
reward_terminal_id
ANY
YES
txn_id
ANY
YES
web_category_code
ANY
YES
web_category_name
ANY
YES
web_fee_type
ANY
YES
web_merchant_disc
ANY
YES
web_fee_cap
ANY
YES
Account_Name
ANY
YES
account_nr
ANY
YES
Acquiring_bank
ANY
YES
journal_amount
ANY
YES
xls_amount
ANY
YES
merch_cat_amount_cap
ANY
YES
merch_cat_visa_amount_cap
ANY
YES
reward_amount_cap
ANY
YES
Amount_config_state
ANY
YES
Amount_description
ANY
YES
Authorized_Person
ANY
YES
ACC_BANK_CODE
ANY
YES
merch_cat_bearer
ANY
YES
merch_cat_visa_bearer
ANY
YES
merch_cat_visa_category_code
ANY
YES
merch_cat_category_code
ANY
YES
merch_cat_visa_category_name
ANY
YES
merch_cat_category_name
ANY
YES
credit_acc_id
ANY
YES
credit_acc_nr_id
ANY
YES
credit_cardholder_acc_id
ANY
YES
credit_cardholder_acc_type
ANY
YES
CreditAccNr_acc_id
ANY
YES
CreditAccNr_acc_nr
ANY
YES
CreditAccNr_acc_nr_id
ANY
YES
CreditAccNr_se_id
ANY
YES
CreditAccNr_state
ANY
YES
Date_Modified
ANY
YES
debit_acc_id
ANY
YES
debit_acc_nr_id
ANY
YES
debit_cardholder_acc_id
ANY
YES
debit_cardholder_acc_type
ANY
YES
DebitAccNr_acc_id
ANY
YES
DebitAccNr_acc_nr
ANY
YES
DebitAccNr_acc_nr_id
ANY
YES
extended_trans_type
ANY
YES
fee
ANY
YES
Fee_amount_id
ANY
YES
merch_cat_fee_cap
ANY
YES
merch_cat_visa_fee_cap
ANY
YES
reward_fee_cap
ANY
YES
Fee_description
ANY
YES
Fee_Discount
ANY
YES
Fee_fee_id
ANY
YES
fee_id
ANY
YES
Fee_name
ANY
YES
Fee_se_id
ANY
YES
merch_cat_category_fee_type
ANY
YES
merch_cat_category_visa_fee_type
ANY
YES
merch_cat_category_merch_discount
ANY
YES
merch_cat_category_visa_merch_discount
ANY
YES
PT_sink_node_name
ANY
YES
post_tran_id
ANY
YES
Miscellaneous
ANY
YES
terminal_owner
ANY
YES
terminal_owner_name
ANY
YES
REF_rpt_account_type
ANY
YES
REF_IsPoolAccount
ANY
YES
From_Account_Type
ANY
YES
To_Account_Type
ANY
YES
Tran_type_description
ANY
YES
Region
ANY
YES
Transaction_Status
ANY
YES
Transaction_type_Impact
ANY
YES
Message_Type_Desc
ANY
YES
Response_Code_description
ANY
YES
REF_BatchId
ANY
YES
Beneficiary_Account
ANY
YES
Rate
ANY
YES
final_fee
ANY
YES
final_fee_type
ANY
YES
settlement_date
ANY
YES
Please what can I do to improve performance of this query
Re: Apache Drill Queries Take so Long after which Webconsole dies and
I don't see results
Posted by Charles Givre <cg...@gmail.com>.
Hi Peter,
To add to the discussion, I suspect the reason that this query is taking so long is the number of times you are asking Drill to do a “full file scan”. You might want to try using COUNT() as a windowing function instead of using the subquery.
—C
> On Aug 1, 2018, at 06:24, Peter Edike <pe...@interswitchgroup.com> wrote:
>
> Hello everyone,
>
> I am currently using apache drill to query a directory of parquet files (total size 64gb with each file averaging 6.4Gb per size). I am running the following query
>
> select * from dfs.rj.vw_reports_201805 where unique_key in
> (
> select unique_key from dfs.rj.vw_reports_201805 group by unique_key having count(unique_key) > 1
> )
> limit 40
>
> Now this query takes about 42 minutes after which it fails because one of the drillbits in the drillbit cluster crashes. I check the logs and I see several errors summarized as follows
>
> 20:35:18.006 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (172.35.15.129:5181,172.35.15.151:5181,172.35.15.152:5181) and timeout (5000) / elapsed (13347)
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
> at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197) [curator-client-2.7.1.jar:na]
> at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87) [curator-client-2.7.1.jar:na]
> at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115) [curator-client-2.7.1.jar:na]
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806) [curator-framework-2.7.1.jar:na]
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792) [curator-framework-2.7.1.jar:na]
> at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62) [curator-framework-2.7.1.jar:na]
> at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257) [curator-framework-2.7.1.jar:na]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
> AND
>
>
> Aug 1, 2018 7:54:18 AM WARNING: org.apache.parquet.CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb)
>
> org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-mr (build 6aa21f8776625b5fa6b18059cfebe7549f2e00cb) using format: (.+) version ((.*) )?\(build ?(.*)\)
>
> at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
>
> at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:66)
>
> at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:264)
>
> at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:568)
>
> at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:545)
>
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:455)
>
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:404)
>
> at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:390)
>
> at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:117)
>
> at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:53)
>
> at org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:146)
>
> at org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:142)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
>
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1633)
>
> at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:142)
>
> at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:182)
>
> at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:110)
>
> at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:87)
>
> at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:206)
>
> at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
> The Schema for the data I am querying in question is as follows
>
> index_no
>
> ANY
>
> YES
>
> bank_code
>
> ANY
>
> YES
>
> trxn_category
>
> ANY
>
> YES
>
> Debit_Account_Type
>
> ANY
>
> YES
>
> Credit_Account_Type
>
> ANY
>
> YES
>
> trxn_amount
>
> ANY
>
> YES
>
> trxn_fee
>
> ANY
>
> YES
>
> trxn_date
>
> ANY
>
> YES
>
> Currency
>
> ANY
>
> YES
>
> Late_reversal
>
> ANY
>
> YES
>
> card_type
>
> ANY
>
> YES
>
> terminal_type
>
> ANY
>
> YES
>
> Acquirer
>
> ANY
>
> YES
>
> Issuer
>
> ANY
>
> YES
>
> Value_SettleAmount
>
> ANY
>
> YES
>
> Value_RequestedAmount
>
> ANY
>
> YES
>
> Volume
>
> ANY
>
> YES
>
> unique_key
>
> ANY
>
> YES
>
> source_node_name
>
> ANY
>
> YES
>
> business_date
>
> ANY
>
> YES
>
> PT_datetime_tran_local
>
> ANY
>
> YES
>
> PTC_pan
>
> ANY
>
> YES
>
> PTC_terminal_id
>
> ANY
>
> YES
>
> PTC_card_acceptor_id_code
>
> ANY
>
> YES
>
> PTC_card_acceptor_name_loc
>
> ANY
>
> YES
>
> PT_system_trace_audit_nr
>
> ANY
>
> YES
>
> PT_message_type
>
> ANY
>
> YES
>
> PT_from_account_id
>
> ANY
>
> YES
>
> PT_to_account_id
>
> ANY
>
> YES
>
> PT_tran_type
>
> ANY
>
> YES
>
> PT_rsp_code_req
>
> ANY
>
> YES
>
> PT_rsp_code_rsp
>
> ANY
>
> YES
>
> PT_settle_amount_req
>
> ANY
>
> YES
>
> PT_settle_amount_rsp
>
> ANY
>
> YES
>
> PTC_merchant_type
>
> ANY
>
> YES
>
> PT_settle_amount_impact
>
> ANY
>
> YES
>
> PT_settle_tran_fee_req
>
> ANY
>
> YES
>
> PT_settle_tran_fee_rsp
>
> ANY
>
> YES
>
> PT_auth_id_rsp
>
> ANY
>
> YES
>
> PT_retrieval_reference_nr
>
> ANY
>
> YES
>
> PTC_totals_group
>
> ANY
>
> YES
>
> PTC_card_product
>
> ANY
>
> YES
>
> PT_tran_currency_code
>
> ANY
>
> YES
>
> PT_payee
>
> ANY
>
> YES
>
> PT_tran_amount_req
>
> ANY
>
> YES
>
> PT_tran_amount_rsp
>
> ANY
>
> YES
>
> PT_extended_tran_type
>
> ANY
>
> YES
>
> PT_tran_postilion_originated
>
> ANY
>
> YES
>
> PT_tran_completed
>
> ANY
>
> YES
>
> PT_tran_nr
>
> ANY
>
> YES
>
> PT_retention_data
>
> ANY
>
> YES
>
> PT_acquiring_inst_id_code
>
> ANY
>
> YES
>
> PT_message_reason_code
>
> ANY
>
> YES
>
> PT_sponsor_bank
>
> ANY
>
> YES
>
> PT_datetime_tran_gmt
>
> ANY
>
> YES
>
> PT_datetime_req
>
> ANY
>
> YES
>
> PT_datetime_rsp
>
> ANY
>
> YES
>
> PT_realtime_business_date
>
> ANY
>
> YES
>
> PT_recon_business_date
>
> ANY
>
> YES
>
> PT_from_account_type
>
> ANY
>
> YES
>
> PT_to_account_type
>
> ANY
>
> YES
>
> PT_tran_cash_req
>
> ANY
>
> YES
>
> PT_tran_cash_rsp
>
> ANY
>
> YES
>
> PT_tran_tran_fee_req
>
> ANY
>
> YES
>
> PT_tran_tran_fee_rsp
>
> ANY
>
> YES
>
> PT_tran_tran_fee_currency_code
>
> ANY
>
> YES
>
> PT_tran_proc_fee_req
>
> ANY
>
> YES
>
> PT_tran_proc_fee_rsp
>
> ANY
>
> YES
>
> PT_tran_proc_fee_currency_code
>
> ANY
>
> YES
>
> PT_settle_currency_code
>
> ANY
>
> YES
>
> PT_pos_entry_mode
>
> ANY
>
> YES
>
> PT_pos_condition_code
>
> ANY
>
> YES
>
> PT_tran_reversed
>
> ANY
>
> YES
>
> PT_card_verification_result
>
> ANY
>
> YES
>
> PT_online_system_id
>
> ANY
>
> YES
>
> PT_participant_id
>
> ANY
>
> YES
>
> PT_receiving_inst_id_code
>
> ANY
>
> YES
>
> PT_routing_type
>
> ANY
>
> YES
>
> PT_pt_pos_card_input_mode
>
> ANY
>
> YES
>
> PT_source_node_key
>
> ANY
>
> YES
>
> PT_proc_online_system_id
>
> ANY
>
> YES
>
> PTC_post_tran_cust_id
>
> ANY
>
> YES
>
> PTC_source_node_name
>
> ANY
>
> YES
>
> PTC_card_seq_nr
>
> ANY
>
> YES
>
> PTC_expiry_date
>
> ANY
>
> YES
>
> PTC_service_restriction_code
>
> ANY
>
> YES
>
> PTC_terminal_owner
>
> ANY
>
> YES
>
> PTC_mapped_card_acceptor_id_code
>
> ANY
>
> YES
>
> PTC_pos_terminal_type
>
> ANY
>
> YES
>
> PTC_pan_encrypted
>
> ANY
>
> YES
>
> PTSP_Account_Nr
>
> ANY
>
> YES
>
> PTSP_Code
>
> ANY
>
> YES
>
> account_PTSP_Code
>
> ANY
>
> YES
>
> PTSP_Name
>
> ANY
>
> YES
>
> Sort_Code
>
> ANY
>
> YES
>
> rdm_amt
>
> ANY
>
> YES
>
> Reward_Code
>
> ANY
>
> YES
>
> Reward_discount
>
> ANY
>
> YES
>
> ptsp_terminal_id
>
> ANY
>
> YES
>
> reward_terminal_id
>
> ANY
>
> YES
>
>
> txn_id
>
> ANY
>
> YES
>
> web_category_code
>
> ANY
>
> YES
>
> web_category_name
>
> ANY
>
> YES
>
> web_fee_type
>
> ANY
>
> YES
>
> web_merchant_disc
>
> ANY
>
> YES
>
> web_fee_cap
>
> ANY
>
> YES
>
> Account_Name
>
> ANY
>
> YES
>
> account_nr
>
> ANY
>
> YES
>
> Acquiring_bank
>
> ANY
>
> YES
>
> journal_amount
>
> ANY
>
> YES
>
> xls_amount
>
> ANY
>
> YES
>
> merch_cat_amount_cap
>
> ANY
>
> YES
>
> merch_cat_visa_amount_cap
>
> ANY
>
> YES
>
> reward_amount_cap
>
> ANY
>
> YES
>
> Amount_config_state
>
> ANY
>
> YES
>
> Amount_description
>
> ANY
>
> YES
>
> Authorized_Person
>
> ANY
>
> YES
>
> ACC_BANK_CODE
>
> ANY
>
> YES
>
> merch_cat_bearer
>
> ANY
>
> YES
>
> merch_cat_visa_bearer
>
> ANY
>
> YES
>
> merch_cat_visa_category_code
>
> ANY
>
> YES
>
> merch_cat_category_code
>
> ANY
>
> YES
>
> merch_cat_visa_category_name
>
> ANY
>
> YES
>
> merch_cat_category_name
>
> ANY
>
> YES
>
> credit_acc_id
>
> ANY
>
> YES
>
> credit_acc_nr_id
>
> ANY
>
> YES
>
> credit_cardholder_acc_id
>
> ANY
>
> YES
>
> credit_cardholder_acc_type
>
> ANY
>
> YES
>
> CreditAccNr_acc_id
>
> ANY
>
> YES
>
> CreditAccNr_acc_nr
>
> ANY
>
> YES
>
> CreditAccNr_acc_nr_id
>
> ANY
>
> YES
>
> CreditAccNr_se_id
>
> ANY
>
> YES
>
> CreditAccNr_state
>
> ANY
>
> YES
>
> Date_Modified
>
> ANY
>
> YES
>
> debit_acc_id
>
> ANY
>
> YES
>
> debit_acc_nr_id
>
> ANY
>
> YES
>
> debit_cardholder_acc_id
>
> ANY
>
> YES
>
> debit_cardholder_acc_type
>
> ANY
>
> YES
>
> DebitAccNr_acc_id
>
> ANY
>
> YES
>
> DebitAccNr_acc_nr
>
> ANY
>
> YES
>
> DebitAccNr_acc_nr_id
>
> ANY
>
> YES
>
> extended_trans_type
>
> ANY
>
> YES
>
> fee
>
> ANY
>
> YES
>
> Fee_amount_id
>
> ANY
>
> YES
>
> merch_cat_fee_cap
>
> ANY
>
> YES
>
> merch_cat_visa_fee_cap
>
> ANY
>
> YES
>
> reward_fee_cap
>
> ANY
>
> YES
>
> Fee_description
>
> ANY
>
> YES
>
> Fee_Discount
>
> ANY
>
> YES
>
> Fee_fee_id
>
> ANY
>
> YES
>
> fee_id
>
> ANY
>
> YES
>
> Fee_name
>
> ANY
>
> YES
>
> Fee_se_id
>
> ANY
>
> YES
>
> merch_cat_category_fee_type
>
> ANY
>
> YES
>
> merch_cat_category_visa_fee_type
>
> ANY
>
> YES
>
> merch_cat_category_merch_discount
>
> ANY
>
> YES
>
> merch_cat_category_visa_merch_discount
>
> ANY
>
> YES
>
> PT_sink_node_name
>
> ANY
>
> YES
>
> post_tran_id
>
> ANY
>
> YES
>
> Miscellaneous
>
> ANY
>
> YES
>
> terminal_owner
>
> ANY
>
> YES
>
> terminal_owner_name
>
> ANY
>
> YES
>
> REF_rpt_account_type
>
> ANY
>
> YES
>
> REF_IsPoolAccount
>
> ANY
>
> YES
>
> From_Account_Type
>
> ANY
>
> YES
>
> To_Account_Type
>
> ANY
>
> YES
>
> Tran_type_description
>
> ANY
>
> YES
>
> Region
>
> ANY
>
> YES
>
> Transaction_Status
>
> ANY
>
> YES
>
> Transaction_type_Impact
>
> ANY
>
> YES
>
> Message_Type_Desc
>
> ANY
>
> YES
>
> Response_Code_description
>
> ANY
>
> YES
>
> REF_BatchId
>
> ANY
>
> YES
>
> Beneficiary_Account
>
> ANY
>
> YES
>
> Rate
>
> ANY
>
> YES
>
> final_fee
>
> ANY
>
> YES
>
> final_fee_type
>
> ANY
>
> YES
>
> settlement_date
>
> ANY
>
> YES
>
>
> Please what can I do to improve performance of this query
>