Posted to issues-all@impala.apache.org by "Will Berkeley (JIRA)" <ji...@apache.org> on 2018/09/18 17:49:00 UTC

[jira] [Created] (IMPALA-7586) Incorrect results when querying primary = "\"" in Kudu

Will Berkeley created IMPALA-7586:
-------------------------------------

             Summary: Incorrect results when querying primary = "\"" in Kudu
                 Key: IMPALA-7586
                 URL: https://issues.apache.org/jira/browse/IMPALA-7586
             Project: IMPALA
          Issue Type: Bug
            Reporter: Will Berkeley
         Attachments: impalakudu_pred_bug.profile

Version string from the catalogd web UI:
{noformat}
catalogd version 3.1.0-cdh6.x-SNAPSHOT RELEASE (build 8baac7f5849b6bacb02fedeb9b3fe2b2ee9450ee)
{noformat}
A reproduction script for the impala-shell:
{noformat}
create table test(name string, primary key(name) ) stored as kudu;

insert into test values ("\"");
-- Modified 1 row(s), 0 row error(s) in 4.01s

-- row found in full table scan
select * from test;
-- Fetched 1 row(s) in 0.15s

-- row not found on = predicate (pushed to kudu)
select * from test where name="\"";
-- Fetched 0 row(s) in 0.13s

-- row found when predicate cannot be pushed to kudu
select * from test where name like "\"";
-- Fetched 1 row(s) in 0.13s
{noformat}
This was originally reported as KUDU-2575. I tried to reproduce it directly against Kudu using the Python client, but got the expected result there.

From the plan and profile, Impala is pushing down the predicate, but Kudu is never actually scanned (RowsRead and TotalKuduScanRoundTrips are both 0). One possible explanation is that the Kudu client short-circuits the scan, concluding from the pushed-down predicate that it can have no results.
{noformat}
00:SCAN KUDU [default.test]
   kudu predicates: name = '"'
   mem-estimate=0B mem-reservation=0B thread-reservation=1
   tuple-ids=0 row-size=15B cardinality=unavailable
   in pipelines: 00(GETNEXT)
{noformat}
{noformat}
KUDU_SCAN_NODE (id=0)
          - AverageScannerThreadConcurrency: 0.00 (0.0)
          - InactiveTotalTime: 0ns (0)
          - KuduRemoteScanTokens: 0 (0)
          - MaterializeTupleTime(*): 0ns (0)
          - NumScannerThreadMemUnavailable: 0 (0)
          - NumScannerThreadsStarted: 1 (1)
          - PeakMemoryUsage: 24.0 KiB (24576)
          - PeakScannerThreadConcurrency: 1 (1)
          - RowBatchBytesEnqueued: 16.0 KiB (16384)
          - RowBatchQueueGetWaitTime: 0ns (0)
          - RowBatchQueuePeakMemoryUsage: 0 B (0)
          - RowBatchQueuePutWaitTime: 0ns (0)
          - RowBatchesEnqueued: 1 (1)
          - RowsRead: 0 (0)
===>  - RowsReturned: 0 (0)
          - RowsReturnedRate: 0 per second (0)
          - ScanRangesComplete: 1 (1)
          - ScannerThreadsInvoluntaryContextSwitches: 0 (0)
          - ScannerThreadsTotalWallClockTime: 0ns (0)
            - ScannerThreadsSysTime: 158.00us (158000)
            - ScannerThreadsUserTime: 0ns (0)
          - ScannerThreadsVoluntaryContextSwitches: 2 (2)
===>  - TotalKuduScanRoundTrips: 0 (0)
          - TotalTime: 1ms (1999972)
{noformat}
I also confirmed Kudu sees no scan from Impala for this query using the /scans page of the tablet servers.
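
The same check can be made from the profile text alone. Below is a minimal sketch, not part of Impala or Kudu tooling, that pulls the integer counters out of a scan node's profile lines and flags the case where no round trip to Kudu was made. The helper names (parse_counters, scan_never_reached_kudu) and the SAMPLE excerpt are illustrative assumptions. The counter names are taken from the profile above.

```python
import re

# Matches counter lines such as "          - RowsRead: 0 (0)", including the
# lines highlighted with a leading "===>". Only counters whose raw value in
# parentheses is an integer are captured; lines like "0.00 (0.0)" are skipped.
COUNTER_RE = re.compile(
    r"^\s*(?:===>)?\s*-\s*(\w+)(?:\(\*\))?:\s*.*\((-?\d+)\)\s*$"
)


def parse_counters(profile_text):
    """Return {counter_name: raw_integer_value} for the given profile lines."""
    counters = {}
    for line in profile_text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def scan_never_reached_kudu(counters):
    """True if the node reports no Kudu round trips and no rows read."""
    return (
        counters.get("TotalKuduScanRoundTrips") == 0
        and counters.get("RowsRead") == 0
    )


# Excerpt of the KUDU_SCAN_NODE counters shown above (hypothetical sample input).
SAMPLE = """\
KUDU_SCAN_NODE (id=0)
          - NumScannerThreadsStarted: 1 (1)
          - RowsRead: 0 (0)
===>  - RowsReturned: 0 (0)
          - ScannerThreadsSysTime: 158.00us (158000)
===>  - TotalKuduScanRoundTrips: 0 (0)
"""

counters = parse_counters(SAMPLE)
print(scan_never_reached_kudu(counters))
```

A scanner thread was started (NumScannerThreadsStarted is 1), yet the round-trip counter stays at 0. That combination is what suggests the scan was dropped on the client side rather than returning an empty result from a tablet server.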

Full profile attached.


