Posted to issues-all@impala.apache.org by "Pooja Nilangekar (JIRA)" <ji...@apache.org> on 2018/10/30 20:42:00 UTC

[jira] [Comment Edited] (IMPALA-7363) Spurious error generated by sequence file scanner with weird scan range length

    [ https://issues.apache.org/jira/browse/IMPALA-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669233#comment-16669233 ] 

Pooja Nilangekar edited comment on IMPALA-7363 at 10/30/18 8:41 PM:
--------------------------------------------------------------------

This appears to be a non-deterministic bug: executing the same query multiple times produces different results. Two streams on the same file, at the same file_offset() and with the same bytes_left() value (i.e., at exactly the same location), return different long values. In the error case, ScannerContext::ReadVLong() returns -10434, while in the non-error case it returns 10433. However, the bytes are exactly the same in both cases: the first byte is 0x8e and the encoded value is always 10433. -The bug is somewhere in ReadWriteUtil::IsNegativeVInt(). Not sure how "return byte < -120 || (byte >= -112 && byte < 0);" can be non-deterministic. Ideally, it should always return false for byte = -114 (0x8e).-

So it looks like the firstbyte value is overwritten by subsequent calls to GetBytes, because the output buffer is owned by the stream and can be cleared to read more bytes. That would explain why the sign check can see a different byte from the one that was originally read.
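
To illustrate the suspected mechanism, here is a minimal, self-contained C++ sketch. It is not the Impala code: ToyStream, ReadVLongStale and ReadVLongCopied are hypothetical names, and the toy stream stages every read at the tail of one reused buffer purely so that the overwrite is visible. IsNegativeVInt is the expression quoted above; DecodeVIntSize follows Hadoop's WritableUtils.decodeVIntSize, which I am assuming matches what the scanner uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Sign test, as quoted in the comment above.
inline bool IsNegativeVInt(int8_t byte) {
  return byte < -120 || (byte >= -112 && byte < 0);
}

// Total encoded length including the first byte (Hadoop VInt/VLong layout).
inline int DecodeVIntSize(int8_t byte) {
  if (byte >= -112) return 1;
  if (byte < -120) return -119 - byte;
  return -111 - byte;
}

// Hypothetical stream that owns its output buffer and reuses it on every call.
class ToyStream {
 public:
  explicit ToyStream(std::vector<uint8_t> data) : data_(std::move(data)) {}

  // Returns a pointer into the internal buffer. The pointer is only
  // meaningful until the next GetBytes() call, which overwrites the buffer.
  const int8_t* GetBytes(size_t n) {
    if (n > kCap || n > data_.size() - pos_) return nullptr;
    uint8_t* dst = buf_ + (kCap - n);  // stage the bytes at the tail
    std::memcpy(dst, data_.data() + pos_, n);
    pos_ += n;
    return reinterpret_cast<const int8_t*>(dst);
  }

 private:
  static constexpr size_t kCap = 16;
  std::vector<uint8_t> data_;
  size_t pos_ = 0;
  uint8_t buf_[kCap] = {};
};

// Suspected bug pattern: dereferences firstbyte after the second GetBytes().
bool ReadVLongStale(ToyStream* s, int64_t* value) {
  const int8_t* firstbyte = s->GetBytes(1);
  if (firstbyte == nullptr) return false;
  int len = DecodeVIntSize(*firstbyte);
  if (len == 1) { *value = *firstbyte; return true; }
  const int8_t* bytes = s->GetBytes(len - 1);  // may overwrite *firstbyte
  if (bytes == nullptr) return false;
  *value = 0;
  for (int i = 0; i < len - 1; ++i) *value = (*value << 8) | (bytes[i] & 0xFF);
  if (IsNegativeVInt(*firstbyte)) *value ^= static_cast<int64_t>(-1);
  return true;
}

// Possible fix: copy the first byte by value before reading more bytes.
bool ReadVLongCopied(ToyStream* s, int64_t* value) {
  const int8_t* p = s->GetBytes(1);
  if (p == nullptr) return false;
  const int8_t first = *p;  // copy survives later buffer reuse
  int len = DecodeVIntSize(first);
  if (len == 1) { *value = first; return true; }
  const int8_t* bytes = s->GetBytes(len - 1);
  if (bytes == nullptr) return false;
  *value = 0;
  for (int i = 0; i < len - 1; ++i) *value = (*value << 8) | (bytes[i] & 0xFF);
  if (IsNegativeVInt(first)) *value ^= static_cast<int64_t>(-1);
  return true;
}
```

With the input bytes 0x8e 0x28 0xc1, ReadVLongCopied yields 10433. In this toy layout ReadVLongStale yields -10434, because the slot that held 0x8e now holds 0xc1 (-63), for which IsNegativeVInt returns true. In the real scanner, whatever happens to be in the buffer at that point would decide the sign, which would be consistent with the non-determinism.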


was (Author: poojanilangekar):
This appears to be a non-deterministic bug: executing the same query multiple times produces different results. Two streams on the same file, at the same file_offset() and with the same bytes_left() value (i.e., at exactly the same location), return different long values. In the error case, ScannerContext::ReadVLong() returns -10434, while in the non-error case it returns 10433. However, the bytes are exactly the same in both cases: the first byte is 0x8e and the encoded value is always 10433. The bug is somewhere in ReadWriteUtil::IsNegativeVInt(). Not sure how "return byte < -120 || (byte >= -112 && byte < 0);" can be non-deterministic. Ideally, it should always return false for byte = -114 (0x8e).

[~tarmstrong] Do you have any idea what might be going on here?


> Spurious error generated by sequence file scanner with weird scan range length
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-7363
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7363
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.1.0
>            Reporter: Tim Armstrong
>            Assignee: Pooja Nilangekar
>            Priority: Major
>              Labels: avro
>
> Repro on master
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ impala-shell.sh
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build cec33fa0ae75392668273d40b5a1bc4bbd7e9e2e)
> ***********************************************************************************
> Welcome to the Impala shell.
> (Impala Shell v3.1.0-SNAPSHOT (cec33fa) built on Thu Jul 26 09:50:10 PDT 2018)
> To see a summary of a query's progress that updates in real-time, run 'set
> LIVE_PROGRESS=1;'.
> ***********************************************************************************
> [localhost:21000] default> use tpch_seq_snap;
> Query: use tpch_seq_snap
> [localhost:21000] tpch_seq_snap> SET max_scan_range_length=5377;
> MAX_SCAN_RANGE_LENGTH set to 5377
> [localhost:21000] tpch_seq_snap> select count(*)
>                                > from lineitem;
> Query: select count(*)
> from lineitem
> Query submitted at: 2018-07-26 14:10:18 (Coordinator: http://tarmstrong-box:25000)
> Query progress can be monitored at: http://tarmstrong-box:25000/query_plan?query_id=e9428efe173ad2f4:84b66bdb00000000
> +----------+
> | count(*) |
> +----------+
> | 5993651  |
> +----------+
> WARNINGS: SkipText: length is negative
> Problem parsing file hdfs://localhost:20500/test-warehouse/tpch.lineitem_seq_snap/000000_0 at 36472193
> {noformat}
> Found while adding a test for IMPALA-7360



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
