Posted to dev@hive.apache.org by "Chao (JIRA)" <ji...@apache.org> on 2015/04/03 07:59:52 UTC

[jira] [Created] (HIVE-10209) FetchTask with VC may fail because of ExecMapper.done is true

Chao created HIVE-10209:
---------------------------

             Summary: FetchTask with VC may fail because of ExecMapper.done is true
                 Key: HIVE-10209
                 URL: https://issues.apache.org/jira/browse/HIVE-10209
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 1.1.0
            Reporter: Chao
            Assignee: Chao


ExecMapper.done is a static variable, which can cause problems, as in the following example:

{code}
set hive.fetch.task.conversion=minimal;
select * from src where key < 10 limit 1;
set hive.fetch.task.conversion=more;
select *, BLOCK__OFFSET_INSIDE__FILE from src where key < 10;
{code}

The second select won't return any result.

The issue is that the first select query is converted to a MapRedTask with only a mapper. When that task finishes, the limit operator causes ExecMapper.done to be set to true.

Then, when the second select query begins to execute, it calls {{FetchOperator::getRecordReader()}}, and since the query has a virtual column, an instance of {{HiveRecordReader}} is returned. The problem is that {{HiveRecordReader::doNext()}} checks ExecMapper.done. In this case, since the value is still true from the first query, it quits immediately.
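
To make the interaction concrete, here is a minimal, self-contained sketch of the failure mode. It is not Hive code: the class {{StaticDoneFlagSketch}}, its {{done}} field, and the methods {{runMapperWithLimit}} and {{fetchAll}} are hypothetical stand-ins for ExecMapper.done, the map-only task with a limit, and the record reader check described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaticDoneFlagSketch {
    // Hypothetical stand-in for ExecMapper.done: one flag shared by
    // every query that runs in the same JVM.
    static boolean done = false;

    // Simplified model of the first query: a map-only task with a limit.
    // Once the limit is reached, the shared flag is set and never cleared.
    static List<String> runMapperWithLimit(List<String> rows, int limit) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (out.size() >= limit) {
                done = true; // limit reached, no more input is needed
                break;
            }
            out.add(row);
        }
        return out;
    }

    // Simplified model of the second query: a fetch that consults the
    // same flag before reading each row, as doNext() is described to do.
    static List<String> fetchAll(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (done) {
                break; // stale flag from the previous query stops the read
            }
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("0", "4", "8");
        System.out.println(runMapperWithLimit(src, 1)); // prints [0]
        System.out.println(fetchAll(src));              // prints []
    }
}
```

The second call returns nothing even though every row qualifies, because the flag set by the first call is still true.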

In short, I think making ExecMapper.done static is a bad idea. The first query should in no way affect the second one.
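
As an illustration of the alternative, the sketch below keeps the flag in per-query state instead of a static field. This is only one possible direction under assumed names ({{PerQueryDoneFlagSketch}}, {{QueryState}}, and {{readRows}} are all hypothetical), and not necessarily how the issue should be fixed in Hive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerQueryDoneFlagSketch {
    // Hypothetical per-query state: each query execution owns its own flag.
    static final class QueryState {
        private boolean done = false;

        boolean isDone() {
            return done;
        }

        void markDone() {
            done = true;
        }
    }

    // Reads rows until the limit is reached, consulting only the state
    // object that belongs to this query.
    static List<String> readRows(List<String> rows, int limit, QueryState state) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (state.isDone()) {
                break;
            }
            out.add(row);
            if (out.size() >= limit) {
                state.markDone();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("0", "4", "8");
        // First query: limit 1, with its own state.
        System.out.println(readRows(src, 1, new QueryState()));                 // prints [0]
        // Second query: no effective limit, with fresh state.
        System.out.println(readRows(src, Integer.MAX_VALUE, new QueryState())); // prints [0, 4, 8]
    }
}
```

Because each call receives a fresh {{QueryState}}, the limit reached by the first query has no effect on the second.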



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)