Posted to dev@hive.apache.org by "Siying Dong (JIRA)" <ji...@apache.org> on 2011/03/22 02:06:05 UTC
[jira] [Created] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
-------------------------------------------------------------------------------
Key: HIVE-2068
URL: https://issues.apache.org/jira/browse/HIVE-2068
Project: Hive
Issue Type: Improvement
Reporter: Siying Dong
Assignee: Siying Dong
Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" starts a MapReduce job whose input is the whole table or partition. The latency can be huge if the table or partition is big. We could reduce the number of input files to speed up these queries.
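A rough illustration of "reduce number of input files": the sketch below is not taken from the attached patches. It assumes a hypothetical helper, LimitInputPruner.selectFiles, that is given the input file sizes and an assumed average row size (estimatedBytesPerRow, an illustrative parameter) and keeps only as many leading files as should hold LIMIT rows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hive.
final class LimitInputPruner {

    // Returns the indices of the leading files whose combined size should be
    // enough to hold `limit` rows, given an assumed average row size.
    // If the files together are too small, every index is returned.
    static List<Integer> selectFiles(List<Long> fileSizes, long limit, long estimatedBytesPerRow) {
        if (limit < 0 || estimatedBytesPerRow <= 0) {
            throw new IllegalArgumentException("limit must be >= 0 and row size must be > 0");
        }
        // Bytes we expect to need in order to read `limit` rows.
        long needed = Math.multiplyExact(limit, estimatedBytesPerRow);
        List<Integer> chosen = new ArrayList<>();
        long total = 0;
        // Take files in order until the running size covers the estimate.
        for (int i = 0; i < fileSizes.size() && total < needed; i++) {
            chosen.add(i);
            total += fileSizes.get(i);
        }
        return chosen;
    }
}
```

With file sizes of 100, 200 and 300 bytes, a limit of 10 and an assumed 20 bytes per row, the estimate is 200 bytes, so only the first two files are kept. The row-size estimate can be too optimistic, in which case fewer rows than requested would come back; a real implementation would need a fallback to the full input, which this sketch does not model.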
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016518#comment-13016518 ]
Namit Jain commented on HIVE-2068:
----------------------------------
Can you update the Review Board entry?
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020386#comment-13020386 ]
Namit Jain commented on HIVE-2068:
----------------------------------
FetchTask: return false if number of rows found.
Otherwise, it looks good.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Attachment: HIVE-2068.3.patch
The previous patch missed a file.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Attachment: HIVE-2068.4.patch
Addressing Namit's comments.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016597#comment-13016597 ]
Namit Jain commented on HIVE-2068:
----------------------------------
Siying, I don't see the new changes.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Status: Open (was: Patch Available)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017092#comment-13017092 ]
jiraposter@reviews.apache.org commented on HIVE-2068:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/540/
-----------------------------------------------------------
Review request for hive and namit jain.
Summary
-------
For HIVE-2068
This addresses bug HIVE-2068.
https://issues.apache.org/jira/browse/HIVE-2068
Diffs
-----
trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 1086466
trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1086466
trunk/conf/hive-default.xml 1086466
trunk/hwi/src/java/org/apache/hadoop/hive/hwi/HWISessionItem.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/CommandNeedRetryException.java PRE-CREATION
trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/LimitOperator.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplePruner.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LimitDesc.java 1086466
trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessor.java 1086466
trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 1086466
trunk/ql/src/test/queries/clientpositive/global_limit.q PRE-CREATION
trunk/ql/src/test/results/clientpositive/global_limit.q.out PRE-CREATION
trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java 1086466
Diff: https://reviews.apache.org/r/540/diff
Testing
-------
Added a test to the test suite.
Thanks,
Siying
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017897#comment-13017897 ]
Namit Jain commented on HIVE-2068:
----------------------------------
Can you regenerate the patch?
I am getting a lot of conflicts.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
Committed. Thanks, Siying.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Attachment: HIVE-2068.6.patch
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-2068:
---------------------------------
Component/s: Query Processor
Fix Version/s: 0.8.0
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Siying Dong
> Assignee: Siying Dong
> Fix For: 0.8.0
>
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Status: Open (was: Patch Available)
Found a problem with the most recently modified piece of code.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013774#comment-13013774 ]
Namit Jain commented on HIVE-2068:
----------------------------------
Can you add a Review Board entry?
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013874#comment-13013874 ]
Siying Dong commented on HIVE-2068:
-----------------------------------
https://reviews.apache.org/r/540/diff/
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Status: Open (was: Patch Available)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Status: Patch Available (was: Open)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Status: Open (was: Patch Available)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016535#comment-13016535 ]
Siying Dong commented on HIVE-2068:
-----------------------------------
Review Board entry updated.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Status: Open (was: Patch Available)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Attachment: HIVE-2068.2.patch
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Status: Patch Available (was: Open)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020762#comment-13020762 ]
Namit Jain commented on HIVE-2068:
----------------------------------
Can you rerun the tests?
I am getting some failures in global_limit.q.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Attachment: HIVE-2068.5.patch
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Status: Patch Available (was: Open)
Deleted the latest patch. The fetchTask return part is actually OK.
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020390#comment-13020390 ]
jiraposter@reviews.apache.org commented on HIVE-2068:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/540/
-----------------------------------------------------------
(Updated 2011-04-15 18:37:21.441402)
Review request for hive and namit jain.
Changes
-------
Fixed a small logic bug.
Summary
-------
For HIVE-2068
This addresses bug HIVE-2068.
https://issues.apache.org/jira/browse/HIVE-2068
Diffs (updated)
-----
trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 1091258
trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1091258
trunk/conf/hive-default.xml 1091258
trunk/hwi/src/java/org/apache/hadoop/hive/hwi/HWISessionItem.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/CommandNeedRetryException.java PRE-CREATION
trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/LimitOperator.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplePruner.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/LimitDesc.java 1091258
trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessor.java 1091258
trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 1091258
trunk/ql/src/test/queries/clientpositive/global_limit.q PRE-CREATION
trunk/ql/src/test/results/clientpositive/global_limit.q.out PRE-CREATION
trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java 1091258
Diff: https://reviews.apache.org/r/540/diff
Testing
-------
Added a test to the test suite.
Thanks,
Siying
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Attachment: (was: HIVE-2068.6.patch)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Status: Patch Available (was: Open)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Status: Patch Available (was: Open)
Fixed the issue. I think what Namit means is that the function should always return true (no more rows).
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Status: Patch Available (was: Open)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Status: Patch Available (was: Open)
It looks like a simple "... limit ..." query depends on the order in which the files are listed, which is not deterministic. I modified the test case to always load 3 identical files so that the results will be deterministic.
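A minimal sketch of the nondeterminism described above, written in Python rather than Hive's Java. The file names and the `first_rows` and `possible_results` helpers are hypothetical and are not taken from the patch. It assumes a plain LIMIT simply takes rows in whatever order the files happen to be listed, so distinct files can give different answers while identical files give the same answer in any order.

```python
from itertools import islice, permutations

def first_rows(files, listing_order, limit):
    """Return the first `limit` rows, reading files in the given listing order."""
    rows = (row for name in listing_order for row in files[name])
    return list(islice(rows, limit))

def possible_results(files, limit):
    """Collect every result a LIMIT query could return across all listing orders."""
    return {tuple(first_rows(files, order, limit)) for order in permutations(files)}

# Hypothetical table contents: three files holding different rows.
distinct = {"f1": ["a", "b"], "f2": ["c", "d"], "f3": ["e", "f"]}
# The same rows loaded three times under different file names.
identical = {"f1": ["a", "b"], "f2": ["a", "b"], "f3": ["a", "b"]}

print(len(possible_results(distinct, 3)))   # several possible answers
print(len(possible_results(identical, 3)))  # a single possible answer
```

With identical files the listing order no longer matters, which is why a test built this way can compare against a fixed expected output.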
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Status: Open (was: Patch Available)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Attachment: HIVE-2068.1.patch
Features are mostly finished and I did some manual tests.
I'm still running all the tests. I'm also thinking of how to add tests to cover the Driver changes with retry.
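To illustrate the general idea of running a LIMIT query on fewer input files and retrying when that is not enough, here is a rough sketch in Python rather than Hive's Java. It is not the logic in the attached patch: the `pick_input_files` and `run_limit_query` helpers, the size-based row estimate, and the `est_row_bytes` parameter are all assumptions made for illustration.

```python
def pick_input_files(file_sizes, limit, est_row_bytes):
    """Pick a prefix of files whose estimated row count covers `limit`."""
    picked, est_rows = [], 0
    for name, size in file_sizes:
        if est_rows >= limit:
            break
        picked.append(name)
        # Assumed heuristic: estimate rows from file size and average row size.
        est_rows += size // est_row_bytes
    return picked

def run_limit_query(files, limit, est_row_bytes):
    """Run on the reduced input first; retry on the full input if rows fall short."""
    sizes = [(name, sum(len(r) for r in rows)) for name, rows in files.items()]
    picked = pick_input_files(sizes, limit, est_row_bytes)
    result = [r for name in picked for r in files[name]][:limit]
    if len(result) < limit and len(picked) < len(files):
        # The estimate was too optimistic: fall back to scanning every file.
        result = [r for rows in files.values() for r in rows][:limit]
    return result, picked

# Hypothetical table: three small files of 4-byte rows.
table = {"f1": ["aaaa", "bbbb"], "f2": ["cccc"], "f3": ["dddd"]}
print(run_limit_query(table, limit=2, est_row_bytes=4))
```

The fallback matters because any estimate of rows per file can be wrong, so the reduced run has to be able to detect a shortfall and rerun on the whole input.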
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016030#comment-13016030 ]
Namit Jain commented on HIVE-2068:
----------------------------------
comments in review-board
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong updated HIVE-2068:
------------------------------
Attachment: HIVE-2068.6.patch
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Commented] (HIVE-2068) Speed up query "select xx,xx from
xxx LIMIT xxx" if no filtering or aggregation
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016604#comment-13016604 ]
Siying Dong commented on HIVE-2068:
-----------------------------------
Namit, can't you see that trunk/conf/hive-default.xml is already included in the diff on the review board?
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Status: Patch Available (was: Open)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.
[jira] [Updated] (HIVE-2068) Speed up query "select xx,xx from xxx
LIMIT xxx" if no filtering or aggregation
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-2068:
-----------------------------
Status: Open (was: Patch Available)
> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or aggregation
> -------------------------------------------------------------------------------
>
> Key: HIVE-2068
> URL: https://issues.apache.org/jira/browse/HIVE-2068
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch, HIVE-2068.5.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT xxx" will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries.