Posted to dev@hive.apache.org by "Siying Dong (JIRA)" <ji...@apache.org> on 2011/07/14 00:55:00 UTC
[jira] [Created] (HIVE-2282) Local mode needs to work well with
block sampling
Local mode needs to work well with block sampling
-------------------------------------------------
Key: HIVE-2282
URL: https://issues.apache.org/jira/browse/HIVE-2282
Project: Hive
Issue Type: Improvement
Reporter: Siying Dong
If block sampling is enabled and a large data set is sampled down to a small one, local mode should kick in.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Attachment: HIVE-2282.4.patch.txt
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt, HIVE-2282.4.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066299#comment-13066299 ]
Siying Dong commented on HIVE-2282:
-----------------------------------
+1, will commit after testing.
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066086#comment-13066086 ]
jiraposter@reviews.apache.org commented on HIVE-2282:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1132/#review1081
-----------------------------------------------------------
ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java
<https://reviews.apache.org/r/1132/#comment2210>
We need a header for licensing.
- Siying
On 2011-07-15 02:16:34, Kevin Wilfong wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1132/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-07-15 02:16:34)
bq.
bq.
bq. Review request for hive and Siying Dong.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. A query should run in local mode when block sampling is used and the sample is small enough. The size of the sample is currently being estimated, as it is done to estimate the number of reducers.
bq.
bq.
bq. This addresses bug HIVE-2282.
bq. https://issues.apache.org/jira/browse/HIVE-2282
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. ql/src/test/queries/clientpositive/sample_islocalmode_hook.q PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0
bq. ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd3de76
bq. ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/1132/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. TestCliDriver TestNegativeCliDriver, manually tested
bq.
bq.
bq. Thanks,
bq.
bq. Kevin
bq.
bq.
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065682#comment-13065682 ]
jiraposter@reviews.apache.org commented on HIVE-2282:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1132/
-----------------------------------------------------------
Review request for hive and Siying Dong.
Summary
-------
A query should run in local mode when block sampling is used and the sample is small enough. The size of the sample is currently estimated in the same way as it is when estimating the number of reducers.
This addresses bug HIVE-2282.
https://issues.apache.org/jira/browse/HIVE-2282
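As a rough illustration of the behavior described in the summary (this is not the code from the patch), the decision could be sketched as follows. The class name, the method names, the byte threshold parameter, and the idea of scaling the total input size by the sampling percentage are all assumptions made for this example.

```java
// Hypothetical sketch: none of these names come from Hive itself.
final class LocalModeDecision {
    private LocalModeDecision() {}

    /**
     * Estimates how many bytes a block sample will read, assuming the
     * sample size scales linearly with the sampling percentage.
     */
    static long estimateSampledBytes(long totalInputBytes, double samplePercent) {
        if (totalInputBytes < 0 || samplePercent < 0.0 || samplePercent > 100.0) {
            throw new IllegalArgumentException("invalid input size or percentage");
        }
        return (long) Math.ceil(totalInputBytes * (samplePercent / 100.0));
    }

    /**
     * Returns true when the estimated sample fits under the (assumed)
     * maximum number of input bytes allowed for local execution.
     */
    static boolean shouldRunLocally(long totalInputBytes, double samplePercent, long maxLocalBytes) {
        return estimateSampledBytes(totalInputBytes, samplePercent) <= maxLocalBytes;
    }
}
```

Under these assumptions, a 1 percent sample of a 10 GiB table is estimated at about 102 MiB, so it would qualify for local mode with a 128 MiB limit, while the unsampled table would not.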
Diffs
-----
ql/src/test/queries/clientpositive/sample_islocalmode_hook.q PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd3de76
ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java PRE-CREATION
Diff: https://reviews.apache.org/r/1132/diff
Testing
-------
TestCliDriver, TestNegativeCliDriver, manually tested
Thanks,
Kevin
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065681#comment-13065681 ]
Kevin Wilfong commented on HIVE-2282:
-------------------------------------
https://reviews.apache.org/r/1132/
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Status: Patch Available (was: Open)
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066196#comment-13066196 ]
jiraposter@reviews.apache.org commented on HIVE-2282:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1132/
-----------------------------------------------------------
(Updated 2011-07-15 20:48:38.625544)
Review request for hive and Siying Dong.
Changes
-------
I added comments to the estimateSampledInputSize function. The function does set the input size even when there is no sampling, but this means we do not need two separate cases everywhere we might use either an estimated or an actual input size. Instead, we can just call the function (which, thanks to a boolean flag, only does significant work the first time it runs) and the input size will be set to the appropriate value. It only estimates the input size if sampling is used.
I also added the header to VerifyIsLocalModeHook.java
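The pattern described above (one entry point that does its work only once and covers both the sampled and unsampled cases) might look roughly like the sketch below. It is a simplified stand-in, not the actual MapRedTask code: the class, its fields, and the use of a LongSupplier for the total input size are assumptions made for illustration.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of a compute-once input size estimator.
final class InputSizeEstimator {
    private final LongSupplier totalInputBytes; // assumed source of the real input size
    private final Double samplePercent;         // null when the query uses no sampling
    private boolean estimated = false;          // flag guarding the one-time work
    private long inputBytes;

    InputSizeEstimator(LongSupplier totalInputBytes, Double samplePercent) {
        this.totalInputBytes = totalInputBytes;
        this.samplePercent = samplePercent;
    }

    /**
     * Sets the input size on the first call and returns the cached value
     * afterwards. Without sampling the actual size is used; with sampling
     * the size is scaled by the sampling percentage.
     */
    long estimateInputSize() {
        if (!estimated) {
            long total = totalInputBytes.getAsLong();
            inputBytes = (samplePercent == null)
                    ? total
                    : (long) Math.ceil(total * (samplePercent / 100.0));
            estimated = true;
        }
        return inputBytes;
    }
}
```

Callers never need to check whether sampling is in use: they call estimateInputSize() and get either the actual or the estimated size, and repeated calls do not recompute it.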
Summary
-------
A query should run in local mode when block sampling is used and the sample is small enough. The size of the sample is currently estimated in the same way as it is when estimating the number of reducers.
This addresses bug HIVE-2282.
https://issues.apache.org/jira/browse/HIVE-2282
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd3de76
ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java PRE-CREATION
ql/src/test/queries/clientpositive/sample_islocalmode_hook.q PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0
Diff: https://reviews.apache.org/r/1132/diff
Testing
-------
TestCliDriver, TestNegativeCliDriver, manually tested
Thanks,
Kevin
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Resolved] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siying Dong resolved HIVE-2282.
-------------------------------
Resolution: Fixed
Committed. Thanks Kevin!
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt, HIVE-2282.4.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Attachment: (was: HIVE-2282.2.patch.txt)
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Assigned] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "John Sichi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Sichi reassigned HIVE-2282:
--------------------------------
Assignee: Kevin Wilfong
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066230#comment-13066230 ]
jiraposter@reviews.apache.org commented on HIVE-2282:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1132/
-----------------------------------------------------------
(Updated 2011-07-15 21:45:16.168124)
Review request for hive and Siying Dong.
Changes
-------
That's a good point; sorry, I misunderstood it originally.
Renamed estimateSampledInputSize to estimateInputSize.
Summary
-------
A query should run in local mode when block sampling is used and the sample is small enough. The size of the sample is currently estimated in the same way as it is when estimating the number of reducers.
This addresses bug HIVE-2282.
https://issues.apache.org/jira/browse/HIVE-2282
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd3de76
ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java PRE-CREATION
ql/src/test/queries/clientpositive/sample_islocalmode_hook.q PRE-CREATION
Diff: https://reviews.apache.org/r/1132/diff
Testing
-------
TestCliDriver, TestNegativeCliDriver, manually tested
Thanks,
Kevin
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069635#comment-13069635 ]
jiraposter@reviews.apache.org commented on HIVE-2282:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1132/
-----------------------------------------------------------
(Updated 2011-07-22 17:40:44.736466)
Review request for hive and Siying Dong.
Changes
-------
I added the q.out file for the new q file, which I had forgotten.
I also modified the test queries to select count(1) instead of selecting keys and values.
Summary
-------
A query should run in local mode when block sampling is used and the sample is small enough. The size of the sample is currently estimated in the same way as it is when estimating the number of reducers.
This addresses bug HIVE-2282.
https://issues.apache.org/jira/browse/HIVE-2282
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd3de76
ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java PRE-CREATION
ql/src/test/queries/clientpositive/sample_islocalmode_hook.q PRE-CREATION
ql/src/test/results/clientpositive/sample_islocalmode_hook.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/1132/diff
Testing
-------
TestCliDriver, TestNegativeCliDriver, manually tested
Thanks,
Kevin
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Attachment: HIVE-2282.2.patch.txt
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.2.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069610#comment-13069610 ]
Siying Dong commented on HIVE-2282:
-----------------------------------
Kevin, you forgot to add the file ql/src/test/results/clientpositive/sample_islocalmode_hook.q.out to the patch.
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-2282:
---------------------------------
Component/s: Query Processor
Fix Version/s: 0.8.0
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Fix For: 0.8.0
>
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt, HIVE-2282.4.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Attachment: HIVE-2282.3.patch.txt
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066198#comment-13066198 ]
jiraposter@reviews.apache.org commented on HIVE-2282:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1132/#review1084
-----------------------------------------------------------
I mean you can just change the function name to something like estimateInputSize().
- Siying
On 2011-07-15 20:48:38, Kevin Wilfong wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1132/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-07-15 20:48:38)
bq.
bq.
bq. Review request for hive and Siying Dong.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. A query should run in local mode when block sampling is used and the sample is small enough. The size of the sample is currently being estimated, as it is done to estimate the number of reducers.
bq.
bq.
bq. This addresses bug HIVE-2282.
bq. https://issues.apache.org/jira/browse/HIVE-2282
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd3de76
bq. ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java PRE-CREATION
bq. ql/src/test/queries/clientpositive/sample_islocalmode_hook.q PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0
bq.
bq. Diff: https://reviews.apache.org/r/1132/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. TestCliDriver TestNegativeCliDriver, manually tested
bq.
bq.
bq. Thanks,
bq.
bq. Kevin
bq.
bq.
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Attachment: HIVE-2282.1.patch.txt
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Attachment: HIVE-2282.1.patch.txt
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Work started] (HIVE-2282) Local mode needs to work well
with block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HIVE-2282 started by Kevin Wilfong.
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070865#comment-13070865 ]
Siying Dong commented on HIVE-2282:
-----------------------------------
I don't know why, but I ran the test suite twice and it failed both times. Can you rebase your code, run the whole test suite, and check whether all the tests pass? I'll try again too.
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt, HIVE-2282.4.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Status: Open (was: Patch Available)
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Updated] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Wilfong updated HIVE-2282:
--------------------------------
Attachment: (was: HIVE-2282.1.patch.txt)
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "Siying Dong (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069611#comment-13069611 ]
Siying Dong commented on HIVE-2282:
-----------------------------------
Also, a query like "select key, value from sih_src tablesample(1 percent)" does not actually generate stable results. You can use select count(1) instead, which will generate correct results.
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt, HIVE-2282.2.patch.txt, HIVE-2282.3.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.
[jira] [Commented] (HIVE-2282) Local mode needs to work well with
block sampling
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066083#comment-13066083 ]
jiraposter@reviews.apache.org commented on HIVE-2282:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1132/#review1080
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
<https://reviews.apache.org/r/1132/#comment2209>
This function name seems confusing. It looks like the input size is set even if there is no sampling, right? Also, can you add comments to this function?
Other than that, the patch looks OK.
- Siying
On 2011-07-15 02:16:34, Kevin Wilfong wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1132/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-07-15 02:16:34)
bq.
bq.
bq. Review request for hive and Siying Dong.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. A query should run in local mode when block sampling is used and the sample is small enough. The size of the sample is currently being estimated, as it is done to estimate the number of reducers.
bq.
bq.
bq. This addresses bug HIVE-2282.
bq. https://issues.apache.org/jira/browse/HIVE-2282
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. ql/src/test/queries/clientpositive/sample_islocalmode_hook.q PRE-CREATION
bq. ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 53769a0
bq. ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cd3de76
bq. ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyIsLocalModeHook.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/1132/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. TestCliDriver TestNegativeCliDriver, manually tested
bq.
bq.
bq. Thanks,
bq.
bq. Kevin
bq.
bq.
> Local mode needs to work well with block sampling
> -------------------------------------------------
>
> Key: HIVE-2282
> URL: https://issues.apache.org/jira/browse/HIVE-2282
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Kevin Wilfong
> Attachments: HIVE-2282.1.patch.txt
>
>
> Currently, if block sampling is enabled and large set of data are sampled to a small set, local mode needs to be kicked in.