You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2018/06/02 17:13:00 UTC

[jira] [Commented] (HIVE-17896) TopNKey: Create a standalone vectorizable TopNKey operator

    [ https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499127#comment-16499127 ] 

Hive QA commented on HIVE-17896:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  5s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 28s{color} | {color:blue} common in master has 62 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 36s{color} | {color:blue} serde in master has 190 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 27s{color} | {color:blue} ql in master has 2278 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 37s{color} | {color:red} ql: The patch generated 37 new + 397 unchanged - 0 fixed = 434 total (was 397) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 44s{color} | {color:red} serde generated 1 new + 190 unchanged - 0 fixed = 191 total (was 190) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 44s{color} | {color:red} ql generated 8 new + 2278 unchanged - 0 fixed = 2286 total (was 2278) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 14s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:serde |
|  |  org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object[], ObjectInspector[], Object[], ObjectInspector[], boolean[]) negates the return value of org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object, ObjectInspector, Object, ObjectInspector)  At ObjectInspectorUtils.java:negates the return value of org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object, ObjectInspector, Object, ObjectInspector)  At ObjectInspectorUtils.java:[line 958] |
| FindBugs | module:ql |
|  |  new org.apache.hadoop.hive.ql.exec.TopNKeyOperator$KeyWrapperComparator(ObjectInspector[], ObjectInspector[], boolean[]) may expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.columnSortOrderIsDesc  At TopNKeyOperator.java:expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.columnSortOrderIsDesc  At TopNKeyOperator.java:[line 71] |
|  |  new org.apache.hadoop.hive.ql.exec.TopNKeyOperator$KeyWrapperComparator(ObjectInspector[], ObjectInspector[], boolean[]) may expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.objectInspectors1  At TopNKeyOperator.java:expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.objectInspectors1  At TopNKeyOperator.java:[line 69] |
|  |  new org.apache.hadoop.hive.ql.exec.TopNKeyOperator$KeyWrapperComparator(ObjectInspector[], ObjectInspector[], boolean[]) may expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.objectInspectors2  At TopNKeyOperator.java:expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.objectInspectors2  At TopNKeyOperator.java:[line 70] |
|  |  Class org.apache.hadoop.hive.ql.exec.vector.VectorTopNKeyOperator defines non-transient non-serializable instance field vContext  In VectorTopNKeyOperator.java:instance field vContext  In VectorTopNKeyOperator.java |
|  |  org.apache.hadoop.hive.ql.plan.TopNKeyDesc.clone() does not call super.clone()  At TopNKeyDesc.java: At TopNKeyDesc.java:[lines 108-112] |
|  |  Should org.apache.hadoop.hive.ql.plan.TopNKeyDesc$TopNKeyDescExplainVectorization be a _static_ inner class?  At TopNKeyDesc.java:inner class?  At TopNKeyDesc.java:[lines 119-127] |
|  |  org.apache.hadoop.hive.ql.plan.VectorTopNKeyDesc.getKeyExpressions() may expose internal representation by returning VectorTopNKeyDesc.keyExpressions  At VectorTopNKeyDesc.java:by returning VectorTopNKeyDesc.keyExpressions  At VectorTopNKeyDesc.java:[line 33] |
|  |  org.apache.hadoop.hive.ql.plan.VectorTopNKeyDesc.setKeyExpressions(VectorExpression[]) may expose internal representation by storing an externally mutable object into VectorTopNKeyDesc.keyExpressions  At VectorTopNKeyDesc.java:by storing an externally mutable object into VectorTopNKeyDesc.keyExpressions  At VectorTopNKeyDesc.java:[line 37] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-11445/dev-support/hive-personality.sh |
| git revision | master / 4463c2b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-11445/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-11445/yetus/new-findbugs-serde.html |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-11445/yetus/new-findbugs-ql.html |
| modules | C: common serde itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-11445/yetus.txt |
| Powered by | Apache Yetus    http://yetus.apache.org |


This message was automatically generated.



> TopNKey: Create a standalone vectorizable TopNKey operator
> ----------------------------------------------------------
>
>                 Key: HIVE-17896
>                 URL: https://issues.apache.org/jira/browse/HIVE-17896
>             Project: Hive
>          Issue Type: New Feature
>          Components: Operators
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Teddy Choi
>            Priority: Major
>         Attachments: HIVE-17896.1.patch, HIVE-17896.3.patch, HIVE-17896.4.patch, HIVE-17896.5.patch, HIVE-17896.6.patch, HIVE-17896.7.patch, HIVE-17896.8.patch, HIVE-17896.9.patch
>
>
> For TPC-DS Query27, the TopN operation is delayed by the group-by - the group-by operator buffers up all the rows before discarding the 99% of the rows in the TopN Hash within the ReduceSink Operator.
> The RS TopN operator is very restrictive as it only supports doing the filtering on the shuffle keys, but it is better to do this before breaking the vectors into rows and losing the isRepeating properties.
> Adding a TopN Key operator in the physical operator tree allows the following to happen.
> GBY->RS(Top=1)
> can become 
> TNK(1)->GBY->RS(Top=1)
> So that, the TopNKey can remove rows before they are buffered into the GBY and consume memory.
> Here's the equivalent implementation in Presto
> https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35
> Adding this as a sub-feature of GroupBy prevents further optimizations if the GBY is on keys "a,b,c" and the TopNKey is on just "a".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)