Posted to issues@tez.apache.org by "TezQA (Jira)" <ji...@apache.org> on 2020/07/27 07:42:00 UTC

[jira] [Commented] (TEZ-4207) Provide approximate number of input records to be processed in UnorderedKVInput

    [ https://issues.apache.org/jira/browse/TEZ-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165507#comment-17165507 ] 

TezQA commented on TEZ-4207:
----------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:blue}0{color} | {color:blue} prototool {color} | {color:blue}  0m  0s{color} | {color:blue} prototool was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 53s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 52s{color} | {color:red} tez-runtime-library in master has 1 extant findbugs warnings. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 15s{color} | {color:orange} tez-runtime-library: The patch generated 1 new + 96 unchanged - 0 fixed = 97 total (was 96) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 45s{color} | {color:green} tez-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 46s{color} | {color:green} tez-runtime-library in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 13s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-TEZ-Build/499/artifact/out/Dockerfile |
| JIRA Issue | TEZ-4207 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008470/TEZ-4207.wip.patch |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux 9351e451687f 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 2d7c60849 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| findbugs | https://builds.apache.org/job/PreCommit-TEZ-Build/499/artifact/out/branch-findbugs-tez-runtime-library-warnings.html |
| checkstyle | https://builds.apache.org/job/PreCommit-TEZ-Build/499/artifact/out/diff-checkstyle-tez-runtime-library.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/499/testReport/ |
| Max. process+thread count | 259 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-library U: . |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/499/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> Provide approximate number of input records to be processed in UnorderedKVInput
> -------------------------------------------------------------------------------
>
>                 Key: TEZ-4207
>                 URL: https://issues.apache.org/jira/browse/TEZ-4207
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Priority: Major
>         Attachments: TEZ-4207.wip.patch
>
>
> There are cases where broadcast data is loaded into a hashtable in upstream applications (e.g., Hive). Apps tend to predict the number of entries in the hashtable diligently, but there are cases where these estimates are very difficult to make at compile time.
>  
> Tez can help in such cases by providing an "approximate number of input records" counter for the records to be processed in UnorderedKVInput. This avoids expensive rehashing when hashtable sizes are not estimated correctly. It would be good to start with the broadcast case and move on to the unordered partitioned case later.
>  
> This would help in predicting the number of entries at runtime and lead to better size estimates for the hashtable.
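
To illustrate why a runtime record estimate avoids rehashing, here is a minimal sketch of how a consumer might pre-size a hashtable from such a counter. It uses only java.util.HashMap; the class name HashTableSizing, the method names, and the approxInputRecords parameter are hypothetical and are not taken from the Tez or Hive code or from the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Tez: sizes a HashMap from an approximate record count.
final class HashTableSizing {
    // Default HashMap load factor; the map resizes once its size exceeds capacity * loadFactor.
    static final float LOAD_FACTOR = 0.75f;

    // Returns an initial capacity large enough to hold expectedEntries without a resize.
    static int initialCapacityFor(long expectedEntries) {
        long capacity = (long) Math.ceil(expectedEntries / (double) LOAD_FACTOR);
        // Clamp between HashMap's default capacity (16) and its maximum capacity (2^30).
        return (int) Math.min(Math.max(capacity, 16L), 1L << 30);
    }

    // approxInputRecords is assumed to come from a runtime counter such as the one proposed above.
    static <K, V> Map<K, V> newPresizedMap(long approxInputRecords) {
        return new HashMap<>(initialCapacityFor(approxInputRecords), LOAD_FACTOR);
    }
}
```

With an estimate of 750 records, for example, initialCapacityFor returns 1000, which HashMap rounds up to 1024 buckets internally, so inserting those 750 entries should not trigger a rehash. If the estimate is too low the map still works; it simply resizes as usual.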



--
This message was sent by Atlassian Jira
(v8.3.4#803005)