You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@tez.apache.org by "abstractdog (via GitHub)" <gi...@apache.org> on 2024/03/30 08:37:35 UTC
[PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
abstractdog opened a new pull request, #341:
URL: https://github.com/apache/tez/pull/341
(no comment)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2030729876
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 6m 2s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 9m 16s | master passed |
| +1 :green_heart: | compile | 1m 14s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 17s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 28s | master passed |
| +1 :green_heart: | javadoc | 1m 23s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 11s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 50s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 1s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 53s | the patch passed |
| +1 :green_heart: | compile | 0m 50s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 50s | the patch passed |
| +1 :green_heart: | javac | 0m 50s | the patch passed |
| +1 :green_heart: | compile | 0m 49s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 49s | the patch passed |
| +1 :green_heart: | javac | 0m 49s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 8s | The patch passed checkstyle in tez-api |
| +1 :green_heart: | checkstyle | 0m 5s | The patch passed checkstyle in tez-runtime-internals |
| +1 :green_heart: | checkstyle | 0m 8s | tez-mapreduce: The patch generated 0 new + 43 unchanged - 2 fixed = 43 total (was 45) |
| +1 :green_heart: | checkstyle | 0m 10s | The patch passed checkstyle in tez-dag |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 36s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 40s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 2m 13s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 57s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 29s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 1m 0s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 13s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 41m 17s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux a58b56a303df 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/6/testReport/ |
| Max. process+thread count | 694 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/6/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2032027449
> 🎊 **+1 overall**
>
> Vote Subsystem Runtime Comment
> +0 🆗 reexec 0m 9s Docker mode activated.
> _ Prechecks _
> +1 💚 dupname 0m 0s No case conflicting files found.
> +0 🆗 prototool 0m 0s prototool was not available.
> +1 💚 @author 0m 0s The patch does not contain any @author tags.
> +1 💚 test4tests 0m 0s The patch appears to include 5 new or modified test files.
> _ master Compile Tests _
> +0 🆗 mvndep 5m 55s Maven dependency ordering for branch
> +1 💚 mvninstall 8m 2s master passed
> +1 💚 compile 1m 13s master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
> +1 💚 compile 1m 24s master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
> +1 💚 checkstyle 1m 25s master passed
> +1 💚 javadoc 1m 19s master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
> +1 💚 javadoc 1m 10s master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
> +0 🆗 spotbugs 0m 42s Used deprecated FindBugs config; considering switching to SpotBugs.
> +1 💚 findbugs 2m 59s master passed
> _ Patch Compile Tests _
> +0 🆗 mvndep 0m 7s Maven dependency ordering for patch
> +1 💚 mvninstall 0m 53s the patch passed
> +1 💚 compile 0m 52s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
> +1 💚 cc 0m 52s the patch passed
> +1 💚 javac 0m 52s the patch passed
> +1 💚 compile 0m 53s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
> +1 💚 cc 0m 53s the patch passed
> +1 💚 javac 0m 53s the patch passed
> -0 ⚠️ checkstyle 0m 8s tez-mapreduce: The patch generated 4 new + 43 unchanged - 2 fixed = 47 total (was 45)
> +1 💚 whitespace 0m 0s The patch has no whitespace issues.
> +1 💚 javadoc 0m 37s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1
> +1 💚 javadoc 0m 40s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
> +1 💚 findbugs 2m 18s the patch passed
> _ Other Tests _
> +1 💚 unit 1m 57s tez-api in the patch passed.
> +1 💚 unit 0m 30s tez-runtime-internals in the patch passed.
> +1 💚 unit 0m 59s tez-mapreduce in the patch passed.
> +1 💚 unit 4m 46s tez-dag in the patch passed.
> +1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
> 40m 36s
> Subsystem Report/Notes
> Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/8/artifact/out/Dockerfile
> GITHUB PR #341
> JIRA Issue [TEZ-4548](https://issues.apache.org/jira/browse/TEZ-4548)
> Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool
> uname Linux c4a861b17fef 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
> Build tool maven
> Personality personality/tez.sh
> git revision master / [34bb628](https://github.com/apache/tez/commit/34bb628e3a25edf5e36e3ca1bf73f06752a43118)
> Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
> Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06
> checkstyle https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/8/artifact/out/diff-checkstyle-tez-mapreduce.txt
> Test Results https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/8/testReport/
> Max. process+thread count 573 (vs. ulimit of 5500)
> modules C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: .
> Console output https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/8/console
> versions git=2.34.1 maven=3.6.3 findbugs=3.0.1
> Powered by Apache Yetus 0.12.0 https://yetus.apache.org
> This message was automatically generated.
LOL, right, I haven't checked it in detail, I thought it was the another checkstyle warning which is usually a noise :) fixed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2031972279
> Thanx @abstractdog for the update. There are some checkstyle warnings can you take care of them, rest things LGTM
thanks!
I addressed what I could have (renaming some fields), I cannot really address [HiddenField], I consider it as a noise now
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2027990447
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 12m 14s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 48s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 7m 36s | master passed |
| +1 :green_heart: | compile | 1m 16s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 23s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 32s | master passed |
| +1 :green_heart: | javadoc | 1m 24s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 11s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 48s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 2s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 58s | the patch passed |
| +1 :green_heart: | compile | 0m 52s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 52s | the patch passed |
| +1 :green_heart: | javac | 0m 52s | the patch passed |
| +1 :green_heart: | compile | 0m 48s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 48s | the patch passed |
| +1 :green_heart: | javac | 0m 48s | the patch passed |
| -0 :warning: | checkstyle | 0m 8s | tez-api: The patch generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2) |
| -0 :warning: | checkstyle | 0m 8s | tez-mapreduce: The patch generated 3 new + 43 unchanged - 1 fixed = 46 total (was 44) |
| -0 :warning: | checkstyle | 0m 13s | tez-dag: The patch generated 1 new + 6 unchanged - 0 fixed = 7 total (was 6) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 37s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 40s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| -1 :x: | findbugs | 0m 29s | tez-runtime-internals generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 56s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 28s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 0m 59s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 13s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 34s | The patch does not generate ASF License warnings. |
| | | 51m 53s | |
| Reason | Tests |
|-------:|:------|
| FindBugs | module:tez-runtime-internals |
| | Found reliance on default encoding in org.apache.tez.common.ProtoConverters.convertRootInputDataInformationEventToProto(InputDataInformationEvent):in org.apache.tez.common.ProtoConverters.convertRootInputDataInformationEventToProto(InputDataInformationEvent): String.getBytes() At ProtoConverters.java:[line 141] |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux 6e16603385e4 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/1/artifact/out/diff-checkstyle-tez-api.txt |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/1/artifact/out/diff-checkstyle-tez-mapreduce.txt |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/1/artifact/out/diff-checkstyle-tez-dag.txt |
| findbugs | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/1/artifact/out/new-findbugs-tez-runtime-internals.html |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/1/testReport/ |
| Max. process+thread count | 613 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/1/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "aturoczy (via GitHub)" <gi...@apache.org>.
aturoczy commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1546906996
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -103,6 +113,7 @@ public Object getDeserializedUserPayload() {
public String toString() {
return "InputDataInformationEvent [sourceIndex=" + sourceIndex + ", targetIndex="
+ targetIndex + ", serializedUserPayloadExists=" + (userPayload != null)
- + ", deserializedUserPayloadExists=" + (userPayloadObject != null) + "]";
- }
+ + ", deserializedUserPayloadExists=" + (userPayloadObject != null)
+ + ", serializedPath=" + serializedPath + "]";
Review Comment:
If serializedPath is null then it would look a bit odd. I like to handle this type of scenario. But this is just styling :)
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -79,6 +79,12 @@ public static InputDataInformationEvent createWithObjectPayload(int srcIndex,
return new InputDataInformationEvent(srcIndex, userPayloadDeserialized, null);
}
+ public static InputDataInformationEvent createWithSerializedPath(int srcIndex, String serializedPath) {
+ InputDataInformationEvent event = new InputDataInformationEvent(srcIndex, null);
+ event.serializedPath = serializedPath;
Review Comment:
Won't just be easier to add a 3rd constructor for the InputDataInformationEvent? To set a property is fine, just not fail safe.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2031214469
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 11s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 49s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 8m 40s | master passed |
| +1 :green_heart: | compile | 1m 18s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 22s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 20s | master passed |
| +1 :green_heart: | javadoc | 1m 14s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 6s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 44s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 45s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 51s | the patch passed |
| +1 :green_heart: | compile | 0m 51s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 51s | the patch passed |
| +1 :green_heart: | javac | 0m 51s | the patch passed |
| +1 :green_heart: | compile | 0m 47s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 47s | the patch passed |
| +1 :green_heart: | javac | 0m 47s | the patch passed |
| -0 :warning: | checkstyle | 0m 6s | tez-api: The patch generated 1 new + 17 unchanged - 0 fixed = 18 total (was 17) |
| +1 :green_heart: | checkstyle | 0m 7s | The patch passed checkstyle in tez-runtime-internals |
| +1 :green_heart: | checkstyle | 0m 7s | tez-mapreduce: The patch generated 0 new + 43 unchanged - 2 fixed = 43 total (was 45) |
| +1 :green_heart: | checkstyle | 0m 12s | The patch passed checkstyle in tez-dag |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 37s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 37s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| -1 :x: | findbugs | 0m 38s | tez-api generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 52s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 26s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 0m 56s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 3m 57s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 30s | The patch does not generate ASF License warnings. |
| | | 39m 18s | |
| Reason | Tests |
|-------:|:------|
| FindBugs | module:tez-api |
| | Redundant nullcheck of StringBuilder.toString(), which is known to be non-null in org.apache.tez.runtime.api.events.InputDataInformationEvent.toString() Redundant null check at InputDataInformationEvent.java:is known to be non-null in org.apache.tez.runtime.api.events.InputDataInformationEvent.toString() Redundant null check at InputDataInformationEvent.java:[line 114] |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/7/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux d30fa7dc7fac 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/7/artifact/out/diff-checkstyle-tez-api.txt |
| findbugs | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/7/artifact/out/new-findbugs-tez-api.html |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/7/testReport/ |
| Max. process+thread count | 278 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/7/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547385546
##########
tez-runtime-internals/src/main/java/org/apache/tez/common/ProtoConverters.java:
##########
@@ -135,15 +138,22 @@ public static VertexManagerEvent convertVertexManagerEventFromProto(
if (event.getUserPayload() != null) {
builder.setUserPayload(ByteString.copyFrom(event.getUserPayload()));
}
+ if (event.getSerializedPath() != null) {
+ builder.setSerializedPath(ByteString.copyFrom(event.getSerializedPath().getBytes(Charsets.UTF_8)));
+ }
return builder.build();
}
- public static InputDataInformationEvent
- convertRootInputDataInformationEventFromProto(
+ public static InputDataInformationEvent convertRootInputDataInformationEventFromProto(
EventProtos.RootInputDataInformationEventProto proto) {
- InputDataInformationEvent diEvent = InputDataInformationEvent.createWithSerializedPayload(
- proto.getSourceIndex(),
- proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null);
+ ByteBuffer payload = proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null;
+ InputDataInformationEvent diEvent = null;
+ if (!proto.getSerializedPath().isEmpty()) {
Review Comment:
ack, checked and proto.hasSerializedPath() works, will fix
maybe this isEmpty craziness was the outcome of an early incorrect implementation of convertRootInputDataInformationEventToProto
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2028026524
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 8s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 1s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 12s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 7m 19s | master passed |
| +1 :green_heart: | compile | 1m 16s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 9s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 23s | master passed |
| +1 :green_heart: | javadoc | 1m 12s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 3s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 48s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 58s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 6s | the patch passed |
| +1 :green_heart: | compile | 0m 58s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 58s | the patch passed |
| +1 :green_heart: | javac | 0m 58s | the patch passed |
| +1 :green_heart: | compile | 0m 52s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 52s | the patch passed |
| +1 :green_heart: | javac | 0m 52s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 6s | The patch passed checkstyle in tez-api |
| +1 :green_heart: | checkstyle | 0m 6s | The patch passed checkstyle in tez-runtime-internals |
| +1 :green_heart: | checkstyle | 0m 7s | tez-mapreduce: The patch generated 0 new + 42 unchanged - 2 fixed = 42 total (was 44) |
| +1 :green_heart: | checkstyle | 0m 12s | The patch passed checkstyle in tez-dag |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 37s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 42s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 2m 40s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 2s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 31s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 1m 3s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 21s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 39m 7s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux 266432ac50f2 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/3/testReport/ |
| Max. process+thread count | 733 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/3/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "ayushtkn (via GitHub)" <gi...@apache.org>.
ayushtkn commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547372195
##########
tez-runtime-internals/src/main/java/org/apache/tez/common/ProtoConverters.java:
##########
@@ -135,15 +138,22 @@ public static VertexManagerEvent convertVertexManagerEventFromProto(
if (event.getUserPayload() != null) {
builder.setUserPayload(ByteString.copyFrom(event.getUserPayload()));
}
+ if (event.getSerializedPath() != null) {
+ builder.setSerializedPath(ByteString.copyFrom(event.getSerializedPath().getBytes(Charsets.UTF_8)));
+ }
return builder.build();
}
- public static InputDataInformationEvent
- convertRootInputDataInformationEventFromProto(
+ public static InputDataInformationEvent convertRootInputDataInformationEventFromProto(
EventProtos.RootInputDataInformationEventProto proto) {
- InputDataInformationEvent diEvent = InputDataInformationEvent.createWithSerializedPayload(
- proto.getSourceIndex(),
- proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null);
+ ByteBuffer payload = proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null;
+ InputDataInformationEvent diEvent = null;
+ if (!proto.getSerializedPath().isEmpty()) {
Review Comment:
should be ok if it can never reach & there is no public method which can reach here, else maybe we can have Preconditions or Assert to see we don't land up having two source of truth accidentally even in future, but I am ok
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2032062863
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 10s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 54s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 7m 40s | master passed |
| +1 :green_heart: | compile | 1m 20s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 14s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 14s | master passed |
| +1 :green_heart: | javadoc | 1m 13s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 0s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 43s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 39s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 57s | the patch passed |
| +1 :green_heart: | compile | 0m 59s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 59s | the patch passed |
| +1 :green_heart: | javac | 0m 59s | the patch passed |
| +1 :green_heart: | compile | 0m 55s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 56s | the patch passed |
| +1 :green_heart: | javac | 0m 55s | the patch passed |
| -0 :warning: | checkstyle | 0m 9s | tez-mapreduce: The patch generated 2 new + 43 unchanged - 2 fixed = 45 total (was 45) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 37s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 38s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 2m 31s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 1m 57s | tez-api in the patch failed. |
| +1 :green_heart: | unit | 0m 32s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 1m 4s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 16s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 39m 24s | |
| Reason | Tests |
|-------:|:------|
| Failed junit tests | tez.client.TestTezClient |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/9/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux c1088b3d9952 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/9/artifact/out/diff-checkstyle-tez-mapreduce.txt |
| unit | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/9/artifact/out/patch-unit-tez-api.txt |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/9/testReport/ |
| Max. process+thread count | 691 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/9/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2032248345
tests passed green
considering @ayushtkn 's LGTM an approval, will merge this soon in case of no further objections
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1546922261
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -79,6 +79,12 @@ public static InputDataInformationEvent createWithObjectPayload(int srcIndex,
return new InputDataInformationEvent(srcIndex, userPayloadDeserialized, null);
}
+ public static InputDataInformationEvent createWithSerializedPath(int srcIndex, String serializedPath) {
+ InputDataInformationEvent event = new InputDataInformationEvent(srcIndex, null);
+ event.serializedPath = serializedPath;
Review Comment:
this method makes it impossible for users to call constructors in an invalid way (like: adding payload and serialized path at the same time, which should not happen), also with having a method name that implies this behavior
what constructor are you suggesting that has the same advantages?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1546922261
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -79,6 +79,12 @@ public static InputDataInformationEvent createWithObjectPayload(int srcIndex,
return new InputDataInformationEvent(srcIndex, userPayloadDeserialized, null);
}
+ public static InputDataInformationEvent createWithSerializedPath(int srcIndex, String serializedPath) {
+ InputDataInformationEvent event = new InputDataInformationEvent(srcIndex, null);
+ event.serializedPath = serializedPath;
Review Comment:
this method makes it impossible for users to call constructors in an invalid way (like: adding payload and serialized path at the same time, which should not happen), also this has a method name that implies the expected behavior
what constructor are you suggesting that has the same advantages as createWithSerializedPath?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547193756
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -103,6 +113,7 @@ public Object getDeserializedUserPayload() {
public String toString() {
return "InputDataInformationEvent [sourceIndex=" + sourceIndex + ", targetIndex="
+ targetIndex + ", serializedUserPayloadExists=" + (userPayload != null)
- + ", deserializedUserPayloadExists=" + (userPayloadObject != null) + "]";
- }
+ + ", deserializedUserPayloadExists=" + (userPayloadObject != null)
+ + ", serializedPath=" + serializedPath + "]";
Review Comment:
ack, I fixed this in https://github.com/apache/tez/pull/341/commits/40694b77b1ff7d49e25ed7ed8abda4f0d6195367
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547349710
##########
tez-runtime-internals/src/main/java/org/apache/tez/common/ProtoConverters.java:
##########
@@ -135,15 +138,22 @@ public static VertexManagerEvent convertVertexManagerEventFromProto(
if (event.getUserPayload() != null) {
builder.setUserPayload(ByteString.copyFrom(event.getUserPayload()));
}
+ if (event.getSerializedPath() != null) {
+ builder.setSerializedPath(ByteString.copyFrom(event.getSerializedPath().getBytes(Charsets.UTF_8)));
+ }
return builder.build();
}
- public static InputDataInformationEvent
- convertRootInputDataInformationEventFromProto(
+ public static InputDataInformationEvent convertRootInputDataInformationEventFromProto(
EventProtos.RootInputDataInformationEventProto proto) {
- InputDataInformationEvent diEvent = InputDataInformationEvent.createWithSerializedPayload(
- proto.getSourceIndex(),
- proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null);
+ ByteBuffer payload = proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null;
+ InputDataInformationEvent diEvent = null;
+ if (!proto.getSerializedPath().isEmpty()) {
Review Comment:
idea was: if there is serializedPath, we just use it and don't care about payload, we should have one or another, not both
I think the static InputDataInformationEvent creator methods take care of not having both
is it fine?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547357268
##########
tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java:
##########
@@ -56,9 +62,12 @@ public class TestMRInputHelpers {
private static Path oldSplitsDir;
private static Path newSplitsDir;
- private static String TEST_ROOT_DIR = "target"
+ private static final String TEST_ROOT_DIR = "target"
+ Path.SEPARATOR + TestMRHelpers.class.getName() + "-tmpDir";
+ private static final Path LOCAL_TEST_ROOT_DIR = new Path("target"
+ + Path.SEPARATOR + TestMRHelpers.class.getName() + "-localtmpDir");
+
Review Comment:
ack, will do
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2028026525
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 47s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 7m 25s | master passed |
| +1 :green_heart: | compile | 1m 16s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 10s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 22s | master passed |
| +1 :green_heart: | javadoc | 1m 12s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 2s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 46s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 59s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 7s | the patch passed |
| +1 :green_heart: | compile | 0m 58s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 58s | the patch passed |
| +1 :green_heart: | javac | 0m 58s | the patch passed |
| +1 :green_heart: | compile | 0m 53s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 53s | the patch passed |
| +1 :green_heart: | javac | 0m 53s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 7s | The patch passed checkstyle in tez-api |
| +1 :green_heart: | checkstyle | 0m 6s | The patch passed checkstyle in tez-runtime-internals |
| +1 :green_heart: | checkstyle | 0m 7s | tez-mapreduce: The patch generated 0 new + 42 unchanged - 2 fixed = 42 total (was 44) |
| +1 :green_heart: | checkstyle | 0m 13s | The patch passed checkstyle in tez-dag |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 37s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 43s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 2m 38s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 59s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 30s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 1m 4s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 21s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 39m 46s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux 8eba468c023a 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/2/testReport/ |
| Max. process+thread count | 322 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/2/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2028888531
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 58s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 7m 42s | master passed |
| +1 :green_heart: | compile | 1m 17s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 18s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 34s | master passed |
| +1 :green_heart: | javadoc | 1m 19s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 12s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 44s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 52s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 56s | the patch passed |
| +1 :green_heart: | compile | 0m 54s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 54s | the patch passed |
| +1 :green_heart: | javac | 0m 54s | the patch passed |
| +1 :green_heart: | compile | 0m 47s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 47s | the patch passed |
| +1 :green_heart: | javac | 0m 47s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 7s | The patch passed checkstyle in tez-api |
| +1 :green_heart: | checkstyle | 0m 6s | The patch passed checkstyle in tez-runtime-internals |
| +1 :green_heart: | checkstyle | 0m 8s | tez-mapreduce: The patch generated 0 new + 42 unchanged - 2 fixed = 42 total (was 44) |
| +1 :green_heart: | checkstyle | 0m 11s | The patch passed checkstyle in tez-dag |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 35s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 42s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 2m 14s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 58s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 30s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 0m 59s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 16s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 34s | The patch does not generate ASF License warnings. |
| | | 39m 46s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/5/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux 2e13a876daba 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/5/testReport/ |
| Max. process+thread count | 588 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/5/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547308955
##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/MRInputHelpers.java:
##########
@@ -889,4 +895,29 @@ public static int getDagAttemptNumber(Configuration conf) {
return getIntProperty(conf, MRInput.TEZ_MAPREDUCE_DAG_ATTEMPT_NUMBER);
}
+ public static MRSplitProto getProto(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ return Strings.isNullOrEmpty(initEvent.getSerializedPath()) ? readProtoFromPayload(initEvent)
+ : readProtoFromFs(initEvent, jobConf);
+ }
+
+ private static MRSplitProto readProtoFromFs(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ String serializedPath = initEvent.getSerializedPath();
+ Path filePath = new Path(serializedPath);
+ LOG.info("Reading InputDataInformationEvent from path: {}", filePath);
+
+ MRSplitProto splitProto = null;
+ FileSystem fs = FileSystem.get(filePath.toUri(), jobConf);
Review Comment:
ack, let me do
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2032236638
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 26m 35s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 6m 13s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 12m 44s | master passed |
| +1 :green_heart: | compile | 2m 21s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 2m 6s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 2m 9s | master passed |
| +1 :green_heart: | javadoc | 2m 2s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 46s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 1m 18s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 4m 26s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 24s | the patch passed |
| +1 :green_heart: | compile | 1m 33s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 1m 33s | the patch passed |
| +1 :green_heart: | javac | 1m 33s | the patch passed |
| +1 :green_heart: | compile | 1m 23s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 1m 23s | the patch passed |
| +1 :green_heart: | javac | 1m 23s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 12s | The patch passed checkstyle in tez-api |
| +1 :green_heart: | checkstyle | 0m 10s | The patch passed checkstyle in tez-runtime-internals |
| +1 :green_heart: | checkstyle | 0m 11s | tez-mapreduce: The patch generated 0 new + 43 unchanged - 2 fixed = 43 total (was 45) |
| +1 :green_heart: | checkstyle | 0m 20s | The patch passed checkstyle in tez-dag |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 1m 0s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 59s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 3m 45s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 17s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 37s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 1m 22s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 56s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 40s | The patch does not generate ASF License warnings. |
| | | 83m 13s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/11/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux 14c63bf90eaa 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/11/testReport/ |
| Max. process+thread count | 490 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/11/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2032030599
> > I cannot address [HiddenField], I consider it as noise now
>
> @abstractdog that HiddenField should be fixable, it is telling you have localFs already defined above & you are redefining the same variable name in the test method, rather than using the class variable
>
> ```
> diff --git a/tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java b/tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java
> index c97158d14..a7501e8ae 100644
> --- a/tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java
> +++ b/tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java
> @@ -217,7 +217,6 @@ public class TestMRInputHelpers {
> public void testInputEventSerializedPath() throws IOException {
> MRSplitProto proto = MRSplitProto.newBuilder().setSplitBytes(ByteString.copyFrom("splits".getBytes())).build();
>
> - FileSystem localFs = FileSystem.getLocal(conf);
> Path splitsDir = localFs.resolvePath(localTestRootDir);
>
> Path serializedPath = new Path(splitsDir + Path.SEPARATOR + "splitpayload");
> @@ -282,7 +281,6 @@ public class TestMRInputHelpers {
>
> @Test(timeout = 5000)
> public void testInputSplitLocalResourceCreationWithDifferentFS() throws Exception {
> - FileSystem localFs = FileSystem.getLocal(conf);
> Path splitsDir = localFs.resolvePath(localTestRootDir);
>
> DataSourceDescriptor dataSource = generateDataSourceDescriptorMapRed(splitsDir);
> ```
>
> Just remove that line `FileSystem localFs = FileSystem.getLocal(conf);` line and it should work
LOL, right, I haven't checked it in detail, I thought it was the another checkstyle warning which is usually a noise :) fixed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2028212983
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 1s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 54s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 7m 31s | master passed |
| +1 :green_heart: | compile | 1m 19s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 18s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 29s | master passed |
| +1 :green_heart: | javadoc | 1m 17s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 11s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 46s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 52s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 57s | the patch passed |
| +1 :green_heart: | compile | 0m 53s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 53s | the patch passed |
| +1 :green_heart: | javac | 0m 53s | the patch passed |
| +1 :green_heart: | compile | 0m 49s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 49s | the patch passed |
| +1 :green_heart: | javac | 0m 49s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 7s | The patch passed checkstyle in tez-api |
| +1 :green_heart: | checkstyle | 0m 6s | The patch passed checkstyle in tez-runtime-internals |
| +1 :green_heart: | checkstyle | 0m 8s | tez-mapreduce: The patch generated 0 new + 42 unchanged - 2 fixed = 42 total (was 44) |
| +1 :green_heart: | checkstyle | 0m 12s | The patch passed checkstyle in tez-dag |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 38s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 42s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 2m 20s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 55s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 29s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 0m 59s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 14s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 39m 32s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux 56f24c2d90b7 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/4/testReport/ |
| Max. process+thread count | 373 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/4/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1546918816
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -103,6 +113,7 @@ public Object getDeserializedUserPayload() {
public String toString() {
return "InputDataInformationEvent [sourceIndex=" + sourceIndex + ", targetIndex="
+ targetIndex + ", serializedUserPayloadExists=" + (userPayload != null)
- + ", deserializedUserPayloadExists=" + (userPayloadObject != null) + "]";
- }
+ + ", deserializedUserPayloadExists=" + (userPayloadObject != null)
+ + ", serializedPath=" + serializedPath + "]";
Review Comment:
I can add a null check like:
```
+ serializedPath != null ? (", serializedPath=" + serializedPath) : "" + "]";
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547308574
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -103,6 +113,7 @@ public Object getDeserializedUserPayload() {
public String toString() {
return "InputDataInformationEvent [sourceIndex=" + sourceIndex + ", targetIndex="
+ targetIndex + ", serializedUserPayloadExists=" + (userPayload != null)
- + ", deserializedUserPayloadExists=" + (userPayloadObject != null) + "]";
- }
+ + ", deserializedUserPayloadExists=" + (userPayloadObject != null)
+ + serializedPath != null ? (", serializedPath=" + serializedPath) : "" + "]";
Review Comment:
agreed, it's time to refactor this
##########
tez-api/src/main/java/org/apache/tez/runtime/api/InputInitializerContext.java:
##########
@@ -87,7 +87,12 @@ public interface InputInitializerContext {
* @return Resource
*/
Resource getVertexTaskResource();
-
+
+ /**
+ * Get the vertex id as integer that belongs to this input.
+ */
+ int getVertexId();
Review Comment:
in hive's splitgenerator:
https://github.com/apache/hive/pull/5174/files#diff-e4b863c252f3407ad05ccb56548e9f8384e0af33ce4bf6841a0ce11b173bf00dR197
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog merged PR #341:
URL: https://github.com/apache/tez/pull/341
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1546922261
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -79,6 +79,12 @@ public static InputDataInformationEvent createWithObjectPayload(int srcIndex,
return new InputDataInformationEvent(srcIndex, userPayloadDeserialized, null);
}
+ public static InputDataInformationEvent createWithSerializedPath(int srcIndex, String serializedPath) {
+ InputDataInformationEvent event = new InputDataInformationEvent(srcIndex, null);
+ event.serializedPath = serializedPath;
Review Comment:
this method makes it impossible for users to call constructors in an invalid way (like: adding payload and serialized path at the same time, which should not happen), also this has a method name that implies the expected behavior
what constructor are you suggesting that has the same advantages?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547322934
##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/MRInputHelpers.java:
##########
@@ -889,4 +895,29 @@ public static int getDagAttemptNumber(Configuration conf) {
return getIntProperty(conf, MRInput.TEZ_MAPREDUCE_DAG_ATTEMPT_NUMBER);
}
+ public static MRSplitProto getProto(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ return Strings.isNullOrEmpty(initEvent.getSerializedPath()) ? readProtoFromPayload(initEvent)
+ : readProtoFromFs(initEvent, jobConf);
+ }
+
+ private static MRSplitProto readProtoFromFs(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ String serializedPath = initEvent.getSerializedPath();
+ Path filePath = new Path(serializedPath);
+ LOG.info("Reading InputDataInformationEvent from path: {}", filePath);
+
+ MRSplitProto splitProto = null;
+ FileSystem fs = FileSystem.get(filePath.toUri(), jobConf);
+
+ try (FSDataInputStream in = fs.open(filePath)) {
+ splitProto = MRSplitProto.parseFrom(in);
+ fs.delete(filePath, false);
Review Comment:
yeah, I think under normal circumstances everything should work: parse + delete
any of them failing is fine to lead to a task attempt failure (and eventually rerun)
regarding leftover files: it's okay, current hive patch uses the tez scratch dir, which is supposed to be cleaned up by hive as well
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "aturoczy (via GitHub)" <gi...@apache.org>.
aturoczy commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547289396
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -79,6 +79,12 @@ public static InputDataInformationEvent createWithObjectPayload(int srcIndex,
return new InputDataInformationEvent(srcIndex, userPayloadDeserialized, null);
}
+ public static InputDataInformationEvent createWithSerializedPath(int srcIndex, String serializedPath) {
+ InputDataInformationEvent event = new InputDataInformationEvent(srcIndex, null);
+ event.serializedPath = serializedPath;
Review Comment:
ack
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "ayushtkn (via GitHub)" <gi...@apache.org>.
ayushtkn commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547274765
##########
tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java:
##########
@@ -56,9 +62,12 @@ public class TestMRInputHelpers {
private static Path oldSplitsDir;
private static Path newSplitsDir;
- private static String TEST_ROOT_DIR = "target"
+ private static final String TEST_ROOT_DIR = "target"
+ Path.SEPARATOR + TestMRHelpers.class.getName() + "-tmpDir";
+ private static final Path LOCAL_TEST_ROOT_DIR = new Path("target"
+ + Path.SEPARATOR + TestMRHelpers.class.getName() + "-localtmpDir");
+
Review Comment:
this is creating a path & then resolving it to LocalFs, maybe just doing a ``Files.createTempDirectory`` in the test might have saved you from this multiple..
##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/MRInputHelpers.java:
##########
@@ -889,4 +895,29 @@ public static int getDagAttemptNumber(Configuration conf) {
return getIntProperty(conf, MRInput.TEZ_MAPREDUCE_DAG_ATTEMPT_NUMBER);
}
+ public static MRSplitProto getProto(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ return Strings.isNullOrEmpty(initEvent.getSerializedPath()) ? readProtoFromPayload(initEvent)
+ : readProtoFromFs(initEvent, jobConf);
+ }
+
+ private static MRSplitProto readProtoFromFs(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ String serializedPath = initEvent.getSerializedPath();
+ Path filePath = new Path(serializedPath);
+ LOG.info("Reading InputDataInformationEvent from path: {}", filePath);
+
+ MRSplitProto splitProto = null;
+ FileSystem fs = FileSystem.get(filePath.toUri(), jobConf);
+
+ try (FSDataInputStream in = fs.open(filePath)) {
+ splitProto = MRSplitProto.parseFrom(in);
+ fs.delete(filePath, false);
+ }
+ return splitProto;
+ }
+
+ private static MRSplitProto readProtoFromPayload(InputDataInformationEvent initEvent) throws IOException {
+ ByteBuffer payload = initEvent.getUserPayload();
+ LOG.info("Reading InputDataInformationEvent from payload, size: {} bytes}", payload.limit());
Review Comment:
``initEvent.getUserPayload()`` can return `null` as well, there is a logic inside that, if that returns `null`, this log line will shoot a NPE., maybe just put `payload` in the log & if it is not null, the `ByteBuffer` `toString()` shall print the limit along with other details
##########
tez-api/src/main/java/org/apache/tez/runtime/api/InputInitializerContext.java:
##########
@@ -87,7 +87,12 @@ public interface InputInitializerContext {
* @return Resource
*/
Resource getVertexTaskResource();
-
+
+ /**
+ * Get the vertex id as integer that belongs to this input.
+ */
+ int getVertexId();
Review Comment:
Where is this being used?
##########
tez-runtime-internals/src/main/java/org/apache/tez/common/ProtoConverters.java:
##########
@@ -135,15 +138,22 @@ public static VertexManagerEvent convertVertexManagerEventFromProto(
if (event.getUserPayload() != null) {
builder.setUserPayload(ByteString.copyFrom(event.getUserPayload()));
}
+ if (event.getSerializedPath() != null) {
+ builder.setSerializedPath(ByteString.copyFrom(event.getSerializedPath().getBytes(Charsets.UTF_8)));
+ }
return builder.build();
}
- public static InputDataInformationEvent
- convertRootInputDataInformationEventFromProto(
+ public static InputDataInformationEvent convertRootInputDataInformationEventFromProto(
EventProtos.RootInputDataInformationEventProto proto) {
- InputDataInformationEvent diEvent = InputDataInformationEvent.createWithSerializedPayload(
- proto.getSourceIndex(),
- proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null);
+ ByteBuffer payload = proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null;
+ InputDataInformationEvent diEvent = null;
+ if (!proto.getSerializedPath().isEmpty()) {
Review Comment:
Should there be a ``proto.hasSerializedPath()`` check?
##########
tez-runtime-internals/src/main/java/org/apache/tez/common/ProtoConverters.java:
##########
@@ -135,15 +138,22 @@ public static VertexManagerEvent convertVertexManagerEventFromProto(
if (event.getUserPayload() != null) {
builder.setUserPayload(ByteString.copyFrom(event.getUserPayload()));
}
+ if (event.getSerializedPath() != null) {
+ builder.setSerializedPath(ByteString.copyFrom(event.getSerializedPath().getBytes(Charsets.UTF_8)));
+ }
return builder.build();
}
- public static InputDataInformationEvent
- convertRootInputDataInformationEventFromProto(
+ public static InputDataInformationEvent convertRootInputDataInformationEventFromProto(
EventProtos.RootInputDataInformationEventProto proto) {
- InputDataInformationEvent diEvent = InputDataInformationEvent.createWithSerializedPayload(
- proto.getSourceIndex(),
- proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null);
+ ByteBuffer payload = proto.hasUserPayload() ? proto.getUserPayload().asReadOnlyByteBuffer() : null;
+ InputDataInformationEvent diEvent = null;
+ if (!proto.getSerializedPath().isEmpty()) {
Review Comment:
At this point, there maybe a possibility that payload wasn't `null` as well, but we ignored that because there was a `serializedPath`?
Should we throw out in that case?
##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/MRInputHelpers.java:
##########
@@ -889,4 +895,29 @@ public static int getDagAttemptNumber(Configuration conf) {
return getIntProperty(conf, MRInput.TEZ_MAPREDUCE_DAG_ATTEMPT_NUMBER);
}
+ public static MRSplitProto getProto(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ return Strings.isNullOrEmpty(initEvent.getSerializedPath()) ? readProtoFromPayload(initEvent)
+ : readProtoFromFs(initEvent, jobConf);
+ }
+
+ private static MRSplitProto readProtoFromFs(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ String serializedPath = initEvent.getSerializedPath();
+ Path filePath = new Path(serializedPath);
+ LOG.info("Reading InputDataInformationEvent from path: {}", filePath);
+
+ MRSplitProto splitProto = null;
+ FileSystem fs = FileSystem.get(filePath.toUri(), jobConf);
Review Comment:
Change to
```
FileSystem fs = filePath.getFileSystem(jobConf);
```
##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/MRInputHelpers.java:
##########
@@ -889,4 +895,29 @@ public static int getDagAttemptNumber(Configuration conf) {
return getIntProperty(conf, MRInput.TEZ_MAPREDUCE_DAG_ATTEMPT_NUMBER);
}
+ public static MRSplitProto getProto(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ return Strings.isNullOrEmpty(initEvent.getSerializedPath()) ? readProtoFromPayload(initEvent)
+ : readProtoFromFs(initEvent, jobConf);
+ }
+
+ private static MRSplitProto readProtoFromFs(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ String serializedPath = initEvent.getSerializedPath();
+ Path filePath = new Path(serializedPath);
+ LOG.info("Reading InputDataInformationEvent from path: {}", filePath);
+
+ MRSplitProto splitProto = null;
+ FileSystem fs = FileSystem.get(filePath.toUri(), jobConf);
+
+ try (FSDataInputStream in = fs.open(filePath)) {
+ splitProto = MRSplitProto.parseFrom(in);
+ fs.delete(filePath, false);
Review Comment:
If the parsing fails, in that case the file won't be deleted. is that expected? and if your delete fails our operation will fail as well, so it that ok or we can live with that?
##########
tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java:
##########
@@ -232,30 +278,32 @@ private DataSourceDescriptor generateDataSourceDescriptorMapRed(Path inputSplits
@Test(timeout = 5000)
public void testInputSplitLocalResourceCreationWithDifferentFS() throws Exception {
FileSystem localFs = FileSystem.getLocal(conf);
- Path LOCAL_TEST_ROOT_DIR = new Path("target"
- + Path.SEPARATOR + TestMRHelpers.class.getName() + "-localtmpDir");
+ Path splitsDir = localFs.resolvePath(LOCAL_TEST_ROOT_DIR);
- try {
- localFs.mkdirs(LOCAL_TEST_ROOT_DIR);
-
- Path splitsDir = localFs.resolvePath(LOCAL_TEST_ROOT_DIR);
+ DataSourceDescriptor dataSource = generateDataSourceDescriptorMapRed(splitsDir);
- DataSourceDescriptor dataSource = generateDataSourceDescriptorMapRed(splitsDir);
-
- Map<String, LocalResource> localResources = dataSource.getAdditionalLocalFiles();
+ Map<String, LocalResource> localResources = dataSource.getAdditionalLocalFiles();
- Assert.assertEquals(2, localResources.size());
- Assert.assertTrue(localResources.containsKey(
- MRInputHelpers.JOB_SPLIT_RESOURCE_NAME));
- Assert.assertTrue(localResources.containsKey(
- MRInputHelpers.JOB_SPLIT_METAINFO_RESOURCE_NAME));
+ Assert.assertEquals(2, localResources.size());
+ Assert.assertTrue(localResources.containsKey(
+ MRInputHelpers.JOB_SPLIT_RESOURCE_NAME));
+ Assert.assertTrue(localResources.containsKey(
+ MRInputHelpers.JOB_SPLIT_METAINFO_RESOURCE_NAME));
- for (LocalResource lr : localResources.values()) {
- Assert.assertFalse(lr.getResource().getScheme().contains(remoteFs.getScheme()));
- }
- } finally {
- localFs.delete(LOCAL_TEST_ROOT_DIR, true);
+ for (LocalResource lr : localResources.values()) {
+ Assert.assertFalse(lr.getResource().getScheme().contains(remoteFs.getScheme()));
}
}
+ @Before
+ public void before() throws IOException {
+ FileSystem localFs = FileSystem.getLocal(conf);
Review Comment:
We are getting the `localFs` multiple times, can we store it as a class variable just like the `remoteFs`
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -103,6 +113,7 @@ public Object getDeserializedUserPayload() {
public String toString() {
return "InputDataInformationEvent [sourceIndex=" + sourceIndex + ", targetIndex="
+ targetIndex + ", serializedUserPayloadExists=" + (userPayload != null)
- + ", deserializedUserPayloadExists=" + (userPayloadObject != null) + "]";
- }
+ + ", deserializedUserPayloadExists=" + (userPayloadObject != null)
+ + serializedPath != null ? (", serializedPath=" + serializedPath) : "" + "]";
Review Comment:
This is wrong.
```
"InputDataInformationEvent [sourceIndex=" + sourceIndex + ", targetIndex="
+ targetIndex + ", serializedUserPayloadExists=" + (userPayload != null)
+ ", deserializedUserPayloadExists=" + (userPayloadObject != null)
+ serializedPath != null ? <Something>
```
This will evaluate the != operator and check if everything before != is not null, which is always true, so it will always print just
``
(", serializedPath=" + serializedPath)
``
You need a bracket here:
```
return "InputDataInformationEvent [sourceIndex=" + sourceIndex + ", targetIndex="
+ targetIndex + ", serializedUserPayloadExists=" + (userPayload != null)
+ ", deserializedUserPayloadExists=" + (userPayloadObject != null)
+ (serializedPath != null ? (", serializedPath=" + serializedPath) : "" )+ "]";
```
It is too much concat with conditions, maybe we can explore StringBuilder, that might look more redable, something like this
```
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("InputDataInformationEvent [sourceIndex=").append(sourceIndex)
.append(", targetIndex=").append(targetIndex)
.append(", serializedUserPayloadExists=").append(userPayload != null)
.append(", deserializedUserPayloadExists=").append(userPayloadObject != null);
if (serializedPath != null) {
sb.append(", serializedPath=").append(serializedPath);
}
sb.append("]");
return sb.toString();
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547346163
##########
tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java:
##########
@@ -232,30 +278,32 @@ private DataSourceDescriptor generateDataSourceDescriptorMapRed(Path inputSplits
@Test(timeout = 5000)
public void testInputSplitLocalResourceCreationWithDifferentFS() throws Exception {
FileSystem localFs = FileSystem.getLocal(conf);
- Path LOCAL_TEST_ROOT_DIR = new Path("target"
- + Path.SEPARATOR + TestMRHelpers.class.getName() + "-localtmpDir");
+ Path splitsDir = localFs.resolvePath(LOCAL_TEST_ROOT_DIR);
- try {
- localFs.mkdirs(LOCAL_TEST_ROOT_DIR);
-
- Path splitsDir = localFs.resolvePath(LOCAL_TEST_ROOT_DIR);
+ DataSourceDescriptor dataSource = generateDataSourceDescriptorMapRed(splitsDir);
- DataSourceDescriptor dataSource = generateDataSourceDescriptorMapRed(splitsDir);
-
- Map<String, LocalResource> localResources = dataSource.getAdditionalLocalFiles();
+ Map<String, LocalResource> localResources = dataSource.getAdditionalLocalFiles();
- Assert.assertEquals(2, localResources.size());
- Assert.assertTrue(localResources.containsKey(
- MRInputHelpers.JOB_SPLIT_RESOURCE_NAME));
- Assert.assertTrue(localResources.containsKey(
- MRInputHelpers.JOB_SPLIT_METAINFO_RESOURCE_NAME));
+ Assert.assertEquals(2, localResources.size());
+ Assert.assertTrue(localResources.containsKey(
+ MRInputHelpers.JOB_SPLIT_RESOURCE_NAME));
+ Assert.assertTrue(localResources.containsKey(
+ MRInputHelpers.JOB_SPLIT_METAINFO_RESOURCE_NAME));
- for (LocalResource lr : localResources.values()) {
- Assert.assertFalse(lr.getResource().getScheme().contains(remoteFs.getScheme()));
- }
- } finally {
- localFs.delete(LOCAL_TEST_ROOT_DIR, true);
+ for (LocalResource lr : localResources.values()) {
+ Assert.assertFalse(lr.getResource().getScheme().contains(remoteFs.getScheme()));
}
}
+ @Before
+ public void before() throws IOException {
+ FileSystem localFs = FileSystem.getLocal(conf);
Review Comment:
ack
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2031463310
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 55s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 8m 2s | master passed |
| +1 :green_heart: | compile | 1m 13s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 24s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 25s | master passed |
| +1 :green_heart: | javadoc | 1m 19s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 1m 10s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 42s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 59s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 53s | the patch passed |
| +1 :green_heart: | compile | 0m 52s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 52s | the patch passed |
| +1 :green_heart: | javac | 0m 52s | the patch passed |
| +1 :green_heart: | compile | 0m 53s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 53s | the patch passed |
| +1 :green_heart: | javac | 0m 53s | the patch passed |
| -0 :warning: | checkstyle | 0m 8s | tez-mapreduce: The patch generated 4 new + 43 unchanged - 2 fixed = 47 total (was 45) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 37s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 40s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 2m 18s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 57s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 30s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 0m 59s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 46s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 40m 36s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/8/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux c4a861b17fef 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/8/artifact/out/diff-checkstyle-tez-mapreduce.txt |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/8/testReport/ |
| Max. process+thread count | 573 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/8/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "ayushtkn (via GitHub)" <gi...@apache.org>.
ayushtkn commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2031990575
> I cannot address [HiddenField], I consider it as noise now
@abstractdog that HiddenField should be fixable, it is telling you have localFs already defined above & you are redefining the same variable name in the test method, rather than using the class variable
```
diff --git a/tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java b/tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java
index c97158d14..a7501e8ae 100644
--- a/tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java
+++ b/tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java
@@ -217,7 +217,6 @@ public class TestMRInputHelpers {
public void testInputEventSerializedPath() throws IOException {
MRSplitProto proto = MRSplitProto.newBuilder().setSplitBytes(ByteString.copyFrom("splits".getBytes())).build();
- FileSystem localFs = FileSystem.getLocal(conf);
Path splitsDir = localFs.resolvePath(localTestRootDir);
Path serializedPath = new Path(splitsDir + Path.SEPARATOR + "splitpayload");
@@ -282,7 +281,6 @@ public class TestMRInputHelpers {
@Test(timeout = 5000)
public void testInputSplitLocalResourceCreationWithDifferentFS() throws Exception {
- FileSystem localFs = FileSystem.getLocal(conf);
Path splitsDir = localFs.resolvePath(localTestRootDir);
DataSourceDescriptor dataSource = generateDataSourceDescriptorMapRed(splitsDir);
```
Just remove that line ``FileSystem localFs = FileSystem.getLocal(conf);`` line and it should work
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547310433
##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/MRInputHelpers.java:
##########
@@ -889,4 +895,29 @@ public static int getDagAttemptNumber(Configuration conf) {
return getIntProperty(conf, MRInput.TEZ_MAPREDUCE_DAG_ATTEMPT_NUMBER);
}
+ public static MRSplitProto getProto(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ return Strings.isNullOrEmpty(initEvent.getSerializedPath()) ? readProtoFromPayload(initEvent)
+ : readProtoFromFs(initEvent, jobConf);
+ }
+
+ private static MRSplitProto readProtoFromFs(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ String serializedPath = initEvent.getSerializedPath();
+ Path filePath = new Path(serializedPath);
+ LOG.info("Reading InputDataInformationEvent from path: {}", filePath);
+
+ MRSplitProto splitProto = null;
+ FileSystem fs = FileSystem.get(filePath.toUri(), jobConf);
+
+ try (FSDataInputStream in = fs.open(filePath)) {
+ splitProto = MRSplitProto.parseFrom(in);
+ fs.delete(filePath, false);
+ }
+ return splitProto;
+ }
+
+ private static MRSplitProto readProtoFromPayload(InputDataInformationEvent initEvent) throws IOException {
+ ByteBuffer payload = initEvent.getUserPayload();
+ LOG.info("Reading InputDataInformationEvent from payload, size: {} bytes}", payload.limit());
Review Comment:
ack, will do this below:
```
LOG.info("Reading InputDataInformationEvent from payload: {}", payload);
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2028036146
@ayushtkn : can you please take a look at this one?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1546922261
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -79,6 +79,12 @@ public static InputDataInformationEvent createWithObjectPayload(int srcIndex,
return new InputDataInformationEvent(srcIndex, userPayloadDeserialized, null);
}
+ public static InputDataInformationEvent createWithSerializedPath(int srcIndex, String serializedPath) {
+ InputDataInformationEvent event = new InputDataInformationEvent(srcIndex, null);
+ event.serializedPath = serializedPath;
Review Comment:
this method makes it impossible for users to call constructors in an invalid way (like: adding payload and serialized path at the same time), also with having a method name that implies this behavior, what constructor are you suggesting that has the same advantages?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "aturoczy (via GitHub)" <gi...@apache.org>.
aturoczy commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1546922508
##########
tez-api/src/main/java/org/apache/tez/runtime/api/events/InputDataInformationEvent.java:
##########
@@ -103,6 +113,7 @@ public Object getDeserializedUserPayload() {
public String toString() {
return "InputDataInformationEvent [sourceIndex=" + sourceIndex + ", targetIndex="
+ targetIndex + ", serializedUserPayloadExists=" + (userPayload != null)
- + ", deserializedUserPayloadExists=" + (userPayloadObject != null) + "]";
- }
+ + ", deserializedUserPayloadExists=" + (userPayloadObject != null)
+ + ", serializedPath=" + serializedPath + "]";
Review Comment:
I like it :) I know it add a bit complexity into a simple toString() but it is more elegant imho
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547310433
##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/MRInputHelpers.java:
##########
@@ -889,4 +895,29 @@ public static int getDagAttemptNumber(Configuration conf) {
return getIntProperty(conf, MRInput.TEZ_MAPREDUCE_DAG_ATTEMPT_NUMBER);
}
+ public static MRSplitProto getProto(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ return Strings.isNullOrEmpty(initEvent.getSerializedPath()) ? readProtoFromPayload(initEvent)
+ : readProtoFromFs(initEvent, jobConf);
+ }
+
+ private static MRSplitProto readProtoFromFs(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ String serializedPath = initEvent.getSerializedPath();
+ Path filePath = new Path(serializedPath);
+ LOG.info("Reading InputDataInformationEvent from path: {}", filePath);
+
+ MRSplitProto splitProto = null;
+ FileSystem fs = FileSystem.get(filePath.toUri(), jobConf);
+
+ try (FSDataInputStream in = fs.open(filePath)) {
+ splitProto = MRSplitProto.parseFrom(in);
+ fs.delete(filePath, false);
+ }
+ return splitProto;
+ }
+
+ private static MRSplitProto readProtoFromPayload(InputDataInformationEvent initEvent) throws IOException {
+ ByteBuffer payload = initEvent.getUserPayload();
+ LOG.info("Reading InputDataInformationEvent from payload, size: {} bytes}", payload.limit());
Review Comment:
ack, will do this below:
```
LOG.info("Reading InputDataInformationEvent from payload: {}}", payload);
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "aturoczy (via GitHub)" <gi...@apache.org>.
aturoczy commented on code in PR #341:
URL: https://github.com/apache/tez/pull/341#discussion_r1547311523
##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/MRInputHelpers.java:
##########
@@ -889,4 +895,29 @@ public static int getDagAttemptNumber(Configuration conf) {
return getIntProperty(conf, MRInput.TEZ_MAPREDUCE_DAG_ATTEMPT_NUMBER);
}
+ public static MRSplitProto getProto(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ return Strings.isNullOrEmpty(initEvent.getSerializedPath()) ? readProtoFromPayload(initEvent)
+ : readProtoFromFs(initEvent, jobConf);
+ }
+
+ private static MRSplitProto readProtoFromFs(InputDataInformationEvent initEvent, JobConf jobConf) throws IOException {
+ String serializedPath = initEvent.getSerializedPath();
+ Path filePath = new Path(serializedPath);
+ LOG.info("Reading InputDataInformationEvent from path: {}", filePath);
+
+ MRSplitProto splitProto = null;
+ FileSystem fs = FileSystem.get(filePath.toUri(), jobConf);
+
+ try (FSDataInputStream in = fs.open(filePath)) {
+ splitProto = MRSplitProto.parseFrom(in);
+ fs.delete(filePath, false);
Review Comment:
In my understanding yes, this is expected.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "abstractdog (via GitHub)" <gi...@apache.org>.
abstractdog commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2031381165
thanks for the comments @ayushtkn, addressed them in https://github.com/apache/tez/pull/341/commits/564949466ddbd95b00f4314e373ebcf4277ae9e2
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] TEZ-4548: InputDataInformationEvent to be read from serialized payload from filesystem [tez]
Posted by "tez-yetus (via GitHub)" <gi...@apache.org>.
tez-yetus commented on PR #341:
URL: https://github.com/apache/tez/pull/341#issuecomment-2032069693
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 9s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 5m 17s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 7m 22s | master passed |
| +1 :green_heart: | compile | 1m 17s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | compile | 1m 9s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | checkstyle | 1m 5s | master passed |
| +1 :green_heart: | javadoc | 1m 11s | master passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 58s | master passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +0 :ok: | spotbugs | 0m 44s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 2m 42s | master passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 7s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 4s | the patch passed |
| +1 :green_heart: | compile | 0m 53s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | cc | 0m 53s | the patch passed |
| +1 :green_heart: | javac | 0m 53s | the patch passed |
| +1 :green_heart: | compile | 0m 52s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | cc | 0m 52s | the patch passed |
| +1 :green_heart: | javac | 0m 52s | the patch passed |
| -0 :warning: | checkstyle | 0m 9s | tez-mapreduce: The patch generated 2 new + 43 unchanged - 2 fixed = 45 total (was 45) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | javadoc | 0m 40s | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 |
| +1 :green_heart: | javadoc | 0m 46s | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| +1 :green_heart: | findbugs | 2m 25s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 56s | tez-api in the patch passed. |
| +1 :green_heart: | unit | 0m 32s | tez-runtime-internals in the patch passed. |
| +1 :green_heart: | unit | 1m 2s | tez-mapreduce in the patch passed. |
| +1 :green_heart: | unit | 4m 15s | tez-dag in the patch passed. |
| +1 :green_heart: | asflicense | 0m 30s | The patch does not generate ASF License warnings. |
| | | 38m 5s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/10/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/341 |
| JIRA Issue | TEZ-4548 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile cc prototool |
| uname | Linux 5c39ec52f44b 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 34bb628e3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu222.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~22.04-b06 |
| checkstyle | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/10/artifact/out/diff-checkstyle-tez-mapreduce.txt |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/10/testReport/ |
| Max. process+thread count | 360 (vs. ulimit of 5500) |
| modules | C: tez-api tez-runtime-internals tez-mapreduce tez-dag U: . |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-341/10/console |
| versions | git=2.34.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@tez.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org