Posted to dev@tez.apache.org by Eric Goodman <Er...@microsoft.com.INVALID> on 2019/01/30 19:58:25 UTC

Scaling VertexInitializedEvent

Hi Tez devs,

The current design of VertexInitializedEvent<https://github.com/apache/tez/blob/3f2373e2b2ab3825ef50e9f19b8704265542a8b2/tez-dag/src/main/java/org/apache/tez/dag/history/events/VertexInitializedEvent.java> contains all of the InputDataInformationEvents for a particular vertex. We've had trouble scaling this for large vertices because Protobuf limits message sizes to 64 MB. Is anyone working on a more scalable solution? If not, do you have any suggestions for how to decompose this event into smaller events so that Protobuf's message size limit is never an issue?

Thanks,
Eric
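
To make the decomposition question concrete, here is a minimal sketch of one possible approach: greedily grouping already-serialized per-input payloads into batches that each stay under a byte budget, so that each batch could be written as its own smaller event. This is not Tez code. The class name EventChunker, the method chunk, and the byte-array representation of the payloads are all hypothetical, and the size accounting ignores any per-message framing overhead that Protobuf would add.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tez: splits a list of serialized
// payloads into ordered batches whose combined size fits a byte budget.
public class EventChunker {

    /**
     * Greedily groups payloads into batches whose total size is at most
     * maxBatchBytes. Input order is preserved across and within batches.
     */
    public static List<List<byte[]>> chunk(List<byte[]> payloads, int maxBatchBytes) {
        if (maxBatchBytes <= 0) {
            throw new IllegalArgumentException("maxBatchBytes must be positive");
        }
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] payload : payloads) {
            // A single payload larger than the budget cannot be batched at all.
            if (payload.length > maxBatchBytes) {
                throw new IllegalArgumentException("single payload exceeds batch limit");
            }
            // Close the current batch if adding this payload would overflow it.
            if (currentBytes + payload.length > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(payload);
            currentBytes += payload.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Ten 30-byte payloads with a 100-byte budget (toy numbers).
        List<byte[]> payloads = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            payloads.add(new byte[30]);
        }
        List<List<byte[]>> batches = chunk(payloads, 100);
        System.out.println(batches.size() + " batches");
    }
}
```

In a real setting the budget would be chosen comfortably below the 64 MB limit, and whatever reads the events back would need to reassemble the batches in order. Both of those details are assumptions, not something the current event design provides.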

Re: Scaling VertexInitializedEvent

Posted by Jonathan Eagles <je...@gmail.com>.
Eric, I'm not sure whether you received this email, as I don't see you
subscribed to the dev list. I'm re-replying to the original message
and adding two extra JIRA links:
https://issues.apache.org/jira/browse/TEZ-3914
https://issues.apache.org/jira/browse/TEZ-3784

Eric, could you post a stack trace of the error? We have had a few
bugs in the past that prevented messages over 64 MB. This is only an
artificial limit imposed by the high-level reader and writer APIs that
are used. If a low-level API is used, messages of arbitrary length (or
at least up to 2 GB, IIRC) should be possible.


On Mon, Feb 4, 2019 at 3:50 PM Jonathan Eagles <je...@gmail.com> wrote:
>
> Eric, Could you post a stack trace of the error. We have had a few
> bugs in the past that prevented messages over 64MB. This is only an
> artificial limit based on the high-level reader and writer APIs that
> are used. If a low-level api is used, messages of arbitrary length (or
> at least up to 2GB IIRC) should be possible.
