You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@kafka.apache.org by Guozhang Wang <wa...@gmail.com> on 2014/02/13 02:43:53 UTC

Error Handling on New Producer

Folks,

While working on the new producer, I noticed another issue about error
handling that I would like to ask for some discussions (I know Jay said the
protocol definition is the last one, but I guess that can never be true :)

Currently we have two classes of exceptions that we can throw when sending
messages with the producer, class A is more operational related and
recoverable, such as:

1. Producer buffer full with BLOCK_ON_BUFFER_FULL set to false.
2. Message size too large.
3. Timed out on produce requests.
4. Error responses from brokers.
etc

Class B is more related to client app bugs and fatal, such as:

1. Invalid specified partition Id.
2. Calling send() after close().
3. Send a message with topic as null.
etc

We have two options to throw them: either we wrap all exceptions in the
returned future which will then only throw the exception if you call
future.get(), and the send() call will not expect any checked exceptions;
or we wrap class A exceptions in future and directly throw class B
exceptions in send() call. The pros of the first option is that you only
need to try/catch the future.get call if you care enough to check these
errors, the cons is that if your async producer does not care about class A
but only want to check class B errors, you still need to block wait on
get() for all messages. And vice versa for the other option.

Thoughts?

-- 
-- Guozhang

Re: Error Handling on New Producer

Posted by Jay Kreps <ja...@gmail.com>.

Sounds good to me.

-Jay


On Thu, Feb 13, 2014 at 1:28 PM, Guozhang Wang <wa...@gmail.com> wrote:

> OK, unless someone else is against the second option I will change the code
> accordingly.
>
>
> On Thu, Feb 13, 2014 at 10:15 AM, Jay Kreps <ja...@gmail.com> wrote:
>
> > Yeah I agree, I think we just have to be very careful that the right
> errors
> > are thrown versus stuffed in the future.
> >
> > -Jay
> >
> >
> > On Thu, Feb 13, 2014 at 9:24 AM, Neha Narkhede <neha.narkhede@gmail.com
> > >wrote:
> >
> > > Alternately we could throw user errors from send. These you wouldn't
> need
> > > to catch because they should not happen (and if they do it is a bug,
> not
> > > something you should really catch and handle). The only concern here is
> > > that we have to be very careful to ensure we categorize each exception
> > > correctly.
> > >
> > > Thought about this. If I was a user of this API, I would prefer this
> > > approach.
> > > So you still don't have to wrap the send with try catch since those
> > > exceptions
> > > are not meant to be caught.
> > >
> > > Thanks,
> > > Neha
> > >
> > >
> > > On Thu, Feb 13, 2014 at 8:34 AM, Jay Kreps <ja...@gmail.com>
> wrote:
> > >
> > > > Some context on this.
> > > >
> > > > Originally (i.e. in the first patch) I threw any exception I could
> > > > including all class B exceptions directly from the send() call and
> only
> > > > threw api exceptions with the future. My concern with this approach
> was
> > > > just the complexity of the error handling. You end up needing to
> > > try/catch
> > > > around the send call to handle anything that is thrown there but then
> > you
> > > > also MUST try/catch around the Future (since it is a checked
> > exception).
> > > >
> > > > My concern with this is just that people would do it wrong and only
> do
> > > one
> > > > of these. So I changed it to throw all exceptions through the Future.
> > > > However this has another problem. There could be a set of users who
> > just
> > > > want fast asynchronous send and who won't check the result of the
> send
> > at
> > > > all. Their rationale could be that, yes, there could be occasional
> > > network
> > > > errors but as long as 99.9% of their data gets through they are okay.
> > The
> > > > problem is here that for these use cases we essentially mask a set of
> > > > coding errors by funneling them through the future (if you don't
> check
> > > you
> > > > may not realize that you are producing invalid partitions, messages
> > that
> > > > are too large, or whatever). This may be okay if we just do a good
> job
> > of
> > > > documenting how errors work.
> > > >
> > > > Alternately we could throw user errors from send. These you wouldn't
> > need
> > > > to catch because they should not happen (and if they do it is a bug,
> > not
> > > > something you should really catch and handle). The only concern here
> is
> > > > that we have to be very careful to ensure we categorize each
> exception
> > > > correctly. For example I am not sure if sending a message larger than
> > the
> > > > max message size the producer set in their own config is a class A or
> > > class
> > > > B exception. Ostensibly it is an error on their part, but often I
> have
> > > > found that people haven't really thought about the serialized size of
> > > their
> > > > messages at all.
> > > >
> > > > -Jay
> > > >
> > > >
> > > > On Wed, Feb 12, 2014 at 5:43 PM, Guozhang Wang <wa...@gmail.com>
> > > wrote:
> > > >
> > > > > Folks,
> > > > >
> > > > > While working on the new producer, I noticed another issue about
> > error
> > > > > handling that I would like to ask for some discussions (I know Jay
> > said
> > > > the
> > > > > protocol definition is the last one, but I guess that can never be
> > true
> > > > :)
> > > > >
> > > > > Currently we have two classes of exceptions that we can throw when
> > > > sending
> > > > > messages with the producer, class A is more operational related and
> > > > > recoverable, such as:
> > > > >
> > > > > 1. Producer buffer full with BLOCK_ON_BUFFER_FULL set to false.
> > > > > 2. Message size too large.
> > > > > 3. Timed out on produce requests.
> > > > > 4. Error responses from brokers.
> > > > > etc
> > > > >
> > > > > Class B is more related to client app bugs and fatal, such as:
> > > > >
> > > > > 1. Invalid specified partition Id.
> > > > > 2. Calling send() after close().
> > > > > 3. Send a message with topic as null.
> > > > > etc
> > > > >
> > > > > We have two options to throw them: either we wrap all exceptions in
> > the
> > > > > returned future which will then only throw the exception if you
> call
> > > > > future.get(), and the send() call will not expect any checked
> > > exceptions;
> > > > > or we wrap class A exceptions in future and directly throw class B
> > > > > exceptions in send() call. The pros of the first option is that you
> > > only
> > > > > need to try/catch the future.get call if you care enough to check
> > these
> > > > > errors, the cons is that if your async producer does not care about
> > > > class A
> > > > > but only want to check class B errors, you still need to block wait
> > on
> > > > > get() for all messages. And vice versa for the other option.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: Error Handling on New Producer

Posted by Guozhang Wang <wa...@gmail.com>.

OK, unless someone else is against the second option I will change the code
accordingly.


On Thu, Feb 13, 2014 at 10:15 AM, Jay Kreps <ja...@gmail.com> wrote:

> Yeah I agree, I think we just have to be very careful that the right errors
> are thrown versus stuffed in the future.
>
> -Jay
>
>
> On Thu, Feb 13, 2014 at 9:24 AM, Neha Narkhede <neha.narkhede@gmail.com
> >wrote:
>
> > Alternately we could throw user errors from send. These you wouldn't need
> > to catch because they should not happen (and if they do it is a bug, not
> > something you should really catch and handle). The only concern here is
> > that we have to be very careful to ensure we categorize each exception
> > correctly.
> >
> > Thought about this. If I was a user of this API, I would prefer this
> > approach.
> > So you still don't have to wrap the send with try catch since those
> > exceptions
> > are not meant to be caught.
> >
> > Thanks,
> > Neha
> >
> >
> > On Thu, Feb 13, 2014 at 8:34 AM, Jay Kreps <ja...@gmail.com> wrote:
> >
> > > Some context on this.
> > >
> > > Originally (i.e. in the first patch) I threw any exception I could
> > > including all class B exceptions directly from the send() call and only
> > > threw api exceptions with the future. My concern with this approach was
> > > just the complexity of the error handling. You end up needing to
> > try/catch
> > > around the send call to handle anything that is thrown there but then
> you
> > > also MUST try/catch around the Future (since it is a checked
> exception).
> > >
> > > My concern with this is just that people would do it wrong and only do
> > one
> > > of these. So I changed it to throw all exceptions through the Future.
> > > However this has another problem. There could be a set of users who
> just
> > > want fast asynchronous send and who won't check the result of the send
> at
> > > all. Their rationale could be that, yes, there could be occasional
> > network
> > > errors but as long as 99.9% of their data gets through they are okay.
> The
> > > problem is here that for these use cases we essentially mask a set of
> > > coding errors by funneling them through the future (if you don't check
> > you
> > > may not realize that you are producing invalid partitions, messages
> that
> > > are too large, or whatever). This may be okay if we just do a good job
> of
> > > documenting how errors work.
> > >
> > > Alternately we could throw user errors from send. These you wouldn't
> need
> > > to catch because they should not happen (and if they do it is a bug,
> not
> > > something you should really catch and handle). The only concern here is
> > > that we have to be very careful to ensure we categorize each exception
> > > correctly. For example I am not sure if sending a message larger than
> the
> > > max message size the producer set in their own config is a class A or
> > class
> > > B exception. Ostensibly it is an error on their part, but often I have
> > > found that people haven't really thought about the serialized size of
> > their
> > > messages at all.
> > >
> > > -Jay
> > >
> > >
> > > On Wed, Feb 12, 2014 at 5:43 PM, Guozhang Wang <wa...@gmail.com>
> > wrote:
> > >
> > > > Folks,
> > > >
> > > > While working on the new producer, I noticed another issue about
> error
> > > > handling that I would like to ask for some discussions (I know Jay
> said
> > > the
> > > > protocol definition is the last one, but I guess that can never be
> true
> > > :)
> > > >
> > > > Currently we have two classes of exceptions that we can throw when
> > > sending
> > > > messages with the producer, class A is more operational related and
> > > > recoverable, such as:
> > > >
> > > > 1. Producer buffer full with BLOCK_ON_BUFFER_FULL set to false.
> > > > 2. Message size too large.
> > > > 3. Timed out on produce requests.
> > > > 4. Error responses from brokers.
> > > > etc
> > > >
> > > > Class B is more related to client app bugs and fatal, such as:
> > > >
> > > > 1. Invalid specified partition Id.
> > > > 2. Calling send() after close().
> > > > 3. Send a message with topic as null.
> > > > etc
> > > >
> > > > We have two options to throw them: either we wrap all exceptions in
> the
> > > > returned future which will then only throw the exception if you call
> > > > future.get(), and the send() call will not expect any checked
> > exceptions;
> > > > or we wrap class A exceptions in future and directly throw class B
> > > > exceptions in send() call. The pros of the first option is that you
> > only
> > > > need to try/catch the future.get call if you care enough to check
> these
> > > > errors, the cons is that if your async producer does not care about
> > > class A
> > > > but only want to check class B errors, you still need to block wait
> on
> > > > get() for all messages. And vice versa for the other option.
> > > >
> > > > Thoughts?
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
>



-- 
-- Guozhang

Re: Error Handling on New Producer

Posted by Jay Kreps <ja...@gmail.com>.

Yeah I agree, I think we just have to be very careful that the right errors
are thrown versus stuffed in the future.

-Jay


On Thu, Feb 13, 2014 at 9:24 AM, Neha Narkhede <ne...@gmail.com>wrote:

> Alternately we could throw user errors from send. These you wouldn't need
> to catch because they should not happen (and if they do it is a bug, not
> something you should really catch and handle). The only concern here is
> that we have to be very careful to ensure we categorize each exception
> correctly.
>
> Thought about this. If I was a user of this API, I would prefer this
> approach.
> So you still don't have to wrap the send with try catch since those
> exceptions
> are not meant to be caught.
>
> Thanks,
> Neha
>
>
> On Thu, Feb 13, 2014 at 8:34 AM, Jay Kreps <ja...@gmail.com> wrote:
>
> > Some context on this.
> >
> > Originally (i.e. in the first patch) I threw any exception I could
> > including all class B exceptions directly from the send() call and only
> > threw api exceptions with the future. My concern with this approach was
> > just the complexity of the error handling. You end up needing to
> try/catch
> > around the send call to handle anything that is thrown there but then you
> > also MUST try/catch around the Future (since it is a checked exception).
> >
> > My concern with this is just that people would do it wrong and only do
> one
> > of these. So I changed it to throw all exceptions through the Future.
> > However this has another problem. There could be a set of users who just
> > want fast asynchronous send and who won't check the result of the send at
> > all. Their rationale could be that, yes, there could be occasional
> network
> > errors but as long as 99.9% of their data gets through they are okay. The
> > problem is here that for these use cases we essentially mask a set of
> > coding errors by funneling them through the future (if you don't check
> you
> > may not realize that you are producing invalid partitions, messages that
> > are too large, or whatever). This may be okay if we just do a good job of
> > documenting how errors work.
> >
> > Alternately we could throw user errors from send. These you wouldn't need
> > to catch because they should not happen (and if they do it is a bug, not
> > something you should really catch and handle). The only concern here is
> > that we have to be very careful to ensure we categorize each exception
> > correctly. For example I am not sure if sending a message larger than the
> > max message size the producer set in their own config is a class A or
> class
> > B exception. Ostensibly it is an error on their part, but often I have
> > found that people haven't really thought about the serialized size of
> their
> > messages at all.
> >
> > -Jay
> >
> >
> > On Wed, Feb 12, 2014 at 5:43 PM, Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> > > Folks,
> > >
> > > While working on the new producer, I noticed another issue about error
> > > handling that I would like to ask for some discussions (I know Jay said
> > the
> > > protocol definition is the last one, but I guess that can never be true
> > :)
> > >
> > > Currently we have two classes of exceptions that we can throw when
> > sending
> > > messages with the producer, class A is more operational related and
> > > recoverable, such as:
> > >
> > > 1. Producer buffer full with BLOCK_ON_BUFFER_FULL set to false.
> > > 2. Message size too large.
> > > 3. Timed out on produce requests.
> > > 4. Error responses from brokers.
> > > etc
> > >
> > > Class B is more related to client app bugs and fatal, such as:
> > >
> > > 1. Invalid specified partition Id.
> > > 2. Calling send() after close().
> > > 3. Send a message with topic as null.
> > > etc
> > >
> > > We have two options to throw them: either we wrap all exceptions in the
> > > returned future which will then only throw the exception if you call
> > > future.get(), and the send() call will not expect any checked
> exceptions;
> > > or we wrap class A exceptions in future and directly throw class B
> > > exceptions in send() call. The pros of the first option is that you
> only
> > > need to try/catch the future.get call if you care enough to check these
> > > errors, the cons is that if your async producer does not care about
> > class A
> > > but only want to check class B errors, you still need to block wait on
> > > get() for all messages. And vice versa for the other option.
> > >
> > > Thoughts?
> > >
> > > --
> > > -- Guozhang
> > >
> >
>

Re: Error Handling on New Producer

Posted by Neha Narkhede <ne...@gmail.com>.

Alternately we could throw user errors from send. These you wouldn't need
to catch because they should not happen (and if they do it is a bug, not
something you should really catch and handle). The only concern here is
that we have to be very careful to ensure we categorize each exception
correctly.

Thought about this. If I was a user of this API, I would prefer this
approach.
So you still don't have to wrap the send with try catch since those
exceptions
are not meant to be caught.

Thanks,
Neha


On Thu, Feb 13, 2014 at 8:34 AM, Jay Kreps <ja...@gmail.com> wrote:

> Some context on this.
>
> Originally (i.e. in the first patch) I threw any exception I could
> including all class B exceptions directly from the send() call and only
> threw api exceptions with the future. My concern with this approach was
> just the complexity of the error handling. You end up needing to try/catch
> around the send call to handle anything that is thrown there but then you
> also MUST try/catch around the Future (since it is a checked exception).
>
> My concern with this is just that people would do it wrong and only do one
> of these. So I changed it to throw all exceptions through the Future.
> However this has another problem. There could be a set of users who just
> want fast asynchronous send and who won't check the result of the send at
> all. Their rationale could be that, yes, there could be occasional network
> errors but as long as 99.9% of their data gets through they are okay. The
> problem is here that for these use cases we essentially mask a set of
> coding errors by funneling them through the future (if you don't check you
> may not realize that you are producing invalid partitions, messages that
> are too large, or whatever). This may be okay if we just do a good job of
> documenting how errors work.
>
> Alternately we could throw user errors from send. These you wouldn't need
> to catch because they should not happen (and if they do it is a bug, not
> something you should really catch and handle). The only concern here is
> that we have to be very careful to ensure we categorize each exception
> correctly. For example I am not sure if sending a message larger than the
> max message size the producer set in their own config is a class A or class
> B exception. Ostensibly it is an error on their part, but often I have
> found that people haven't really thought about the serialized size of their
> messages at all.
>
> -Jay
>
>
> On Wed, Feb 12, 2014 at 5:43 PM, Guozhang Wang <wa...@gmail.com> wrote:
>
> > Folks,
> >
> > While working on the new producer, I noticed another issue about error
> > handling that I would like to ask for some discussions (I know Jay said
> the
> > protocol definition is the last one, but I guess that can never be true
> :)
> >
> > Currently we have two classes of exceptions that we can throw when
> sending
> > messages with the producer, class A is more operational related and
> > recoverable, such as:
> >
> > 1. Producer buffer full with BLOCK_ON_BUFFER_FULL set to false.
> > 2. Message size too large.
> > 3. Timed out on produce requests.
> > 4. Error responses from brokers.
> > etc
> >
> > Class B is more related to client app bugs and fatal, such as:
> >
> > 1. Invalid specified partition Id.
> > 2. Calling send() after close().
> > 3. Send a message with topic as null.
> > etc
> >
> > We have two options to throw them: either we wrap all exceptions in the
> > returned future which will then only throw the exception if you call
> > future.get(), and the send() call will not expect any checked exceptions;
> > or we wrap class A exceptions in future and directly throw class B
> > exceptions in send() call. The pros of the first option is that you only
> > need to try/catch the future.get call if you care enough to check these
> > errors, the cons is that if your async producer does not care about
> class A
> > but only want to check class B errors, you still need to block wait on
> > get() for all messages. And vice versa for the other option.
> >
> > Thoughts?
> >
> > --
> > -- Guozhang
> >
>

Re: Error Handling on New Producer

Posted by Jay Kreps <ja...@gmail.com>.

Some context on this.

Originally (i.e. in the first patch) I threw any exception I could
including all class B exceptions directly from the send() call and only
threw api exceptions with the future. My concern with this approach was
just the complexity of the error handling. You end up needing to try/catch
around the send call to handle anything that is thrown there but then you
also MUST try/catch around the Future (since it is a checked exception).

My concern with this is just that people would do it wrong and only do one
of these. So I changed it to throw all exceptions through the Future.
However this has another problem. There could be a set of users who just
want fast asynchronous send and who won't check the result of the send at
all. Their rationale could be that, yes, there could be occasional network
errors but as long as 99.9% of their data gets through they are okay. The
problem is here that for these use cases we essentially mask a set of
coding errors by funneling them through the future (if you don't check you
may not realize that you are producing invalid partitions, messages that
are too large, or whatever). This may be okay if we just do a good job of
documenting how errors work.

Alternately we could throw user errors from send. These you wouldn't need
to catch because they should not happen (and if they do it is a bug, not
something you should really catch and handle). The only concern here is
that we have to be very careful to ensure we categorize each exception
correctly. For example I am not sure if sending a message larger than the
max message size the producer set in their own config is a class A or class
B exception. Ostensibly it is an error on their part, but often I have
found that people haven't really thought about the serialized size of their
messages at all.

-Jay

On Wed, Feb 12, 2014 at 5:43 PM, Guozhang Wang <wa...@gmail.com> wrote:

> Folks,
>
> While working on the new producer, I noticed another issue about error
> handling that I would like to ask for some discussions (I know Jay said the
> protocol definition is the last one, but I guess that can never be true :)
>
> Currently we have two classes of exceptions that we can throw when sending
> messages with the producer, class A is more operational related and
> recoverable, such as:
>
> 1. Producer buffer full with BLOCK_ON_BUFFER_FULL set to false.
> 2. Message size too large.
> 3. Timed out on produce requests.
> 4. Error responses from brokers.
> etc
>
> Class B is more related to client app bugs and fatal, such as:
>
> 1. Invalid specified partition Id.
> 2. Calling send() after close().
> 3. Send a message with topic as null.
> etc
>
> We have two options to throw them: either we wrap all exceptions in the
> returned future which will then only throw the exception if you call
> future.get(), and the send() call will not expect any checked exceptions;
> or we wrap class A exceptions in future and directly throw class B
> exceptions in send() call. The pros of the first option is that you only
> need to try/catch the future.get call if you care enough to check these
> errors, the cons is that if your async producer does not care about class A
> but only want to check class B errors, you still need to block wait on
> get() for all messages. And vice versa for the other option.
>
> Thoughts?
>
> --
> -- Guozhang
>