Posted to dev@nifi.apache.org by Adam Taft <ad...@adamtaft.com> on 2016/09/01 02:46:10 UTC

Re: PostHTTP Penalize file on HTTP 5xx response

In the wild west of HTTP response codes, a 500 Server Error could mean
practically anything.  In my experience, you can't infer any semantic
meaning from a 500 status code unless you're very familiar with the
server application.

I'd even go so far as to suggest, if a modification is made to PostHTTP,
that all non-200 response codes should be penalized.  The dataflow manager
can always adjust the penalization timeout towards zero if a processing
delay is not warranted.

Unrelated, but this also reminds me, we really need a PenalizeFlowFile
processor, which would allow a dataflow manager to penalize a flowfile
anywhere that is deemed necessary, even if other processors haven't done so
(have routed to success).


On Wed, Aug 31, 2016 at 1:54 PM, Andrew Grande <ap...@gmail.com> wrote:

> Wasn't HTTP 400 Bad Request meant for that? 500 only means the server
> failed, not necessarily due to user input.
>
> Andrew
>
> On Wed, Aug 31, 2016, 10:16 AM Mark Payne <ma...@hotmail.com> wrote:
>
> > Hey Chris,
> >
> > I think it is reasonable to penalize when we receive a 500 response. 500
> > means Internal Server Error, and it is
> > very reasonable to believe that the Internal Server Error occurred due to
> > the specific input (i.e., that it may not
> > always occur with different input). So penalizing the FlowFile so that it
> > can be retried after a little bit is reasonable
> > IMO.
> >
> > When using the prioritizers, any FlowFile that is penalized will not hold
> > up other FlowFiles. They are always at the
> > bottom of the queue until the penalization expires.
> >
> > Thanks
> > -Mark
> >
> >
> > > On Aug 31, 2016, at 10:06 AM, McDermott, Chris Kevin (MSDU -
> > STaTS/StorefrontRemote) <ch...@hpe.com> wrote:
> > >
> > > I wanted to ask if it would be at all sane to have the PostHTTP
> > processor penalize a flowfile on 5xx response.  5xx indicates that the
> > request may be good but it cannot be handled by the server.  Currently it
> > seems the processor routes files eliciting this response to the failure
> > output but does not penalize them.  What do we think of adding such
> > penalization?
> > >
> > > On a related note: if a penalized file is routed to a funnel that
> > is connected to a processor via a connection with the OldestFlowFileFirst
> > prioritizer, will the consumption of files from that connection be blocked
> > until the penalization period is over?
> > >
> > > What I am trying to accomplish is this: I am using PostHTTP to send
> > files to a web service that is throttling incoming data by returning a 500
> > response.  When that happens I want to slow down files being sent to that
> > service.
> > >
> > > Thanks,
> > >
> > > Chris McDermott.
> > >
> > > Remote Business Analytics
> > > STaTS/StoreFront Remote
> > > HPE Storage
> > > Hewlett Packard Enterprise
> > > Mobile: +1 978-697-5315
> > >
> > >
> >
> >
>

Re: PostHTTP Penalize file on HTTP 5xx response

Posted by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com>.
Hmmm, given this I wonder whether penalizing the flow file is going to help.  I’d like to maintain delivery order as best I can.  If the web service is having intermittent problems, some files might be penalized but others, added to the flow later, don’t get penalized and are sent out of order.

It might be better to yield the processor.  That would solve the out-of-order problem.  However, since the URL supports the Expression Language (EL), a single processor could be talking to multiple web services, and yielding the processor would also hold up files that are destined for web services that are not having problems.  Maybe that is OK though, since using a single processor for multiple web services is probably a corner case, and routing to multiple PostHTTP processors could be used to handle such a case.
	
Chris McDermott
 
Remote Business Analytics
STaTS/StoreFront Remote
HPE Storage
Hewlett Packard Enterprise
Mobile: +1 978-697-5315
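
The ordering trade-off between penalizing a single flow file and yielding the whole processor can be illustrated with a small standalone Java sketch. This is a toy model, not NiFi code: the class YieldVsPenalizeSketch, the tick-based clock, and the parameters outageEndsAtTick and backoffTicks are all assumptions invented for the example. It models one sender and one destination only, so it does not show the multiple-web-service concern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (not NiFi code): one sender makes at most one attempt per tick,
// always choosing the oldest eligible file. The destination rejects every
// attempt made before outageEndsAtTick.
final class YieldVsPenalizeSketch {

    enum OnFailure { PENALIZE_FILE, YIELD_PROCESSOR }

    static List<String> deliveryOrder(List<String> files, int outageEndsAtTick,
                                      int backoffTicks, OnFailure strategy) {
        int[] penalizedUntil = new int[files.size()]; // per-file penalty expiry
        boolean[] sent = new boolean[files.size()];
        List<String> delivered = new ArrayList<>();
        int yieldedUntil = 0;                          // sender-wide pause expiry

        for (int tick = 0; delivered.size() < files.size(); tick++) {
            if (tick < yieldedUntil) {
                continue; // the whole sender is paused; nothing is attempted
            }
            // Oldest file that is neither sent nor currently penalized.
            int next = -1;
            for (int i = 0; i < files.size(); i++) {
                if (!sent[i] && penalizedUntil[i] <= tick) {
                    next = i;
                    break;
                }
            }
            if (next < 0) {
                continue; // everything left is penalized; wait for expiry
            }
            if (tick >= outageEndsAtTick) {
                sent[next] = true;
                delivered.add(files.get(next));
            } else if (strategy == OnFailure.PENALIZE_FILE) {
                penalizedUntil[next] = tick + backoffTicks; // only this file waits
            } else {
                yieldedUntil = tick + backoffTicks;         // every file waits
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("f1", "f2", "f3");
        // The destination rejects the first attempt only (outage ends at tick 1).
        System.out.println(deliveryOrder(files, 1, 3, OnFailure.PENALIZE_FILE));   // [f2, f3, f1]
        System.out.println(deliveryOrder(files, 1, 3, OnFailure.YIELD_PROCESSOR)); // [f1, f2, f3]
    }
}
```

In this model, penalizing lets f2 and f3 overtake f1, while yielding keeps the original order at the cost of two idle ticks for every file in the queue.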
 


On 8/31/16, 11:28 PM, "Joe Witt" <jo...@gmail.com> wrote:

    It will not be blocked by penalized things.  The queues are set up to
    basically put those aside and move on to other things until their
    penalty period passes. If you're seeing different behavior please
    advise.
    
    Thanks
    Joe
    


Re: PostHTTP Penalize file on HTTP 5xx response

Posted by Joe Witt <jo...@gmail.com>.
It will not be blocked by penalized things.  The queues are set up to
basically put those aside and move on to other things until their
penalty period passes. If you're seeing different behavior please
advise.

Thanks
Joe
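
To make the behavior described above concrete, here is a minimal, self-contained Java sketch of a queue that sets penalized items aside and hands out the oldest eligible one. It is a toy model written for illustration, not NiFi's queue implementation; the class names SketchFlowFile and PenaltyQueueSketch and the millisecond timestamps are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a FlowFile: an id, the time it entered the flow,
// and the time until which it is penalized (0 means not penalized).
final class SketchFlowFile {
    final String id;
    final long entryMillis;
    long penaltyExpiresAtMillis;

    SketchFlowFile(String id, long entryMillis) {
        this.id = id;
        this.entryMillis = entryMillis;
    }

    boolean isPenalized(long nowMillis) {
        return penaltyExpiresAtMillis > nowMillis;
    }
}

// Simplified model of a connection queue with oldest-first ordering.
final class PenaltyQueueSketch {
    private final List<SketchFlowFile> queue = new ArrayList<>();

    void offer(SketchFlowFile flowFile) {
        queue.add(flowFile);
    }

    // Returns and removes the oldest FlowFile that is not penalized right now.
    // Penalized entries stay in the queue and are simply skipped.
    Optional<SketchFlowFile> poll(long nowMillis) {
        Optional<SketchFlowFile> next = queue.stream()
                .filter(f -> !f.isPenalized(nowMillis))
                .min(Comparator.comparingLong(f -> f.entryMillis));
        next.ifPresent(f -> queue.remove(f));
        return next;
    }

    public static void main(String[] args) {
        PenaltyQueueSketch q = new PenaltyQueueSketch();
        SketchFlowFile a = new SketchFlowFile("A", 1);
        SketchFlowFile b = new SketchFlowFile("B", 2);
        a.penaltyExpiresAtMillis = 30_000; // A is older but penalized for 30 s
        q.offer(a);
        q.offer(b);

        System.out.println(q.poll(1_000).get().id);    // B
        System.out.println(q.poll(1_000).isPresent()); // false
        System.out.println(q.poll(30_000).get().id);   // A
    }
}
```

In this model the older, penalized item A does not block B; A becomes available again only once its penalty has expired.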

On Thu, Sep 1, 2016 at 1:11 PM, McDermott, Chris Kevin (MSDU -
STaTS/StorefrontRemote) <ch...@hpe.com> wrote:
> Thanks, everyone for the feedback. I’ll file a JIRA for this and see if I can find some time to address it.
>
> Does anyone have any thoughts on my related question?
>
> (with spelling and grammar corrections:)
>
> ➢ If a penalized file is routed to a funnel that’s connected to a processor via a connection with the OldestFlowFileFirst prioritizer, will the consumption of files from that connection be blocked until the penalization period is over?
>
>
>
> Chris McDermott
>
> Remote Business Analytics
> STaTS/StoreFront Remote
> HPE Storage
> Hewlett Packard Enterprise
> Mobile: +1 978-697-5315
>
>
>

Re: PostHTTP Penalize file on HTTP 5xx response

Posted by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com>.
Thanks, everyone for the feedback. I’ll file a JIRA for this and see if I can find some time to address it.

Does anyone have any thoughts on my related question?

(with spelling and grammar corrections:)

➢ If a penalized file is routed to a funnel that’s connected to a processor via a connection with the OldestFlowFileFirst prioritizer, will the consumption of files from that connection be blocked until the penalization period is over?
    


Chris McDermott
 
Remote Business Analytics
STaTS/StoreFront Remote
HPE Storage
Hewlett Packard Enterprise
Mobile: +1 978-697-5315
 


On 8/31/16, 11:00 PM, "Matt Burgess" <ma...@apache.org> wrote:

    Adam,
    
    A PenalizeFlowFile processor could be pretty useful, please feel free
    to file a New Feature Jira for this if you like.
    
    In the meantime you could use ExecuteScript (with Groovy for this
    example) and the following:
    
    def flowFile = session.get()
    if(!flowFile) return
    flowFile = session.penalize(flowFile)
    session.transfer(flowFile, REL_SUCCESS)
    
    In this case the "success" relationship is awkward; it means you
    successfully penalized the flow file. But then you can route it
    back/forward to the appropriate processor. If you create a template
    from this single processor, then dragging the template onto the canvas
    is somewhat equivalent to dragging a "PenalizeFlowFile" processor onto
    the canvas (meaning I suggest the template is named PenalizeFlowFile).
    
    Regards,
    Matt
    


Re: PostHTTP Penalize file on HTTP 5xx response

Posted by Matt Burgess <ma...@apache.org>.
Adam,

A PenalizeFlowFile processor could be pretty useful, please feel free
to file a New Feature Jira for this if you like.

In the meantime you could use ExecuteScript (with Groovy for this
example) and the following:

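// Take the next FlowFile from the incoming queue; nothing to do if it is empty
def flowFile = session.get()
if (!flowFile) return
// Penalize it for the processor's configured Penalty Duration, then pass it on
flowFile = session.penalize(flowFile)
session.transfer(flowFile, REL_SUCCESS)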

In this case the "success" relationship is awkward; it means you
successfully penalized the flow file. But then you can route it
back/forward to the appropriate processor. If you create a template
from this single processor, then dragging the template onto the canvas
is somewhat equivalent to dragging a "PenalizeFlowFile" processor onto
the canvas (meaning I suggest the template is named PenalizeFlowFile).

Regards,
Matt

On Wed, Aug 31, 2016 at 10:46 PM, Adam Taft <ad...@adamtaft.com> wrote:
> In the wild west of HTTP response codes, a 500 Server Error could mean
> practically anything.  In my experience, you can't infer any semantic
> meaning from a 500 status code unless you're very familiar with the
> server application.
>
> I'd even go so far as to suggest, if a modification is made to PostHTTP,
> that all non-200 response codes should be penalized.  The dataflow manager
> can always adjust the penalization timeout towards zero if a processing
> delay is not warranted.
>
> Unrelated, but this also reminds me, we really need a PenalizeFlowFile
> processor, which would allow a dataflow manager to penalize a flowfile
> anywhere that is deemed necessary, even if other processors haven't done so
> (have routed to success).
>
>