Posted to user@flink.apache.org by Tomasz Dudziak <t....@mwam.com> on 2020/07/23 10:59:01 UTC

REST API randomly returns Not Found for an existing job

Hi,

I have come across an issue with the GET /job/:jobId endpoint of the monitoring REST API in Flink 1.9.0. A few seconds after successfully starting a job and confirming its status as RUNNING, that endpoint would return 404 (Not Found). Interestingly, querying again immediately (literally a millisecond later) would return a valid result. I later noticed similar behaviour for a finished job as well: at certain points in time the endpoint would arbitrarily return 404, but querying again would succeed. I have seen this strange behaviour only recently; it used to work fine before.

Do you know what could be the root cause of this? At the moment, as a workaround, I just query a job a couple of times in a row to determine whether it definitely does not exist or is just being misreported as non-existent, but this feels a bit like cottage industry...
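
The workaround described above could be sketched roughly as follows. This is only an illustration, not something taken from Flink itself: the function names, the number of attempts and the delay are all assumptions, and the job URL is whatever your cluster exposes for the job details endpoint. A 404 is treated as final only after several consecutive attempts have all returned it.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple


def http_get(url: str, timeout: float = 5.0) -> Tuple[int, bytes]:
    """Return (status, body). HTTP error statuses are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def get_job_with_retry(
    job_url: str,
    fetch: Callable[[str], Tuple[int, bytes]] = http_get,
    attempts: int = 3,      # assumption: arbitrary small number of tries
    delay_s: float = 0.2,   # assumption: arbitrary pause between tries
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Return the response body, or None if every attempt returned 404."""
    for attempt in range(attempts):
        status, body = fetch(job_url)
        if status == 200:
            return body
        if status != 404:
            # Anything other than 200/404 is not the symptom discussed here.
            raise RuntimeError(f"unexpected HTTP status {status} for {job_url}")
        if attempt < attempts - 1:
            sleep(delay_s)
    return None
```

Passing `fetch` and `sleep` as parameters keeps the retry logic testable without a running cluster. Retrying only hides the symptom; it does not explain why the first query fails.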

Kind regards,
Tomasz

Tomasz Dudziak | Marshall Wace LLP, George House, 131 Sloane Street, London, SW1X 9AT | E-mail: t.dudziak@mwam.com | Tel: +44 207 024 7061


This e-mail and any attachments are confidential to the addressee(s) and may contain information that is legally privileged and/or confidential. Please refer to http://www.mwam.com/email-disclaimer-uk for important disclosures regarding this email. If we collect and use your personal data we will use it in accordance with our privacy policy, which can be reviewed at https://www.mwam.com/privacy-policy .

Marshall Wace LLP is authorised and regulated by the Financial Conduct Authority. Marshall Wace LLP is a limited liability partnership registered in England and Wales with registered number OC302228 and registered office at George House, 131 Sloane Street, London, SW1X 9AT. If you are receiving this e-mail as a client, or an investor in an investment vehicle, managed or advised by Marshall Wace North America L.P., the sender of this e-mail is communicating with you in the sender's capacity as an associated or related person of Marshall Wace North America L.P., which is registered with the US Securities and Exchange Commission as an investment adviser.

Re: REST API randomly returns Not Found for an existing job

Posted by Kostas Kloudas <kk...@gmail.com>.
Thanks a lot for the update, Tomasz, and keep us posted if it happens again.

Kostas


RE: REST API randomly returns Not Found for an existing job

Posted by Tomasz Dudziak <t....@mwam.com>.
Yes, the job was running and so was the REST server. I noticed no JobMaster failures.
I used a test cluster deployed on a bunch of VMs and bare metal boxes.
I am afraid I can no longer reproduce this issue. It occurred a couple of days ago and lasted for an entire day, with jobs quite often being erratically reported as Not Found. As I said, another query immediately after the one that returned Not Found consistently returned a correct result.
It had never occurred before, and I have not been able to observe it since. I appreciate this does not give much information, so I will come back to this thread with more details if it happens again.


Re: REST API randomly returns Not Found for an existing job

Posted by Kostas Kloudas <kk...@gmail.com>.
Hi Tomasz,

Thanks a lot for reporting this issue. If you have verified that the
job is running AND that the REST server is also up and running (e.g.
check the overview page) then I think that this should not be
happening. I am cc'ing Chesnay who may have an additional opinion on
this.
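
A rough sketch of the check suggested above (is the REST server up, and does it know the job?) might look like this. It is an illustration under assumptions, not an official diagnostic: the function names and return labels are made up, and the paths used (`/overview` for the cluster overview, `/jobs/<id>` for job details with a `state` field in the JSON response) should be verified against the REST API documentation for your Flink version.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Tuple


def fetch_json(url: str, timeout: float = 5.0) -> Tuple[int, dict]:
    """Return (status, parsed JSON body). HTTP error statuses yield an empty dict.

    Connection failures (urllib.error.URLError) are deliberately not caught.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read() or b"{}")
    except urllib.error.HTTPError as err:
        return err.code, {}


def diagnose(
    base_url: str,
    job_id: str,
    fetch: Callable[[str], Tuple[int, dict]] = fetch_json,
) -> str:
    """Classify the situation using hypothetical labels for illustration."""
    # Assumed path for the cluster overview endpoint.
    overview_status, _ = fetch(f"{base_url}/overview")
    if overview_status != 200:
        return "rest-server-unavailable"
    # Assumed path for the job details endpoint.
    job_status, job = fetch(f"{base_url}/jobs/{job_id}")
    if job_status == 404:
        return "job-not-found"
    if job_status != 200:
        return f"unexpected-status-{job_status}"
    return f"job-state-{job.get('state', 'UNKNOWN')}"
```

If this reports "job-not-found" while the overview request succeeds and the job is known to be running, that matches the behaviour reported in this thread.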

Cheers,
Kostas
