Posted to user@mesos.apache.org by Haripriya Ayyalasomayajula <ah...@gmail.com> on 2016/12/05 21:34:32 UTC

Mesos 1.1 web ui issues

Hi all,

I have two issues with the web UI in Mesos 1.1.

1. With Mesos 0.28, the web UI tried to reconnect only when there were
network issues or when a new leader was elected. After upgrading to 1.1,
the UI does not work in Safari: it reports that no leader is elected even
though a leader has been elected and jobs are running fine. It works in
Chrome and Firefox, but it tries to reconnect very often (at intervals of
less than 1 minute), even though the jobs are running fine.

Is there any new configuration that has to be added?


2. The UI does not display the name of the cluster despite using the
--cluster flag.

/usr/sbin/mesos-master --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos
--port=5050 --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
--authenticate_frameworks=true --cluster="cluster1"
--credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos


I also tried adding the name of the cluster without quotes: cluster1
instead of "cluster1", but that doesn't work either.

/usr/sbin/mesos-master
--zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
--log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
--authenticate_frameworks=true --cluster=cluster1
--credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos

I greatly appreciate any help!
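
One way to check whether a master picked up the flag is to look for the `cluster` field in the JSON served by its `/master/state` endpoint. The following is a minimal Python sketch, not an official Mesos tool. It assumes the field appears at the top level of the state document; the helper name `cluster_name` and the two trimmed sample documents are made up for illustration.

```python
import json


def cluster_name(state_json):
    """Return the top-level "cluster" field of a state document, or None."""
    state = json.loads(state_json)
    return state.get("cluster")


# Hypothetical, heavily trimmed state documents (illustration only).
with_cluster = '{"version": "1.1.0", "cluster": "cluster1"}'
without_cluster = '{"version": "1.1.0", "unregistered_frameworks": []}'

print(cluster_name(with_cluster))     # cluster1
print(cluster_name(without_cluster))  # None
```

A result of `None` means the master did not report a cluster name at all, which points at the master's flags rather than at the browser.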

-- 
Thanks,
Haripriya

Re: Mesos 1.1 web ui issues

Posted by haosdent <ha...@gmail.com>.
Hi @haripriya, ping me on Mesos Slack (https://mesos.slack.com/) when you
are available; I think it would speed up solving your problem. My ID is
@haosdent. If you have not joined Mesos Slack before, you can join via
https://mesos-slackin.herokuapp.com .

On Tue, Dec 20, 2016 at 2:22 AM, Haripriya Ayyalasomayajula <
aharipriya92@gmail.com> wrote:

> Hi @Haosdent,
>
> We have multiple networks, which could be one of the problems. I tried with
> all 3 of them and it still shows the same error. Can you help me understand
> exactly what the hostname flag expects in such a scenario?
>
> On Thu, Dec 15, 2016 at 6:08 PM, haosdent <ha...@gmail.com> wrote:
>
>> Hi @haripriya, what value do you pass to the hostname flag when starting
>> the master? According to the screenshot you posted before, I think you need
>> to set it to something like `socrates-nid000xxx.us.cray.com`.
>> However, according to the error log you posted above, you set the hostname
>> flag to nid00016, which could not be resolved.
>>
>> On Fri, Dec 16, 2016 at 6:51 AM, Haripriya Ayyalasomayajula <
>> aharipriya92@gmail.com> wrote:
>>
>>> Hello @Haosdent,
>>>
>>> After I tried to use hostname, I still see the error. This is the output
>>> I see in developer tools for chrome:
>>>
>>> Failed to load resource: the server responded with a status of 404 (Not Found)
>>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._2 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/master/state?jsonp=angular.callbacks._3 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/master/state?jsonp=angular.callbacks._4 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._5 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/master/state?jsonp=angular.callbacks._6 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._7 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/master/state?jsonp=angular.callbacks._8 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._9 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/master/state?jsonp=angular.callbacks._a Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._b Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/master/state?jsonp=angular.callbacks._c Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._d Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/master/state?jsonp=angular.callbacks._e Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._f Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/master/state?jsonp=angular.callbacks._g Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._h Failed to load resource: net::ERR_NAME_NOT_RESOLVED
>>> angular-1.2.3.min.js:70 GET http://nid00016:5050/master/state?jsonp=angular.callbacks._i net::ERR_NAME_NOT_RESOLVED
>>>     g @ angular-1.2.3.min.js:70
>>>     (anonymous function) @ angular-1.2.3.min.js:71
>>>     D @ angular-1.2.3.min.js:68
>>>     h @ angular-1.2.3.min.js:66
>>>     D @ angular-1.2.3.min.js:91
>>>     D @ angular-1.2.3.min.js:91
>>>     (anonymous function) @ angular-1.2.3.min.js:93
>>>     $eval @ angular-1.2.3.min.js:101
>>>     $digest @ angular-1.2.3.min.js:98
>>>     $apply @ angular-1.2.3.min.js:101
>>>     (anonymous function) @ angular-1.2.3.min.js:111
>>>     e @ angular-1.2.3.min.js:33
>>>     (anonymous function) @ angular-1.2.3.min.js:37
>>> angular-1.2.3.min.js:70 GET http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._j net::ERR_NAME_NOT_RESOLVED
>>>     g @ angular-1.2.3.min.js:70
>>>     (anonymous function) @ angular-1.2.3.min.js:71
>>>     D @ angular-1.2.3.min.js:68
>>>     h @ angular-1.2.3.min.js:66
>>>     D @ angular-1.2.3.min.js:91
>>>     D @ angular-1.2.3.min.js:91
>>>     (anonymous function) @ angular-1.2.3.min.js:93
>>>     $eval @ angular-1.2.3.min.js:101
>>>     $digest @ angular-1.2.3.min.js:98
>>>     $apply @ angular-1.2.3.min.js:101
>>>     (anonymous function) @ angular-1.2.3.min.js:111
>>>     e @ angular-1.2.3.min.js:33
>>>     (anonymous function) @ angular-1.2.3.min.js:37
>>>
>>>
>>> Also, regarding the "cluster flag", here is my output:
>>>
>>> nid00016: root     14940  2.5  0.0 2080192 85012 ?       Ssl  16:44
>>> 0:08 /usr/sbin/mesos-master
>>> --zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
>>> --port=5050 --log_dir=/var/log/mesos
>>> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
>>> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
>>> --hostname=nid00016 --quorum=2 --work_dir=/var/lib/mesos
>>>
>>> nid00016: root     14965  0.0  0.0 107892   612 ?        S    16:44
>>> 0:00 logger -p user.info -t mesos-master[14940]
>>>
>>> nid00016: root     14966  0.0  0.0 107892   692 ?        S    16:44
>>> 0:00 logger -p user.err -t mesos-master[14940]
>>>
>>> nid00016: root     15892  0.0  0.0 113116  1604 ?        Ss   16:50
>>> 0:00 bash -c ps -aux | grep mesos-master
>>>
>>> nid00016: root     15959  0.0  0.0 112644   948 ?        S    16:50
>>> 0:00 grep mesos-master
>>>
>>> nid00032: root     30018  2.5  0.0 2670032 26480 ?       Ssl  16:44
>>> 0:08 /usr/sbin/mesos-master
>>> --zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
>>> --port=5050 --log_dir=/var/log/mesos
>>> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
>>> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
>>> --hostname=nid00032 --quorum=2 --work_dir=/var/lib/mesos
>>>
>>> nid00032: root     30043  0.0  0.0 107892   612 ?        S    16:44
>>> 0:00 logger -p user.info -t mesos-master[30018]
>>>
>>> nid00032: root     30044  0.0  0.0 107892   692 ?        S    16:44
>>> 0:00 logger -p user.err -t mesos-master[30018]
>>>
>>> nid00032: root     31091  0.0  0.0 113116  1604 ?        Ss   16:50
>>> 0:00 bash -c ps -aux | grep mesos-master
>>>
>>> nid00032: root     31158  0.0  0.0 112644   948 ?        S    16:50
>>> 0:00 grep mesos-master
>>>
>>> nid00000: root     49753  3.7  0.0 3259912 27584 ?       Ssl  16:44
>>> 0:13 /usr/sbin/mesos-master
>>> --zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
>>> --port=5050 --log_dir=/var/log/mesos
>>> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
>>> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
>>> --hostname=nid00000.local --quorum=2 --work_dir=/var/lib/mesos
>>>
>>> nid00000: root     49778  0.0  0.0 107892   612 ?        S    16:44
>>> 0:00 logger -p user.info -t mesos-master[49753]
>>>
>>> nid00000: root     49779  0.0  0.0 107892   692 ?        S    16:44
>>> 0:00 logger -p user.err -t mesos-master[49753]
>>>
>>> nid00000: root     50887  0.0  0.0 113116  1604 ?        Ss   16:50
>>> 0:00 bash -c ps -aux | grep mesos-master
>>>
>>> nid00000: root     50954  0.0  0.0 112648   948 ?        S    16:50
>>> 0:00 grep mesos-master
>>>
>>> On Tue, Dec 6, 2016 at 6:58 PM, haosdent <ha...@gmail.com> wrote:
>>>
>>>> Hi, @Haripriya It looks like there are some problems in your master
>>>> flags.
>>>>
>>>> > I'm attaching a snapshot of the error I've seen in Chrome with this
>>>> > email. It'll be great if you can suggest if I'm missing any configuration
>>>> > or if it's some bug.
>>>>
>>>> According to the screenshot you attached, the hostnames are incorrect
>>>> on your servers. The Mesos web UI depends on them to find the leading
>>>> master. A workaround is to specify the `--hostname` flag when starting
>>>> your masters. For example, launch your masters with
>>>>
>>>> ```
>>>> $ mesos-master --hostname=socrates-nid000xxx.us.cray.com xxx
>>>> ```
>>>>
>>>> > Is it something to do with a stale state of mesos anywhere or the
>>>> > way I'm passing cluster? I have a config file named cluster in
>>>> > /etc/mesos-master/ and when I restart the cluster it picks up the config
>>>> > files.
>>>>
>>>> You need to ensure that the flags of every master contain
>>>> `--cluster=your_cluster_name`.
>>>>
>>>> Could you run `ps aux | grep mesos-master` on every master and paste
>>>> the output here?
>>>>
>>>>
>>>> On Wed, Dec 7, 2016 at 4:39 AM, Haripriya Ayyalasomayajula <
>>>> aharipriya92@gmail.com> wrote:
>>>>
>>>>> Hello, @Haosdent,
>>>>>
>>>>> Thanks for suggesting these.
>>>>> I'm attaching a snapshot of the error I've seen in Chrome with this
>>>>> email. It'll be great if you can suggest if I'm missing any configuration
>>>>> or if it's some bug.
>>>>>
>>>>> As for the second part, my `/master/state` endpoint does not return
>>>>> "cluster" anywhere. It returned 75k lines of JSON, so I'm not pasting all
>>>>> of it.
>>>>> {
>>>>>     "activated_slaves": 37.0,
>>>>>     "build_date": "2016-11-16 01:31:49",
>>>>>     "build_time": 1479259909.0,
>>>>>     "build_user": "centos",
>>>>>     "completed_frameworks": [
>>>>>         {
>>>>>             "active": true,
>>>>>   ..........
>>>>>
>>>>>
>>>>>
>>>>>     "start_time": 1480967418.42687,
>>>>>     "unregistered_frameworks": [],
>>>>>     "version": "1.1.0"
>>>>> }
>>>>>
>>>>> Is it something to do with a stale state of mesos anywhere or the way
>>>>> I'm passing cluster? I have a config file named cluster in
>>>>> /etc/mesos-master/ and when I restart the cluster it picks up the config
>>>>> files.
>>>>>
>>>>> On Mon, Dec 5, 2016 at 6:24 PM, haosdent <ha...@gmail.com> wrote:
>>>>>
>>>>>> Hi, @Haripriya
>>>>>>
>>>>>> > (less than 1 min though the  jobs are running just fine).
>>>>>> > Is there any new configuration that has to be added?
>>>>>>
>>>>>> The web UI has used JSONP to send requests since 1.0. Could you share
>>>>>> the error logs from Safari, Chrome and Firefox?
>>>>>> You can open the console as described at
>>>>>> https://developers.google.com/web/tools/chrome-devtools/console/
>>>>>>
>>>>>> > The UI does not display the name of the cluster despite using the
>>>>>> > --cluster flag.
>>>>>> The --cluster flag works fine for me. Could you paste the output of your
>>>>>> `/master/state` endpoint in an email? I would like to check the value of
>>>>>> the `cluster` field in it.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> Haosdent Huang
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Haripriya
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Haosdent Huang
>>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Haripriya Ayyalasomayajula
>>>
>>>
>>
>>
>> --
>> Best Regards,
>> Haosdent Huang
>>
>
>
>
> --
> Regards,
> Haripriya Ayyalasomayajula
>
>


-- 
Best Regards,
Haosdent Huang

Re: Mesos 1.1 web ui issues

Posted by Haripriya Ayyalasomayajula <ah...@gmail.com>.
Hi @Haosdent,

We have multiple networks, which could be one of the problems. I tried with
all 3 of them and it still shows the same error. Can you help me understand
exactly what the hostname flag expects in such a scenario?
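
To compare what each master currently advertises, the `--hostname` value can be pulled out of the master command lines. This is a minimal Python sketch under stated assumptions: `flag_value` is a hypothetical helper (not part of Mesos), and the command lines are shortened copies of the `ps` output quoted in this thread.

```python
import shlex


def flag_value(command_line, flag):
    """Return the value of `--<flag>=<value>` in a command line, or None."""
    prefix = "--" + flag + "="
    for token in shlex.split(command_line):
        if token.startswith(prefix):
            return token[len(prefix):]
    return None


# Command lines shortened from the `ps` output quoted in this thread.
masters = {
    "nid00016": '/usr/sbin/mesos-master --port=5050 --cluster="socrates" --hostname=nid00016 --quorum=2',
    "nid00032": '/usr/sbin/mesos-master --port=5050 --cluster="socrates" --hostname=nid00032 --quorum=2',
    "nid00000": '/usr/sbin/mesos-master --port=5050 --cluster="socrates" --hostname=nid00000.local --quorum=2',
}

for node, command_line in masters.items():
    print(node, "advertises", flag_value(command_line, "hostname"))
```

This makes it easy to see that the three masters advertise short or `.local` names rather than one consistent, fully qualified form.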


-- 
Regards,
Haripriya Ayyalasomayajula

Re: Mesos 1.1 web ui issues

Posted by haosdent <ha...@gmail.com>.
Hi @haripriya, what value do you pass to the hostname flag when starting the
master? According to the screenshot you posted before, I think you need to
set it to something like `socrates-nid000xxx.us.cray.com`.
However, according to the error log you posted above, you set the hostname
flag to nid00016, which could not be resolved.
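
Whether a given name resolves can be checked from the machine where the browser runs. Below is a minimal Python sketch, not something shipped with Mesos: `unresolvable` is a hypothetical helper, and the stub resolver, the name `socrates-nid00016.example.com` and the address 192.0.2.1 are made up so that the demo stays deterministic.

```python
import socket


def unresolvable(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that `resolve` fails to look up."""
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror and socket.herror derive from OSError
            failed.append(name)
    return failed


def stub_resolve(name):
    """Stand-in resolver for the demo: it only knows one made-up name."""
    known = {"socrates-nid00016.example.com": "192.0.2.1"}
    if name not in known:
        raise socket.gaierror("name not resolved: " + name)
    return known[name]


# Pass resolve=stub_resolve for a deterministic demo; omit it to query the
# real resolver of the machine the script runs on.
print(unresolvable(["nid00016", "socrates-nid00016.example.com"], resolve=stub_resolve))
```

Any name this reports when run with the real resolver on the browser's machine would fail in the web UI in the same way as the `net::ERR_NAME_NOT_RESOLVED` errors in the log.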

On Fri, Dec 16, 2016 at 6:51 AM, Haripriya Ayyalasomayajula <
aharipriya92@gmail.com> wrote:

> Hello @Haosdent,
>
> After I tried to use hostname, I still see the error. This is the output I
> see in developer tools for chrome:
>
> Failed to load resource: the server responded with a status of 404 (Not Found)
> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._2 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/master/state?jsonp=angular.callbacks._3 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/master/state?jsonp=angular.callbacks._4 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._5 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/master/state?jsonp=angular.callbacks._6 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._7 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/master/state?jsonp=angular.callbacks._8 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._9 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/master/state?jsonp=angular.callbacks._a Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._b Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/master/state?jsonp=angular.callbacks._c Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._d Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/master/state?jsonp=angular.callbacks._e Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._f Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/master/state?jsonp=angular.callbacks._g Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._h Failed to load resource: net::ERR_NAME_NOT_RESOLVED
> angular-1.2.3.min.js:70 GET http://nid00016:5050/master/state?jsonp=angular.callbacks._i net::ERR_NAME_NOT_RESOLVED
>     g @ angular-1.2.3.min.js:70
>     (anonymous function) @ angular-1.2.3.min.js:71
>     D @ angular-1.2.3.min.js:68
>     h @ angular-1.2.3.min.js:66
>     D @ angular-1.2.3.min.js:91
>     D @ angular-1.2.3.min.js:91
>     (anonymous function) @ angular-1.2.3.min.js:93
>     $eval @ angular-1.2.3.min.js:101
>     $digest @ angular-1.2.3.min.js:98
>     $apply @ angular-1.2.3.min.js:101
>     (anonymous function) @ angular-1.2.3.min.js:111
>     e @ angular-1.2.3.min.js:33
>     (anonymous function) @ angular-1.2.3.min.js:37
> angular-1.2.3.min.js:70 GET http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._j net::ERR_NAME_NOT_RESOLVED
>     g @ angular-1.2.3.min.js:70
>     (anonymous function) @ angular-1.2.3.min.js:71
>     D @ angular-1.2.3.min.js:68
>     h @ angular-1.2.3.min.js:66
>     D @ angular-1.2.3.min.js:91
>     D @ angular-1.2.3.min.js:91
>     (anonymous function) @ angular-1.2.3.min.js:93
>     $eval @ angular-1.2.3.min.js:101
>     $digest @ angular-1.2.3.min.js:98
>     $apply @ angular-1.2.3.min.js:101
>     (anonymous function) @ angular-1.2.3.min.js:111
>     e @ angular-1.2.3.min.js:33
>     (anonymous function) @ angular-1.2.3.min.js:37
>
>
> Also, regarding the "cluster flag", here is my output:
>
> nid00016: root     14940  2.5  0.0 2080192 85012 ?       Ssl  16:44   0:08
> /usr/sbin/mesos-master
> --zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
> --port=5050 --log_dir=/var/log/mesos
> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
> --hostname=nid00016 --quorum=2 --work_dir=/var/lib/mesos
>
> nid00016: root     14965  0.0  0.0 107892   612 ?        S    16:44   0:00
> logger -p user.info -t mesos-master[14940]
>
> nid00016: root     14966  0.0  0.0 107892   692 ?        S    16:44   0:00
> logger -p user.err -t mesos-master[14940]
>
> nid00016: root     15892  0.0  0.0 113116  1604 ?        Ss   16:50   0:00
> bash -c ps -aux | grep mesos-master
>
> nid00016: root     15959  0.0  0.0 112644   948 ?        S    16:50   0:00
> grep mesos-master
>
> nid00032: root     30018  2.5  0.0 2670032 26480 ?       Ssl  16:44   0:08
> /usr/sbin/mesos-master
> --zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
> --port=5050 --log_dir=/var/log/mesos
> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
> --hostname=nid00032 --quorum=2 --work_dir=/var/lib/mesos
>
> nid00032: root     30043  0.0  0.0 107892   612 ?        S    16:44   0:00
> logger -p user.info -t mesos-master[30018]
>
> nid00032: root     30044  0.0  0.0 107892   692 ?        S    16:44   0:00
> logger -p user.err -t mesos-master[30018]
>
> nid00032: root     31091  0.0  0.0 113116  1604 ?        Ss   16:50   0:00
> bash -c ps -aux | grep mesos-master
>
> nid00032: root     31158  0.0  0.0 112644   948 ?        S    16:50   0:00
> grep mesos-master
>
> nid00000: root     49753  3.7  0.0 3259912 27584 ?       Ssl  16:44   0:13
> /usr/sbin/mesos-master
> --zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
> --port=5050 --log_dir=/var/log/mesos
> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
> --hostname=nid00000.local --quorum=2 --work_dir=/var/lib/mesos
>
> nid00000: root     49778  0.0  0.0 107892   612 ?        S    16:44   0:00
> logger -p user.info -t mesos-master[49753]
>
> nid00000: root     49779  0.0  0.0 107892   692 ?        S    16:44   0:00
> logger -p user.err -t mesos-master[49753]
>
> nid00000: root     50887  0.0  0.0 113116  1604 ?        Ss   16:50   0:00
> bash -c ps -aux | grep mesos-master
>
> nid00000: root     50954  0.0  0.0 112648   948 ?        S    16:50   0:00
> grep mesos-master
>
> On Tue, Dec 6, 2016 at 6:58 PM, haosdent <ha...@gmail.com> wrote:
>
>> Hi, @Haripriya It looks like there are some problems in your master flags.
>>
>> > I'm attaching a snapshot of the error I've seen in Chrome with this
>> email. It'll be great if you can suggest if I'm missing any configuration
>> or if its some bug.
>> According to the screenshot you attached, the hostnames are incorrect on
>> your servers. Mesos WebUI depends on that to find the leading master.
>> A workaround is to specify the `--hostname` flag when starting your
>> masters. For example, launch your masters with
>>
>> ```
>> $ mesos-master --hostname=socrates-nid000xxx.us.cray.com xxx
>> ```
>>
>> > Is it something to do with a stale state of mesos anywhere or the way
>> I'm passing cluster? I have a config file named cluster in
>> /etc/mesos-master/ and when I restart the cluster it picks up the config
>> files.
>>
>> You need to ensure the flags of every master contains
>> `--cluster=your_cluster_name`.
>>
>> Could you perform `ps aux |grep mesos-master` on every master and paste
>> their outputs here?
>>
>>
>> On Wed, Dec 7, 2016 at 4:39 AM, Haripriya Ayyalasomayajula <
>> aharipriya92@gmail.com> wrote:
>>
>>> Hello, @Haosdent,
>>>
>>> Thanks for suggesting these.
>>> I'm attaching a snapshot of the error I've seen in Chrome with this
>>> email. It'll be great if you can suggest if I'm missing any configuration
>>> or if its some bug.
>>>
>>> And for the second part, my `/master/state` end point does not return
>>> "cluster" anywhere. It returned 75k lines of json so I'm not pasting all of
>>> it.
>>> {
>>>     "activated_slaves": 37.0,
>>>     "build_date": "2016-11-16 01:31:49",
>>>     "build_time": 1479259909.0,
>>>     "build_user": "centos",
>>>     "completed_frameworks": [
>>>         {
>>>             "active": true,
>>>   ..........
>>>
>>>
>>>
>>>     "start_time": 1480967418.42687,
>>>     "unregistered_frameworks": [],
>>>     "version": "1.1.0"
>>> }
>>>
>>> Is it something to do with a stale state of mesos anywhere or the way
>>> I'm passing cluster? I have a config file named cluster in
>>> /etc/mesos-master/ and when I restart the cluster it picks up the config
>>> files.
>>>
>>> On Mon, Dec 5, 2016 at 6:24 PM, haosdent <ha...@gmail.com> wrote:
>>>
>>>> Hi, @Haripriya
>>>>
>>>> > (less than 1 min though the  jobs are running just fine).
>>>> > Is there any new configuration that has to be added?
>>>>
>>>> We change to use JSONP to send requests in WebUI since 1.0 May I have
>>>> your error log in Safari, Chrome and Firefox?
>>>> You could open it via
>>>> https://developers.google.com/web/tools/chrome-devtools/console/
>>>>
>>>> > The UI does not display the name of the cluster despite using the
>>>> --cluster flag.
>>>> --cluster flag works fine for me. May you paste your `/master/state`
>>>> endpoint at the email, I would like to check the value of `cluster` field
>>>> in it.
>>>>
>>>> On Tue, Dec 6, 2016 at 5:34 AM, Haripriya Ayyalasomayajula <
>>>> aharipriya92@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have two issues with the web UI in Mesos 1.1
>>>>>
>>>>> 1.
>>>>>
>>>>> Earlier when I was using Mesos 0.28, mesos web UI would try to
>>>>> reconnect only when there are network issues or when there is a newly
>>>>> elected leader. After upgrade to 1.1, we see that it won't work (shows no
>>>>> leader is elected even when there is a leader elected and jobs are running
>>>>> happily ) on safari, works on chrome and firefox but tries to re-connect
>>>>> very often (less than 1 min though the  jobs are running just fine).
>>>>>
>>>>> Is there any new configuration that has to be added?
>>>>>
>>>>>
>>>>> 2. The UI does not display the name of the cluster despite using the
>>>>> --cluster flag.
>>>>>
>>>>> /usr/sbin/mesos-master
>>>>> --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
>>>>> --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>>>>> --authenticate_frameworks=true --cluster="cluster1"
>>>>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>>>>
>>>>>
>>>>> I also tried adding the name of the cluster without quotes: cluster1
>>>>> instead of "cluster1", but that doesn't work either.
>>>>>
>>>>> /usr/sbin/mesos-master
>>>>> --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
>>>>> --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>>>>> --authenticate_frameworks=true --cluster=cluster1
>>>>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>>>> I greatly appreciate any help!
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Haripriya
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Haosdent Huang
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Haripriya
>>>
>>>
>>
>>
>> --
>> Best Regards,
>> Haosdent Huang
>>
>
>
>
> --
> Regards,
> Haripriya Ayyalasomayajula
>
>


-- 
Best Regards,
Haosdent Huang

Re: Mesos 1.1 web ui issues

Posted by Haripriya Ayyalasomayajula <ah...@gmail.com>.
Hello @Haosdent,

After setting the `--hostname` flag, I still see the error. This is the
output I see in the Chrome developer tools:

Failed to load resource: the server responded with a status of 404 (Not Found)
http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._2 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/master/state?jsonp=angular.callbacks._3 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/master/state?jsonp=angular.callbacks._4 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._5 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/master/state?jsonp=angular.callbacks._6 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._7 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/master/state?jsonp=angular.callbacks._8 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._9 Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/master/state?jsonp=angular.callbacks._a Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._b Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/master/state?jsonp=angular.callbacks._c Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._d Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/master/state?jsonp=angular.callbacks._e Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._f Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/master/state?jsonp=angular.callbacks._g Failed to load resource: net::ERR_NAME_NOT_RESOLVED
http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._h Failed to load resource: net::ERR_NAME_NOT_RESOLVED
angular-1.2.3.min.js:70 GET http://nid00016:5050/master/state?jsonp=angular.callbacks._i net::ERR_NAME_NOT_RESOLVED
    g @ angular-1.2.3.min.js:70
    (anonymous function) @ angular-1.2.3.min.js:71
    D @ angular-1.2.3.min.js:68
    h @ angular-1.2.3.min.js:66
    D @ angular-1.2.3.min.js:91
    D @ angular-1.2.3.min.js:91
    (anonymous function) @ angular-1.2.3.min.js:93
    $eval @ angular-1.2.3.min.js:101
    $digest @ angular-1.2.3.min.js:98
    $apply @ angular-1.2.3.min.js:101
    (anonymous function) @ angular-1.2.3.min.js:111
    e @ angular-1.2.3.min.js:33
    (anonymous function) @ angular-1.2.3.min.js:37
angular-1.2.3.min.js:70 GET http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._j net::ERR_NAME_NOT_RESOLVED
    g @ angular-1.2.3.min.js:70
    (anonymous function) @ angular-1.2.3.min.js:71
    D @ angular-1.2.3.min.js:68
    h @ angular-1.2.3.min.js:66
    D @ angular-1.2.3.min.js:91
    D @ angular-1.2.3.min.js:91
    (anonymous function) @ angular-1.2.3.min.js:93
    $eval @ angular-1.2.3.min.js:101
    $digest @ angular-1.2.3.min.js:98
    $apply @ angular-1.2.3.min.js:101
    (anonymous function) @ angular-1.2.3.min.js:111
    e @ angular-1.2.3.min.js:33
    (anonymous function) @ angular-1.2.3.min.js:37
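
The repeated `net::ERR_NAME_NOT_RESOLVED` errors suggest that the machine running the browser cannot resolve the short name `nid00016`. A minimal Python sketch for checking this from the client side, assuming Python 3 is available there; `can_resolve` is a hypothetical helper for illustration, not part of Mesos:

```python
import socket


def can_resolve(hostname, port=5050, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves on this machine, False otherwise.

    `resolver` is injectable (an assumption made for testability) so the
    check can be exercised without touching real DNS.
    """
    try:
        resolver(hostname, port)
        return True
    except socket.gaierror:
        # Name resolution failed; browsers surface this as a
        # "name not resolved" style error.
        return False


# Example call, run on the machine where the browser lives:
#   can_resolve("nid00016")
```

If this returns False on the workstation but True on the cluster nodes, the advertised hostname is probably only resolvable inside the cluster network.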


Also, regarding the `--cluster` flag, here is the `ps` output from each master:

nid00016: root     14940  2.5  0.0 2080192 85012 ?       Ssl  16:44   0:08
/usr/sbin/mesos-master
--zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
--port=5050 --log_dir=/var/log/mesos
--acls=/etc/mesos_acls.json --authenticate_frameworks=true
--cluster="socrates" --credentials=/etc/marathon-auth/credentials
--hostname=nid00016 --quorum=2 --work_dir=/var/lib/mesos

nid00016: root     14965  0.0  0.0 107892   612 ?        S    16:44   0:00
logger -p user.info -t mesos-master[14940]

nid00016: root     14966  0.0  0.0 107892   692 ?        S    16:44   0:00
logger -p user.err -t mesos-master[14940]

nid00016: root     15892  0.0  0.0 113116  1604 ?        Ss   16:50   0:00
bash -c ps -aux | grep mesos-master

nid00016: root     15959  0.0  0.0 112644   948 ?        S    16:50   0:00
grep mesos-master

nid00032: root     30018  2.5  0.0 2670032 26480 ?       Ssl  16:44   0:08
/usr/sbin/mesos-master
--zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
--port=5050 --log_dir=/var/log/mesos
--acls=/etc/mesos_acls.json --authenticate_frameworks=true
--cluster="socrates" --credentials=/etc/marathon-auth/credentials
--hostname=nid00032 --quorum=2 --work_dir=/var/lib/mesos

nid00032: root     30043  0.0  0.0 107892   612 ?        S    16:44   0:00
logger -p user.info -t mesos-master[30018]

nid00032: root     30044  0.0  0.0 107892   692 ?        S    16:44   0:00
logger -p user.err -t mesos-master[30018]

nid00032: root     31091  0.0  0.0 113116  1604 ?        Ss   16:50   0:00
bash -c ps -aux | grep mesos-master

nid00032: root     31158  0.0  0.0 112644   948 ?        S    16:50   0:00
grep mesos-master

nid00000: root     49753  3.7  0.0 3259912 27584 ?       Ssl  16:44   0:13
/usr/sbin/mesos-master
--zk=zk://192.168.0.1:2181,192.168.0.17:2181,192.168.0.33:2181/mesos
--port=5050 --log_dir=/var/log/mesos
--acls=/etc/mesos_acls.json --authenticate_frameworks=true
--cluster="socrates" --credentials=/etc/marathon-auth/credentials
--hostname=nid00000.local --quorum=2 --work_dir=/var/lib/mesos

nid00000: root     49778  0.0  0.0 107892   612 ?        S    16:44   0:00
logger -p user.info -t mesos-master[49753]

nid00000: root     49779  0.0  0.0 107892   692 ?        S    16:44   0:00
logger -p user.err -t mesos-master[49753]

nid00000: root     50887  0.0  0.0 113116  1604 ?        Ss   16:50   0:00
bash -c ps -aux | grep mesos-master

nid00000: root     50954  0.0  0.0 112648   948 ?        S    16:50   0:00
grep mesos-master
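
One way to compare the flags across masters is to parse each `mesos-master` command line from the `ps` output. The sketch below is illustrative only: `parse_flags` is a hypothetical helper, and it splits on whitespace so that values are kept exactly as `ps` printed them, including any literal quotes.

```python
def parse_flags(cmdline):
    """Map `--name=value` tokens of a command line to {name: value}.

    Tokens are split on whitespace only, so a value such as "socrates"
    keeps its quotes if `ps` printed them.
    """
    flags = {}
    for token in cmdline.split():
        if token.startswith("--") and "=" in token:
            name, value = token[2:].split("=", 1)
            flags[name] = value
    return flags


# Command lines abbreviated from the ps output in this message.
masters = {
    "nid00016": '/usr/sbin/mesos-master --cluster="socrates" --hostname=nid00016 --quorum=2',
    "nid00000": '/usr/sbin/mesos-master --cluster="socrates" --hostname=nid00000.local --quorum=2',
}

for node, cmdline in masters.items():
    flags = parse_flags(cmdline)
    print(node, flags.get("hostname"), flags.get("cluster"))
```

In the output pasted here, two masters advertise short hostnames while nid00000 advertises `nid00000.local`; a side-by-side comparison like this makes such differences easy to spot.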

On Tue, Dec 6, 2016 at 6:58 PM, haosdent <ha...@gmail.com> wrote:

> Hi, @Haripriya It looks like there are some problems in your master flags.
>
> > I'm attaching a snapshot of the error I've seen in Chrome with this
> email. It'll be great if you can suggest if I'm missing any configuration
> or if its some bug.
> According to the screenshot you attached, the hostnames are incorrect on
> your servers. Mesos WebUI depends on that to find the leading master.
> A workaround is to specify the `--hostname` flag when starting your
> masters. For example, launch your masters with
>
> ```
> $ mesos-master --hostname=socrates-nid000xxx.us.cray.com xxx
> ```
>
> > Is it something to do with a stale state of mesos anywhere or the way
> I'm passing cluster? I have a config file named cluster in
> /etc/mesos-master/ and when I restart the cluster it picks up the config
> files.
>
> You need to ensure the flags of every master contains
> `--cluster=your_cluster_name`.
>
> Could you perform `ps aux |grep mesos-master` on every master and paste
> their outputs here?
>
>
> On Wed, Dec 7, 2016 at 4:39 AM, Haripriya Ayyalasomayajula <
> aharipriya92@gmail.com> wrote:
>
>> Hello, @Haosdent,
>>
>> Thanks for suggesting these.
>> I'm attaching a snapshot of the error I've seen in Chrome with this
>> email. It'll be great if you can suggest if I'm missing any configuration
>> or if its some bug.
>>
>> And for the second part, my `/master/state` end point does not return
>> "cluster" anywhere. It returned 75k lines of json so I'm not pasting all of
>> it.
>> {
>>     "activated_slaves": 37.0,
>>     "build_date": "2016-11-16 01:31:49",
>>     "build_time": 1479259909.0,
>>     "build_user": "centos",
>>     "completed_frameworks": [
>>         {
>>             "active": true,
>>   ..........
>>
>>
>>
>>     "start_time": 1480967418.42687,
>>     "unregistered_frameworks": [],
>>     "version": "1.1.0"
>> }
>>
>> Is it something to do with a stale state of mesos anywhere or the way I'm
>> passing cluster? I have a config file named cluster in /etc/mesos-master/
>> and when I restart the cluster it picks up the config files.
>>
>> On Mon, Dec 5, 2016 at 6:24 PM, haosdent <ha...@gmail.com> wrote:
>>
>>> Hi, @Haripriya
>>>
>>> > (less than 1 min though the  jobs are running just fine).
>>> > Is there any new configuration that has to be added?
>>>
>>> We change to use JSONP to send requests in WebUI since 1.0 May I have
>>> your error log in Safari, Chrome and Firefox?
>>> You could open it via
>>> https://developers.google.com/web/tools/chrome-devtools/console/
>>>
>>> > The UI does not display the name of the cluster despite using the
>>> --cluster flag.
>>> --cluster flag works fine for me. May you paste your `/master/state`
>>> endpoint at the email, I would like to check the value of `cluster` field
>>> in it.
>>>
>>> On Tue, Dec 6, 2016 at 5:34 AM, Haripriya Ayyalasomayajula <
>>> aharipriya92@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I have two issues with the web UI in Mesos 1.1
>>>>
>>>> 1.
>>>>
>>>> Earlier when I was using Mesos 0.28, mesos web UI would try to
>>>> reconnect only when there are network issues or when there is a newly
>>>> elected leader. After upgrade to 1.1, we see that it won't work (shows no
>>>> leader is elected even when there is a leader elected and jobs are running
>>>> happily ) on safari, works on chrome and firefox but tries to re-connect
>>>> very often (less than 1 min though the  jobs are running just fine).
>>>>
>>>> Is there any new configuration that has to be added?
>>>>
>>>>
>>>> 2. The UI does not display the name of the cluster despite using the
>>>> --cluster flag.
>>>>
>>>> /usr/sbin/mesos-master
>>>> --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
>>>> --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>>>> --authenticate_frameworks=true --cluster="cluster1"
>>>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>>>
>>>>
>>>> I also tried adding the name of the cluster without quotes: cluster1
>>>> instead of "cluster1", but that doesn't work either.
>>>>
>>>> /usr/sbin/mesos-master
>>>> --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
>>>> --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>>>> --authenticate_frameworks=true --cluster=cluster1
>>>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>>> I greatly appreciate any help!
>>>>
>>>> --
>>>> Thanks,
>>>> Haripriya
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Haosdent Huang
>>>
>>
>>
>>
>> --
>> Thanks,
>> Haripriya
>>
>>
>
>
> --
> Best Regards,
> Haosdent Huang
>



-- 
Regards,
Haripriya Ayyalasomayajula

Re: Mesos 1.1 web ui issues

Posted by haosdent <ha...@gmail.com>.
Hi, @Haripriya. It looks like there are some problems with your master flags.

> I'm attaching a snapshot of the error I've seen in Chrome with this
> email. It'll be great if you can suggest if I'm missing any configuration
> or if its some bug.

According to the screenshot you attached, the hostnames are incorrect on
your servers. The Mesos WebUI depends on them to find the leading master.
A workaround is to specify the `--hostname` flag when starting your
masters. For example, launch your masters with

```
$ mesos-master --hostname=socrates-nid000xxx.us.cray.com xxx
```

> Is it something to do with a stale state of mesos anywhere or the way I'm
> passing cluster? I have a config file named cluster in /etc/mesos-master/
> and when I restart the cluster it picks up the config files.

You need to ensure that the flags of every master contain
`--cluster=your_cluster_name`.

Could you run `ps aux | grep mesos-master` on every master and paste
the output here?


On Wed, Dec 7, 2016 at 4:39 AM, Haripriya Ayyalasomayajula <
aharipriya92@gmail.com> wrote:

> Hello, @Haosdent,
>
> Thanks for suggesting these.
> I'm attaching a snapshot of the error I've seen in Chrome with this email.
> It'll be great if you can suggest if I'm missing any configuration or if
> its some bug.
>
> And for the second part, my `/master/state` end point does not return
> "cluster" anywhere. It returned 75k lines of json so I'm not pasting all of
> it.
> {
>     "activated_slaves": 37.0,
>     "build_date": "2016-11-16 01:31:49",
>     "build_time": 1479259909.0,
>     "build_user": "centos",
>     "completed_frameworks": [
>         {
>             "active": true,
>   ..........
>
>
>
>     "start_time": 1480967418.42687,
>     "unregistered_frameworks": [],
>     "version": "1.1.0"
> }
>
> Is it something to do with a stale state of mesos anywhere or the way I'm
> passing cluster? I have a config file named cluster in /etc/mesos-master/
> and when I restart the cluster it picks up the config files.
>
> On Mon, Dec 5, 2016 at 6:24 PM, haosdent <ha...@gmail.com> wrote:
>
>> Hi, @Haripriya
>>
>> > (less than 1 min though the  jobs are running just fine).
>> > Is there any new configuration that has to be added?
>>
>> We change to use JSONP to send requests in WebUI since 1.0 May I have
>> your error log in Safari, Chrome and Firefox?
>> You could open it via
>> https://developers.google.com/web/tools/chrome-devtools/console/
>>
>> > The UI does not display the name of the cluster despite using the
>> --cluster flag.
>> --cluster flag works fine for me. May you paste your `/master/state`
>> endpoint at the email, I would like to check the value of `cluster` field
>> in it.
>>
>> On Tue, Dec 6, 2016 at 5:34 AM, Haripriya Ayyalasomayajula <
>> aharipriya92@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have two issues with the web UI in Mesos 1.1
>>>
>>> 1.
>>>
>>> Earlier when I was using Mesos 0.28, mesos web UI would try to reconnect
>>> only when there are network issues or when there is a newly elected leader.
>>> After upgrade to 1.1, we see that it won't work (shows no leader is elected
>>> even when there is a leader elected and jobs are running happily ) on
>>> safari, works on chrome and firefox but tries to re-connect very often
>>> (less than 1 min though the  jobs are running just fine).
>>>
>>> Is there any new configuration that has to be added?
>>>
>>>
>>> 2. The UI does not display the name of the cluster despite using the
>>> --cluster flag.
>>>
>>> /usr/sbin/mesos-master
>>> --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
>>> --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>>> --authenticate_frameworks=true --cluster="cluster1"
>>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>>
>>>
>>> I also tried adding the name of the cluster without quotes: cluster1
>>> instead of "cluster1", but that doesn't work either.
>>>
>>> /usr/sbin/mesos-master
>>> --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
>>> --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>>> --authenticate_frameworks=true --cluster=cluster1
>>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>> I greatly appreciate any help!
>>>
>>> --
>>> Thanks,
>>> Haripriya
>>>
>>
>>
>>
>> --
>> Best Regards,
>> Haosdent Huang
>>
>
>
>
> --
> Thanks,
> Haripriya
>
>


-- 
Best Regards,
Haosdent Huang

Re: Mesos 1.1 web ui issues

Posted by Haripriya Ayyalasomayajula <ah...@gmail.com>.
Hello, @Haosdent,

Thanks for the suggestions.
I'm attaching a screenshot of the error I see in Chrome to this email.
It would be great if you could tell me whether I'm missing any configuration
or whether it's a bug.

As for the second part, my `/master/state` endpoint does not return
"cluster" anywhere. It returned 75k lines of JSON, so I'm not pasting all of
it.
{
    "activated_slaves": 37.0,
    "build_date": "2016-11-16 01:31:49",
    "build_time": 1479259909.0,
    "build_user": "centos",
    "completed_frameworks": [
        {
            "active": true,
  ..........



    "start_time": 1480967418.42687,
    "unregistered_frameworks": [],
    "version": "1.1.0"
}

Is it something to do with stale Mesos state somewhere, or with the way I'm
passing the cluster name? I have a config file named `cluster` in
/etc/mesos-master/, and when I restart the cluster it picks up the config files.
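
A quick way to check whether a master reports a cluster name is to load the `/master/state` output and look only at the relevant top-level keys rather than reading all 75k lines. A minimal sketch, assuming the endpoint is reachable without authentication from where the script runs; `fetch_state` and `summarize_state` are hypothetical helper names:

```python
import json
import urllib.request


def fetch_state(base_url):
    """Fetch and decode `/master/state` from a master, e.g. http://host:5050."""
    with urllib.request.urlopen(base_url + "/master/state") as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_state(state):
    """Return the top-level `cluster` and `version` values (None if absent)."""
    return {key: state.get(key) for key in ("cluster", "version")}


# Offline example using fields visible in the excerpt above; no network needed.
excerpt = {"activated_slaves": 37.0, "build_user": "centos", "version": "1.1.0"}
print(summarize_state(excerpt))
```

A `cluster` value of `None` here would mean the key is missing from the top level of the state document.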

On Mon, Dec 5, 2016 at 6:24 PM, haosdent <ha...@gmail.com> wrote:

> Hi, @Haripriya
>
> > (less than 1 min though the  jobs are running just fine).
> > Is there any new configuration that has to be added?
>
> We change to use JSONP to send requests in WebUI since 1.0 May I have your
> error log in Safari, Chrome and Firefox?
> You could open it via
> https://developers.google.com/web/tools/chrome-devtools/console/
>
> > The UI does not display the name of the cluster despite using the
> --cluster flag.
> --cluster flag works fine for me. May you paste your `/master/state`
> endpoint at the email, I would like to check the value of `cluster` field
> in it.
>
> On Tue, Dec 6, 2016 at 5:34 AM, Haripriya Ayyalasomayajula <
> aharipriya92@gmail.com> wrote:
>
>> Hi all,
>>
>> I have two issues with the web UI in Mesos 1.1
>>
>> 1.
>>
>> Earlier when I was using Mesos 0.28, mesos web UI would try to reconnect
>> only when there are network issues or when there is a newly elected leader.
>> After upgrade to 1.1, we see that it won't work (shows no leader is elected
>> even when there is a leader elected and jobs are running happily ) on
>> safari, works on chrome and firefox but tries to re-connect very often
>> (less than 1 min though the  jobs are running just fine).
>>
>> Is there any new configuration that has to be added?
>>
>>
>> 2. The UI does not display the name of the cluster despite using the
>> --cluster flag.
>>
>> /usr/sbin/mesos-master
>> --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
>> --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>> --authenticate_frameworks=true --cluster="cluster1"
>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>
>>
>> I also tried adding the name of the cluster without quotes: cluster1
>> instead of "cluster1", but that doesn't work either.
>>
>> /usr/sbin/mesos-master
>> --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos --port=5050
>> --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>> --authenticate_frameworks=true --cluster=cluster1
>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>> I greatly appreciate any help!
>>
>> --
>> Thanks,
>> Haripriya
>>
>
>
>
> --
> Best Regards,
> Haosdent Huang
>



-- 
Thanks,
Haripriya

Re: Mesos 1.1 web ui issues

Posted by haosdent <ha...@gmail.com>.
Hi, @Haripriya

> (less than 1 min though the  jobs are running just fine).
> Is there any new configuration that has to be added?

The WebUI has used JSONP to send requests since 1.0. Could you share the
error logs from Safari, Chrome and Firefox?
In Chrome, you can open the console as described at
https://developers.google.com/web/tools/chrome-devtools/console/

> The UI does not display the name of the cluster despite using the
> --cluster flag.

The --cluster flag works fine for me. Could you paste the output of your
`/master/state` endpoint in the email? I would like to check the value of
the `cluster` field in it.
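
For context on the JSONP change: the WebUI requests reported elsewhere in this thread take the form `/master/state?jsonp=<callback>`, and a JSONP response conventionally wraps the JSON payload in a call to that callback. A small sketch of unwrapping such a response, assuming the conventional `callback(<json>);` shape; `unwrap_jsonp` is a hypothetical helper, not a Mesos API:

```python
import json


def unwrap_jsonp(body):
    """Extract the JSON payload from a `callback(<json>);` response body."""
    start = body.index("(")   # first parenthesis opens the callback call
    end = body.rindex(")")    # last parenthesis closes it
    return json.loads(body[start + 1:end])


# Sample body made up for illustration; the callback name mirrors the
# pattern seen in the browser console.
sample = 'angular.callbacks._3({"version": "1.1.0"});'
print(unwrap_jsonp(sample))
```

This can help when inspecting a saved response by hand, since the wrapped body is not valid JSON on its own.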

On Tue, Dec 6, 2016 at 5:34 AM, Haripriya Ayyalasomayajula <
aharipriya92@gmail.com> wrote:

> Hi all,
>
> I have two issues with the web UI in Mesos 1.1
>
> 1.
>
> Earlier when I was using Mesos 0.28, mesos web UI would try to reconnect
> only when there are network issues or when there is a newly elected leader.
> After upgrade to 1.1, we see that it won't work (shows no leader is elected
> even when there is a leader elected and jobs are running happily ) on
> safari, works on chrome and firefox but tries to re-connect very often
> (less than 1 min though the  jobs are running just fine).
>
> Is there any new configuration that has to be added?
>
>
> 2. The UI does not display the name of the cluster despite using the
> --cluster flag.
>
> /usr/sbin/mesos-master --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos
> --port=5050 --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
> --authenticate_frameworks=true --cluster="cluster1" --credentials=/etc/auth/credentials
> --quorum=2 --work_dir=/var/lib/mesos
>
>
> I also tried adding the name of the cluster without quotes: cluster1
> instead of "cluster1", but that doesn't work either.
>
> /usr/sbin/mesos-master --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos
>  --port=5050 --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
> --authenticate_frameworks=true --cluster=cluster1 --credentials=/etc/auth/credentials
> --quorum=2 --work_dir=/var/lib/mesos
> I greatly appreciate any help!
>
> --
> Thanks,
> Haripriya
>



-- 
Best Regards,
Haosdent Huang