Posted to user@ignite.apache.org by es70 <eu...@gmail.com> on 2018/09/13 11:45:55 UTC

./control.sh --baseline doesn't list nodes on a linux machines

I have a cluster of three RHEL machines running Ignite 2.6. The nodes seem to
see each other, since Visor shows them in the output of the TOP command
<http://apache-ignite-users.70518.x6.nabble.com/file/t2034/visor_-_top.jpg>,
so I want to join the nodes into a cluster. I run the ./control.sh --baseline
command on one of the machines to see the list of available nodes. The
command gives me an empty list.

-------------
[root@adp-apachelg01 bin]# ./control.sh --baseline
Control utility [ver. 2.6.0#20180710-sha1:669feacc]
2018 Copyright(C) Apache Software Foundation
User: root
--------------------------------------------------------------------------------
Cluster state: inactive
Current topology version: 4

Baseline nodes not found.
----------------

The question: what's wrong with it? Please find attached the config files
which I use to start the nodes:
cluster-ignite.xml
<http://apache-ignite-users.70518.x6.nabble.com/file/t2034/cluster-ignite.xml>
cluster-default.xml
<http://apache-ignite-users.70518.x6.nabble.com/file/t2034/cluster-default.xml>

Before starting the nodes I ran the service.sh file to amend the firewall
settings:
service.sh
<http://apache-ignite-users.70518.x6.nabble.com/file/t2034/service.sh>

----------------------------
Before running Ignite on the Linux machines I did the same on Windows
machines: the ./control.sh --baseline command gave me the expected listing of
the nodes and I was able to join them into a cluster.

-----------------------
One thing to note: I also started Visor and a node (with the same config as
above) on my Windows machine. The ./control.sh --baseline command listed the
running nodes, and Visor showed my Windows nodes. Then I started my Linux
nodes. The control.sh command stopped showing the nodes, and the TOP command
in Visor showed just my Linux nodes.

Please advise!!

regards,
Evgeny



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by ezhuravlev <e....@gmail.com>.
> By the way - I wonder what is the right way to reset an Ignite instance to
> the initial state (the one I have when Ignite is just installed)?

You should delete the work and db directories.

> I could not activate the new cluster. When control.sh
> --baseline tried to connect to the cluster it hung

Can you share the logs from the nodes?

I'd recommend checking all processes on the machine: the topology version
you mentioned could be caused by a Visor instance running somewhere.

Evgenii
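
The reset advice above could be scripted roughly as follows. This is only a
sketch: it assumes the default layout in which the work directory is
<IGNITE_HOME>/work and the db directory sits inside it, and the names
reset_ignite_state and extra_dirs are made up for illustration. If your
config moves the persistence files elsewhere, pass those paths explicitly.
Stop the node before running anything like this.

```python
import shutil
from pathlib import Path


def reset_ignite_state(ignite_home, extra_dirs=()):
    """Delete the node's work directory and any extra storage directories.

    Assumption: the work directory is <ignite_home>/work (default layout).
    extra_dirs is a hypothetical hook for db/WAL paths set in a custom config.
    Returns the list of directories that were actually removed.
    """
    removed = []
    for path in [Path(ignite_home) / "work", *map(Path, extra_dirs)]:
        if path.is_dir():
            shutil.rmtree(path)  # irreversible: all persisted data is lost
            removed.append(str(path))
    return removed
```

Calling it a second time on the same directory returns an empty list, so it
is safe to run on every machine of the cluster in turn.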




Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by es70 <eu...@gmail.com>.
Before I got the cluster into the active state, I had deleted the /work
folders on the 3 machines.

Then I deleted the /work folders once again and added a Windows machine
(amending configs and firewall accordingly) to those 3, to get a mixed
cluster of 4 machines. I could not activate the new cluster. When control.sh
--baseline tried to connect to the cluster, it hung.

By the way, I wonder what is the right way to reset an Ignite instance to
the initial state (the one I have when Ignite is just installed)?

I used service.sh set_firewall to add rules to the firewall, and I added new
ports to that file. So my iptables chain now looks like this:

Chain INPUT_direct (1 references)
num  target     prot opt source               destination
1    ACCEPT     tcp  --  anywhere             anywhere             multiport sports memcache:11220,47500:47509,47100:47109,58500:58520
2    ACCEPT     udp  --  anywhere             anywhere             multiport sports 47400:47409
3    ACCEPT     tcp  --  anywhere             anywhere             multiport dports memcache:11220,47500:47509,47100:47109,58500:58520
4    ACCEPT     udp  --  anywhere             anywhere             multiport dports 47400:47409
5    ACCEPT     all  --  anywhere             anywhere             PKTTYPE = multicast
6    ACCEPT     tcp  --  anywhere             anywhere             multiport sports memcache:11220,47500:47509,47100:47109,58500:58520,58100:58109
7    ACCEPT     udp  --  anywhere             anywhere             multiport sports 47400:47409,58400:58409
8    ACCEPT     tcp  --  anywhere             anywhere             multiport dports memcache:11220,47500:47509,47100:47109,58500:58520,58100:58109
9    ACCEPT     udp  --  anywhere             anywhere             multiport dports 47400:47409,58400:58409
10   ACCEPT     tcp  --  anywhere             anywhere             multiport sports gap,memcache:11220,47500:47509,47100:47109,58500:58520,58100:58109
11   ACCEPT     tcp  --  anywhere             anywhere             multiport dports gap,memcache:11220,47500:47509,47100:47109,58500:58520,58100:58109
12   ACCEPT     tcp  --  anywhere             anywhere             multiport sports gap:10801,memcache:11220,47500:47509,47100:47109,58500:58520,58100:58109
13   ACCEPT     tcp  --  anywhere             anywhere             multiport dports gap:10801,memcache:11220,47500:47509,47100:47109,58500:58520,58100:58109


regards, 
Evgeny




Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by ezhuravlev <e....@gmail.com>.
If you have just started 3 nodes and the topology version is 20, then you
probably have Visor running somewhere, because even daemon nodes can store
the topology version.

Evgenii




Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by Alex Plehanov <pl...@gmail.com>.
Your previous post contained the message "Current topology version: 20".
That does not look like a cluster of 3 nodes that has just been started. So
I think the cluster was not restarted correctly and was still in
compatibility mode. In your last post, it was restarted correctly and is no
longer in compatibility mode. But if you connect Visor to the cluster, it
will be switched to compatibility mode again.
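
The comparison between topology version and node count can be checked
mechanically. The following is a rough sketch, not part of Ignite: it parses
the text printed by ./control.sh --baseline in the format shown in this
thread, and the function names and the "version no larger than node count"
rule are assumptions for illustration. The rule relies on each node join
bumping the topology version by one, so extra joins (Visor, leftover
processes) push the version above the number of nodes you started.

```python
import re


def parse_baseline_output(text):
    """Pull state, topology version and baseline nodes out of the
    `./control.sh --baseline` output (format as pasted in this thread)."""
    state = re.search(r"Cluster state:\s*(\w+)", text)
    top_ver = re.search(r"Current topology version:\s*(\d+)", text)
    # Each baseline entry looks like: ConsistentID=<id>, STATE=<state>
    nodes = re.findall(r"ConsistentID=([^,\s]+),\s*STATE=(\w+)", text)
    return {
        "state": state.group(1) if state else None,
        "topology_version": int(top_ver.group(1)) if top_ver else None,
        "baseline": nodes,
    }


def looks_freshly_started(info, expected_nodes):
    """Heuristic only: a cluster whose nodes were each started once should
    have a topology version no larger than the number of nodes."""
    ver = info["topology_version"]
    return ver is not None and ver <= expected_nodes
```

With the outputs from this thread, version 3 on a 3-node cluster passes the
check, while version 20 does not and is worth investigating.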

Fri, 14 Sep 2018 at 11:09, es70 <eu...@gmail.com> wrote:

> UPDATE to the previous post: after my last message I started Visor and
> saw that there were some Ignite processes running. It looks like I had
> started some earlier. So I killed those leftovers and started the cluster
> afresh. Now I can see the baseline:
>
> [root@adp-apachelg01 bin]# ./control.sh --baseline
> Control utility [ver. 2.6.0#20180710-sha1:669feacc]
> 2018 Copyright(C) Apache Software Foundation
> User: root
>
> --------------------------------------------------------------------------------
> Cluster state: active
> Current topology version: 3
>
> Baseline nodes:
>     ConsistentID=4b799cb2-3953-48fb-b25f-88ac40ce0726, STATE=ONLINE
>     ConsistentID=d8acbf66-f6e2-4876-b3f0-caf3846b50f5, STATE=ONLINE
>     ConsistentID=dbc9fd4e-f2d0-4e65-85c6-bdd3fdbd7d6d, STATE=ONLINE
>
> --------------------------------------------------------------------------------
> Number of baseline nodes: 3
>
> Other nodes not found.

Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by es70 <eu...@gmail.com>.
UPDATE to the previous post: after my last message I started Visor and
saw that there were some Ignite processes running. It looks like I had
started some earlier. So I killed those leftovers and started the cluster
afresh. Now I can see the baseline:

[root@adp-apachelg01 bin]# ./control.sh --baseline
Control utility [ver. 2.6.0#20180710-sha1:669feacc]
2018 Copyright(C) Apache Software Foundation
User: root
--------------------------------------------------------------------------------
Cluster state: active
Current topology version: 3

Baseline nodes:
    ConsistentID=4b799cb2-3953-48fb-b25f-88ac40ce0726, STATE=ONLINE
    ConsistentID=d8acbf66-f6e2-4876-b3f0-caf3846b50f5, STATE=ONLINE
    ConsistentID=dbc9fd4e-f2d0-4e65-85c6-bdd3fdbd7d6d, STATE=ONLINE
--------------------------------------------------------------------------------
Number of baseline nodes: 3

Other nodes not found.





Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by es70 <eu...@gmail.com>.
Hi Alex

I followed your advice. The cluster seems to be active, but I still can't
see the baseline nodes.

[root@adp-apachelg01 bin]# ./control.sh --baseline
Control utility [ver. 2.6.0#20180710-sha1:669feacc]
2018 Copyright(C) Apache Software Foundation
User: root
--------------------------------------------------------------------------------
Cluster state: active
Current topology version: 20

Baseline nodes not found.

regards,
Evgeny




Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by Alex Plehanov <pl...@gmail.com>.
There is also a bug (ticket [1]) related to daemon nodes (Visor joins the
cluster as a daemon node). Visor switches the cluster to compatibility mode,
and in this mode baseline topology is not supported. You can read more
details here: [2]. I think this bug may be related to your case too.
Please shut down Visor, restart the cluster, and try to print the baseline
nodes again.

[1]: https://issues.apache.org/jira/browse/IGNITE-8774
[2]:
http://apache-ignite-users.70518.x6.nabble.com/Node-with-BaselineTopology-cannot-join-mixed-cluster-running-in-compatibility-mode-td22200.html

Thu, 13 Sep 2018 at 15:54, es70 <eu...@gmail.com> wrote:

> Alex, please note that I followed the cluster activation routine on a
> Windows machine before doing the same on Linux, and I succeeded in
> activating the cluster.

Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by es70 <eu...@gmail.com>.
Alex, please note that I followed the cluster activation routine on a
Windows machine before doing the same on Linux, and I succeeded in
activating the cluster.




Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by es70 <eu...@gmail.com>.
Hi Alex

Right! I forgot to mention activating the cluster. I did try to activate
it: I called ./control.sh --activate and got the answer "Cluster activated".
But the ./control.sh --baseline command still gave me an empty list. The TOP
command said in its summary that Active = true.

[root@adp-apachelg01 bin]# ./control.sh --baseline
Control utility [ver. 2.6.0#20180710-sha1:669feacc]
2018 Copyright(C) Apache Software Foundation
User: root
--------------------------------------------------------------------------------
Cluster state: active
Current topology version: 3

Baseline nodes not found.


Please advise

regards,
Evgeny




Re: ./control.sh --baseline doesn't list nodes on a linux machines

Posted by Alex Plehanov <pl...@gmail.com>.
Hello Eugene,

Did you activate this cluster before?
Baseline topology can be set explicitly (via the control.sh script or the
Java API) or implicitly on the first activation. Currently, your cluster is
not activated. If the cluster has never been activated and you didn't set
the baseline topology explicitly, then the list of baseline nodes will be
empty.
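
The two routes described above can be pictured with a small toy model. This
is plain Python for illustration only: it is not Ignite code, it does not
use the Ignite API, and the class and method names are invented. It also
ignores the compatibility-mode issue discussed elsewhere in this thread.

```python
class ToyCluster:
    """Toy model of when the baseline list becomes non-empty."""

    def __init__(self, server_nodes):
        self.server_nodes = list(server_nodes)
        self.active = False
        self.baseline = None  # None means "never set"

    def set_baseline(self, consistent_ids):
        # Explicit route (control.sh or the Java API in a real cluster).
        self.baseline = list(consistent_ids)

    def activate(self):
        # Implicit route: the first activation adopts the current server
        # nodes as the baseline if none was set before.
        if self.baseline is None:
            self.baseline = list(self.server_nodes)
        self.active = True

    def baseline_listing(self):
        # What a "--baseline"-style listing would show in this model.
        return list(self.baseline) if self.baseline is not None else []
```

In this model a cluster that was never activated and never given an explicit
baseline lists no baseline nodes, which matches the first output in the
thread (state: inactive, "Baseline nodes not found.").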

Thu, 13 Sep 2018 at 14:46, es70 <eu...@gmail.com> wrote:

> I have a cluster of three RHEL machines running Ignite 2.6. The nodes seem
> to see each other, since Visor shows them in the output of the TOP command,
> so I want to join the nodes into a cluster. I run the ./control.sh
> --baseline command on one of the machines to see the list of available
> nodes. The command gives me an empty list.
>
> -------------
> [root@adp-apachelg01 bin]# ./control.sh --baseline
> Control utility [ver. 2.6.0#20180710-sha1:669feacc]
> 2018 Copyright(C) Apache Software Foundation
> User: root
> --------------------------------------------------------------------------------
> Cluster state: inactive
> Current topology version: 4
>
> Baseline nodes not found.
> ----------------
>
> The question: what's wrong with it? Please find attached the config files
> which I use to start the nodes:
> cluster-ignite.xml
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2034/cluster-ignite.xml>
> cluster-default.xml
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2034/cluster-default.xml>
>
> Before starting the nodes I ran the service.sh file to amend the firewall
> settings:
> service.sh
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2034/service.sh>
>
> ----------------------------
> Before running Ignite on the Linux machines I did the same on Windows
> machines: the ./control.sh --baseline command gave me the expected listing
> of the nodes and I was able to join them into a cluster.
>
> -----------------------
> One thing to note: I also started Visor and a node (with the same config as
> above) on my Windows machine. The ./control.sh --baseline command listed
> the running nodes, and Visor showed my Windows nodes. Then I started my
> Linux nodes. The control.sh command stopped showing the nodes, and the TOP
> command in Visor showed just my Linux nodes.
>
> Please advise!!
>
> regards,
> Evgeny