Posted to users@cloudstack.apache.org by Jon Marshall <jm...@hotmail.co.uk> on 2018/06/04 14:29:20 UTC

advanced networking with public IPs direct to VMs

Hi all


I am close to giving up on basic networking as I just cannot get failover working with multiple NICs (I am not even sure it is supported).


What I would like is to use 3 NICs for management, storage and guest traffic. I would like to assign public IPs direct to the VMs which is why I originally chose basic.


If I switch to advanced networking do I just configure a guest VM with public IPs on one NIC and not both with the public traffic -


would this work ?

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Sorry that should say "not bother with the public traffic"


________________________________
From: Jon Marshall <jm...@hotmail.co.uk>
Sent: 04 June 2018 15:29
To: users@cloudstack.apache.org
Subject: advanced networking with public IPs direct to VMs

Hi all


I am close to giving up on basic networking as I just cannot get failover working with multiple NICs (I am not even sure it is supported).


What I would like is to use 3 NICs for management, storage and guest traffic. I would like to assign public IPs direct to the VMs which is why I originally chose basic.


If I switch to advanced networking do I just configure a guest VM with public IPs on one NIC and not both with the public traffic -


would this work ?

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Dag


Thanks for joining in.


I did use a separate network for management (172.30.3.0/26) and storage (172.30.5.0/28) when I configured the zone; it is just that, for some reason, it is not referencing the 172.30.5.x subnet anywhere in the SQL output.


My compute nodes have 3 NICs, one for management, one for guest VM traffic and one for storage, all in different subnets and in different VLANs on the switch.


I also set it up with two NICs just as you suggested, with storage/management on one NIC and guest traffic on the other, and I got exactly the same result, i.e. host in "Alert" state and this in the logs -



2018-06-04 12:53:45,853 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) Agent investigation was requested on host Host[-2-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
2018-06-04 12:53:45,854 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) KVMInvestigator was able to determine host 2 is in Disconnected
2018-06-04 12:53:45,854 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) The agent from host 2 state determined is Disconnected
2018-06-04 12:53:45,854 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) Agent is disconnected but the host is still up: 2-dcp-cscn2.local
2018-06-04 12:53:45,854 WARN  [o.a.c.alerts] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) AlertType:: 7 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host disconnected, name: dcp-cscn2.local (id:2), availability zone: dcp1, pod: dcpp1


The only difference was that when I configured the zone I did not have to configure cloudbr2 (for storage) and did not enter any storage traffic IP subnet range.

I know it is something stupid I am doing 😊


________________________________
From: Dag Sonstebo <Da...@shapeblue.com>
Sent: 06 June 2018 10:49
To: users@cloudstack.apache.org
Subject: Re: advanced networking with public IPs direct to VMs

Hi John,

I’m late to this thread and have possibly missed some things – but a couple of observations:

“When I add the zone and get to the storage web page I exclude the IPs already used for the compute node NICs and the NFS server itself. …..”
“So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10 -> 172.30.5.14.”

I think you may have some confusion around the use of the storage network. The important part here is to understand this is for *secondary storage* use only – it has nothing to do with primary storage. This means this storage network needs to be accessible to the SSVM and to the hypervisors, and secondary storage NFS pools need to be accessible on this network.

The important part – this also means you *cannot use the same IP ranges for management and storage networks* - doing so means you will have issues where effectively both hypervisors and SSVM can see the same subnet on two NICs – and you end up in a routing black hole.

So – you need to either:

1) Use different IP subnets on management and storage, or
2) preferably just simplify your setup – stop using a secondary storage network altogether and just allow secondary storage to use the management network (which is default). Unless you have a very high I/O environment in production you are just adding complexity by running separate management and storage.

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk> wrote:

    I will disconnect the host this morning and test but before I do that I ran this command when all hosts are up -





     select * from cloud.host;
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    | id | name            | uuid                                 | status | type               | private_ip_address | private_netmask | private_mac_address | storage_ip_address | storage_netmask | storage_mac_address | storage_ip_address_2 | storage_mac_address_2 | storage_netmask_2 | cluster_id | public_ip_address | public_netmask  | public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets | cpus | speed | url                                 | fs_type | hypervisor_type | hypervisor_version | ram        | resource | version  | parent | total_size | capabilities | guid                                                          | available | setup | dom0_memory | last_ping  | mgmt_server_id | disconnected        | created             | removed | update_count | resource_state | owner | lastUpdated | engine_state |
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up     | Routing            | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.3        | 255.255.255.128 | 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
    |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up     | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.98       | 255.255.255.128 | 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | Proxy.2-ConsoleProxyResource                                  |         1 |     0 |           0 | 1492390409 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
    |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up     | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.86       | 255.255.255.128 | 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | s-1-VM-NfsSecondaryStorageResource                            |         1 |     0 |           0 | 1492390407 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
    |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up     | Routing            | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.4        | 255.255.255.128 | 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource |         1 |     0 |           0 | 1492450882 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 | Enabled        | NULL  | NULL        | Disabled     |
    |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up     | Routing            | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.5        | 255.255.255.128 | 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 | Enabled        | NULL  | NULL        | Disabled     |
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    5 rows in set (0.00 sec)



    and you can see that it says the storage IP address is the same as the private IP address (the management network).
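
    A narrower query makes the point easier to see; this pulls just the columns
    under discussion (column names taken from the table above):

    mysql> select id, name, type, private_ip_address, storage_ip_address, cluster_id from cloud.host;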


    I also ran the command you provided using the Cluster ID number from the table above -



    mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
    Empty set (0.00 sec)

    mysql>

    So assuming I am reading this correctly that seems to be the issue.


    I am at a loss as to why though.


    I have a separate NIC for storage as described. When I add the zone and get to the storage web page I exclude the IPs already used for the compute node NICs and the NFS server itself. I do this because initially I didn't and the SSVM started using the IP address of the NFS server.


    So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10 -> 172.30.5.14.


    And I used the label "cloudbr2" for storage.


    I must be doing this wrong somehow.


    Any pointers would be much appreciated.




    ________________________________
    From: Rafael Weingärtner <ra...@gmail.com>
    Sent: 05 June 2018 16:13
    To: users
    Subject: Re: advanced networking with public IPs direct to VMs

    That is interesting. Let's see the source of all truth...
    This is the code that is generating that odd message.

    >     List<StoragePoolVO> clusterPools =
    > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
    >         boolean hasNfs = false;
    >         for (StoragePoolVO pool : clusterPools) {
    >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
    >                 hasNfs = true;
    >                 break;
    >             }
    >         }
    >         if (!hasNfs) {
    >             s_logger.warn(
    >                     "Agent investigation was requested on host " + agent +
    > ", but host does not support investigation because it has no NFS storage.
    > Skipping investigation.");
    >             return Status.Disconnected;
    >         }
    >

    There are two possibilities here. You do not have any NFS storage? Is that
    the case? Or maybe, for some reason, the call
    "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
    any NFS storage pools. Looking at "listPoolsByCluster" we will see
    that the following SQL is used:

    Select * from storage_pool where cluster_id = <host'sClusterId> and removed
    > is not null
    >

    Can you run that SQL to see its return when your hosts are marked as
    disconnected?





On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

    > I reran the tests with the 3 NIC setup. When I configured the zone through
    > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
    > traffic and cloudbr2 for NFS as per my original response to you.
    >
    >
    > When I pull the power to the node (dcp-cscn2.local) after about 5 mins
    > the  host status goes to "Alert" but never to "Down"
    >
    >
    > I get this in the logs -
    >
    >
    > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
    > requested on host Host[-4-Routing], but host does not support investigation
    > because it has no NFS storage. Skipping investigation.
    > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
    > determine host 4 is in Disconnected
    > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
    > determined is Disconnected
    > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
    > the host is still up: 4-dcp-cscn2.local
    >
    > I don't understand why it thinks there is no NFS storage as each compute
    > node has a dedicated storage NIC.
    >
    >
    > I also don't understand why it thinks the host is still up ie. what test
    > is it doing to determine that ?
    >
    >
    > Am I just trying to get something working that is not supported ?
    >
    >
    > ________________________________
    > From: Rafael Weingärtner <ra...@gmail.com>
    > Sent: 04 June 2018 15:31
    > To: users
    > Subject: Re: advanced networking with public IPs direct to VMs
    >
    > What type of failover are you talking about?
    > What ACS version are you using?
    > What hypervisor are you using?
    > How are you configuring your NICs in the hypervisor?
    > How are you configuring the traffic labels in ACS?
    >
    > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
    > wrote:
    >
    > > Hi all
    > >
    > >
    > > I am close to giving up on basic networking as I just cannot get failover
    > > working with multiple NICs (I am not even sure it is supported).
    > >
    > >
    > > What I would like is to use 3 NICs for management, storage and guest
    > > traffic. I would like to assign public IPs direct to the VMs which is
    > why I
    > > originally chose basic.
    > >
    > >
    > > If I switch to advanced networking do I just configure a guest VM with
    > > public IPs on one NIC and not both with the public traffic -
    > >
    > >
    > > would this work ?
    > >
    >
    >
    >
    > --
    > Rafael Weingärtner
    >



    --
    Rafael Weingärtner



Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Dag


I am not an SQL expert by any means, but does this not show the hosts are in cluster 1 -


mysql> select name, cluster_id from cloud.host;
+-----------------+------------+
| name            | cluster_id |
+-----------------+------------+
| dcp-cscn1.local |          1 |
| v-2-VM          |       NULL |
| s-1-VM          |       NULL |
| dcp-cscn2.local |          1 |
| dcp-cscn3.local |          1 |
+-----------------+------------+
5 rows in set (0.00 sec)

mysql>

I only have one cluster and those are the hosts I am using.


Jon


________________________________
From: Dag Sonstebo <Da...@shapeblue.com>
Sent: 06 June 2018 19:06
To: users@cloudstack.apache.org
Subject: Re: advanced networking with public IPs direct to VMs

Hi Jon,

Still confused where your primary storage pools are – are you sure your hosts are in cluster 1?

Quick question just to make sure - assuming management/storage is on the same NIC: when I set up basic networking, the physical network has the management and guest icons already there and I just edit the KVM labels. If I am running storage over management, do I need to drag the storage icon to the physical network and use the same KVM label (cloudbr0) as the management, or does CS automatically just use the management NIC, i.e. would I only need to drag the storage icon across in basic setup if I wanted it on a different NIC/IP subnet? (hope that makes sense!)

>> I would do both – set up your 2/3 physical networks, name isn’t that important – but then drag the traffic types to the correct one and make sure the labels are correct.
Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:

    Dag


    Do you mean  check the pools with "Infrastructure -> Primary Storage" and "Infrastructure -> Secondary Storage" within the UI ?


    If so, Primary Storage has a state of Up; secondary storage does not show a state as such, so I am not sure where else to check it?


    Rerun of the command -

    mysql> select * from cloud.storage_pool where cluster_id = 1;
    Empty set (0.00 sec)

    mysql>

    I think it is something to do with my zone creation rather than the NIC/bridge setup, although I can post those if needed.

    I may try to set up just the 2 NIC solution you mentioned although, as I say, I had the same issue with that, i.e. the host goes to "Alert" state with the same error messages.  The only time I can get it to go to "Down" state is when it is all on the single NIC.

    Quick question just to make sure - assuming management/storage is on the same NIC: when I set up basic networking, the physical network has the management and guest icons already there and I just edit the KVM labels. If I am running storage over management, do I need to drag the storage icon to the physical network and use the same KVM label (cloudbr0) as the management, or does CS automatically just use the management NIC, i.e. would I only need to drag the storage icon across in basic setup if I wanted it on a different NIC/IP subnet? (hope that makes sense!)

    On the plus side I have been at this for so long now and done so many rebuilds I could do it in my sleep now 😊


    ________________________________
    From: Dag Sonstebo <Da...@shapeblue.com>
    Sent: 06 June 2018 12:28
    To: users@cloudstack.apache.org
    Subject: Re: advanced networking with public IPs direct to VMs

    Looks OK to me Jon.

    The one thing that throws me is your storage pools – can you rerun your query: select * from cloud.storage_pool where cluster_id = 1;

    Do the pools show up as online in the CloudStack GUI?

    Regards,
    Dag Sonstebo
    Cloud Architect
    ShapeBlue

    On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:

        Don't know whether this helps or not but I logged into the SSVM and ran an ifconfig -


        eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
                ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
                RX packets 141  bytes 20249 (19.7 KiB)
                RX errors 0  dropped 0  overruns 0  frame 0
                TX packets 108  bytes 16287 (15.9 KiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
                ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
                RX packets 56722  bytes 4953133 (4.7 MiB)
                RX errors 0  dropped 44573  overruns 0  frame 0
                TX packets 11224  bytes 1234932 (1.1 MiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
                ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
                RX packets 366191  bytes 435300557 (415.1 MiB)
                RX errors 0  dropped 39456  overruns 0  frame 0
                TX packets 145065  bytes 7978602 (7.6 MiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
                ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
                RX packets 132440  bytes 426362982 (406.6 MiB)
                RX errors 0  dropped 39446  overruns 0  frame 0
                TX packets 67443  bytes 423670834 (404.0 MiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

        lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
                inet 127.0.0.1  netmask 255.0.0.0
                loop  txqueuelen 1  (Local Loopback)
                RX packets 18  bytes 1440 (1.4 KiB)
                RX errors 0  dropped 0  overruns 0  frame 0
                TX packets 18  bytes 1440 (1.4 KiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


        so it has interfaces in both the management and the storage subnets (as well as guest).



        ________________________________
        From: Jon Marshall <jm...@hotmail.co.uk>
        Sent: 06 June 2018 11:08
        To: users@cloudstack.apache.org
        Subject: Re: advanced networking with public IPs direct to VMs

        Hi Rafael


        Thanks for the help, really appreciate it.


        So rerunning that command with all servers up -



        mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
        Empty set (0.00 sec)

        mysql>


        As for the storage IP - no, I'm not setting it to be the management IP when I set up the zone, but the output of the SQL command suggests that is what has happened.

        As I said to Dag I am using a different subnet for storage ie.

        172.30.3.0/26  - management subnet
        172.30.4.0/25 -  guest VM subnet
        172.30.5.0/28 - storage

        the NFS server IP is 172.30.5.2

        Each compute node has 3 NICs with an IP from each subnet (I am assuming the management node only needs an IP in the management network?)

        When I add the zone in the UI I have one physical network with management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
        When I fill in the storage traffic page I use the range 172.30.5.10 - 14 as free IPs, as I exclude the ones already allocated to the compute nodes and the NFS server.

        I think maybe I am doing something wrong in the UI setup but it is not obvious to me what it is.

        What I might try today unless you want me to keep the setup I have for more outputs is to go back to 2 NICs, one for storage/management and one for guest VMs.

        I think with the 2 NIC setup the mistake I made last time when adding the zone was to assume storage would just run over management, so I did not drag and drop the storage icon and assign it to cloudbr0 as with the management, which I think is what I should do?





        ________________________________
        From: Rafael Weingärtner <ra...@gmail.com>
        Sent: 06 June 2018 10:54
        To: users
        Subject: Re: advanced networking with public IPs direct to VMs

        Jon, do not panic, we are here to help you :)
        So, I might have mistyped the SQL query. If you use "select * from
        cloud.storage_pool where cluster_id = 1 and removed is not null", you are
        listing the storage pools that have been removed. Therefore, the right query
        would be "select * from cloud.storage_pool where cluster_id = 1 and removed is null".
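
        To make the difference explicit, here are the two variants side by side
        (a sketch against the same cloud schema - I am assuming the pool type
        column is named pool_type):

        -- pools that still exist, which is what the investigator needs to find
        select id, name, pool_type from cloud.storage_pool where cluster_id = 1 and removed is null;

        -- pools that have already been deleted
        select id, name, pool_type from cloud.storage_pool where cluster_id = 1 and removed is not null;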

        There is also something else I do not understand. You are setting the
        storage IP in the management subnet? I am not sure you should be doing it
        like this. Normally, I set all my storage (primary [when working with NFS]
        and secondary) to IPs in the storage subnet.

        On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Da...@shapeblue.com>
        wrote:

        > <snip - quoted text identical to the messages above>


        --
        Rafael Weingärtner













Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
I did a quick run through and it looked like the same messages as I got with zone wide NFS when it didn't work.


I am going to do some more tests and capture full management logs so I can do a comparison to see if there are any differences and once I have done that I will redo the bug report.


Just to clarify the second point about labels.



When you use the manual setup with basic networking and configure the physical network in the UI, the "Management" and "Guest" icons are already under the physical network part and the "Storage" icon is under "Traffic types".


For both the 2 and 3 NIC setup I configure Management as cloudbr0 and Guest as cloudbr1.


For the 2 NIC setup that is all I do, because by default storage runs across management, so I assume I don't need to do anything else.


For the 3 NIC setup I then drag and drop the Storage icon onto the physical network part and configure it as cloudbr2.


Just wanted to make that clear in case I am doing it wrong.


Will let you know results of tests next week.



________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 15 June 2018 11:4
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Did you notice any problems in the log files when you tested with 2 NICs?
When using cluster-wide NFS storage, the behavior should be the same as
with 3 NICs. There might be something in your configuration. The problem
for zone-wide storage is what we discussed before, though.
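
If you want to confirm this from the database, a query along these lines
should show it (I am assuming the 4.11 schema here, where zone-wide pools
are stored with scope = 'ZONE' and a NULL cluster_id):

select id, name, scope, cluster_id, pool_type from cloud.storage_pool where removed is null;

A zone-wide NFS pool would come back with cluster_id = NULL, which is why
the per-cluster lookup in the KVMInvestigator code quoted earlier finds no
NFS pool for the host's cluster and skips the investigation.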

1) if I want to run the management/storage traffic over the same NIC the
NFS server needs to be in the management subnet
No. You should be able to set up different network ranges for each one of
them.

2) when I do the initial configuration I need to drag and drop the storage
icon and use the same label as the management traffic
If you are using only two NICs, you definitely need to configure the traffic
labels accordingly. I mean, since you have only two NICs, you need to
configure the labels (cloudbr0 and cloudbr2) in that physical network tab
in the zone configuration.


On Thu, Jun 14, 2018 at 5:03 PM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Hi Rafael
>
>
> I did log a bug but when rebuilding I found some slightly different
> behaviour so have temporarily removed it.
>
>
> So using cluster NFS and 3 NICs as already described VM HA works.
>
>
> Because the recommendation for basic network setup seems to be to run
> storage/management over the same NIC and guest on another, so 2 NICs in
> total, I set it up this way using cluster NFS and, to my surprise, VM HA did
> not work, so it is obviously a bit more complicated than it first appeared.
>
>
> My NFS server is on a different subnet than the management server, and when
> I set it up in the UI, because the storage traffic runs over the management
> NIC by default, I did not assign a label to the storage traffic, i.e. I only
> assigned labels to management and guest.
>
>
> So two thoughts occur which I can test unless you can see the issue -
>
>
> 1) if I want to run the management/storage traffic over the same NIC the
> NFS server needs to be in the management subnet
>
>
> or
>
>
> 2) when I do the initial configuration I need to drag and drop the storage
> icon and use the same label as the management traffic
>
>
> Personally I can't see how 2) will help, i.e. the only time I should need to
> assign a label to storage is if I use a different NIC.
>
>
> Apologies for bringing this up again but am happy to run any tests and
> would like to file accurate bug report.
>
>
>
>
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 11 June 2018 10:58
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Well, it seems that you have found a bug. Can you fill out an issue report
> on Github?
>
> Thanks for the hard work on debugging and testing.
>
> On Fri, Jun 8, 2018 at 2:17 PM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > So based on Erik's suggestion (thanks Erik) I rebuilt the management
> > server and set up cluster-wide primary storage as opposed to zone-wide,
> > which I have been using so far.
> >
> >
> > Still using 3 NICs (management/Guest/storage) and basic networking.
> >
> >
> > And VM HA now works. In addition it failed over quicker than it did when I
> > had zone wide NFS storage on a single NIC.
> >
> >
> > Still a bit confused about this output where it is still showing the
> > storage_ip_addresses as 172.30.3.x IPs which is the management subnet but
> > maybe I am reading it incorrectly.
> >
> >
> >
> > mysql> select * from cloud.host;
> > +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
> > | id | name            | uuid                                 | status | type               | private_ip_address | private_netmask | private_mac_address | storage_ip_address | storage_netmask | storage_mac_address | storage_ip_address_2 | storage_mac_address_2 | storage_netmask_2 | cluster_id | public_ip_address | public_netmask  | public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets | cpus | speed | url                                 | fs_type | hypervisor_type | hypervisor_version | ram        | resource | version  | parent | total_size | capabilities | guid                                                          | available | setup | dom0_memory | last_ping  | mgmt_server_id | disconnected        | created             | removed | update_count | resource_state | owner | lastUpdated | engine_state |
> > +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
> > |  1 | dcp-cscn1.local | 372c738c-5370-4b46-9358-14b649c73d6b | Up     | Routing            | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.3        | 255.255.255.128 | 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource |         1 |     0 |           0 | 1492635804 |   146457912294 | 2018-06-08 11:57:31 | 2018-06-08 11:19:36 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
> > |  2 | s-1-VM          | 8f40f8a9-1e97-4fce-a143-a8c9489a6d5b | Up     | SecondaryStorageVM | 172.30.3.55        | 255.255.255.192 | 1e:00:ab:00:00:24   | 172.30.3.55        | 255.255.255.192 | 1e:00:ab:00:00:24   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.66       | 255.255.255.128 | 1e:00:c4:00:00:67  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | s-1-VM-NfsSecondaryStorageResource                            |         1 |     0 |           0 | 1492635804 |   146457912294 | 2018-06-08 11:57:31 | 2018-06-08 11:21:51 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
> > |  3 | v-2-VM          | fe5b4f8b-b5b0-4018-8ee2-4c5384da073b | Up     | ConsoleProxy       | 172.30.3.40        | 255.255.255.192 | 1e:00:96:00:00:15   | 172.30.3.40        | 255.255.255.192 | 1e:00:96:00:00:15   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.62       | 255.255.255.128 | 1e:00:01:00:00:63  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | Proxy.2-ConsoleProxyResource                                  |         1 |     0 |           0 | 1492635804 |   146457912294 | 2018-06-08 11:57:31 | 2018-06-08 11:22:03 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
> > |  4 | dcp-cscn2.local | 935260eb-a80c-4ead-85d7-3df8212e301b | Down   | Routing            | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.4        | 255.255.255.128 | 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource |         1 |     0 |           0 | 1492635804 |           NULL | 2018-06-08 11:57:31 | 2018-06-08 11:35:07 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
> > |  5 | dcp-cscn3.local | f3cabc9e-9679-4d7e-8297-b6765eea2770 | Up     | Routing            | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.5        | 255.255.255.128 | 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource |         1 |     0 |           0 | 1492635804 |   146457912294 | 2018-06-08 11:57:31 | 2018-06-08 11:36:27 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
> > +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
> > 5 rows in set (0.00 sec)
> >
> > mysql>
> >
> > So is this some sort of bug, maybe?
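
For readability, the interesting columns in a dump like that can be pulled out
with a narrower projection; a sketch, assuming the cloud.host schema shown
above:

    select id, name, status, type, private_ip_address, storage_ip_address, cluster_id
    from cloud.host;

That makes it much easier to see that storage_ip_address simply mirrors
private_ip_address for every host, and that only the routing hosts carry a
cluster_id.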
> >
> > ________________________________
> > From: Erik Weber <te...@gmail.com>
> > Sent: 08 June 2018 10:15
> > To: users@cloudstack.apache.org
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > While someone ponders the zone-wide storage question, you could try adding a
> > cluster-wide NFS storage pool and see if the rest works in that setup.
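
Once such a pool is added, a quick way to confirm it registered with the right
scope (assuming the same cloud.storage_pool schema quoted elsewhere in this
thread) is:

    select id, name, pool_type, scope, cluster_id
    from cloud.storage_pool
    where removed is null;

A cluster-scoped NFS pool should show scope CLUSTER and a non-NULL cluster_id,
which is what the KVM investigator needs to find.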
> >
> > Erik
> >
> > On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > Yes, all basic. I read a ShapeBlue doc that recommended splitting traffic
> > > across multiple NICs even in basic networking mode, so that is what I am
> > > trying to do.
> > >
> > >
> > > With single NIC you do not get the NFS storage message.
> > >
> > >
> > > I have the entire management server logs for both scenarios after I pulled
> > > the power to one of the compute nodes, but from the single NIC setup these
> > > seem to be the relevant lines -
> > >
> > >
> > > 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4 with status Down
> > > 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> > > 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down, name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> > > 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator] (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host with specified id is not in the right state: Down
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5 returned status:Down for the investigated host:4
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible legacy state Down for host 4
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to determine host 4 is in Down
> > > 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state determined is Down
> > > 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down: 4-dcp-cscn2.local. Starting HA on the VMs
> > >
> > > At the moment I only need to assign public IPs directly to VMs rather than
> > > using NAT with the virtual router, but would be happy to go with advanced
> > > networking if it would make things easier :)
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:35
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Ah, so it is not an advanced setup, even when you use multiple NICs.
> > > Can you confirm that the message "Agent investigation was requested on
> > > host, but host does not support investigation because it has no NFS
> > > storage. Skipping investigation." does not appear when you use a single
> > > NIC? Can you check other log entries that might appear when the host is
> > > marked as "down"?
> > >
> > > On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > It is all basic networking at the moment for all the setups.
> > > >
> > > >
> > > > If you want me to I can set up a single NIC solution again and run any
> > > > commands you need me to do.
> > > >
> > > >
> > > > FYI when I set up a single NIC I use the guided installation option in the
> > > > UI, rather than the manual setup which I do for the multiple NIC scenario.
> > > >
> > > >
> > > > Happy to set it up if it helps.
> > > >
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 10:23
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > Ok, so that explains the log message. This is looking like a bug to me.
> > > > It seems that with zone-wide storage the host state (when disconnected) is
> > > > not being properly identified due to this NFS check, and as a consequence
> > > > it has a side effect on VM HA.
> > > >
> > > > We would need some inputs from guys that have advanced networking
> > > > deployments and Zone wide storage.
> > > >
> > > > I do not see how the all-in-one-NIC deployment scenario is working though.
> > > > This method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
> > > > simple: if there are no NFS storage pools found for the host's cluster,
> > > > KVM hosts will be detected as "disconnected" and not down, with that
> > > > warning message you noticed.
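
In effect, that check boils down to a lookup like the following, with the
host's cluster_id taken from cloud.host (a sketch based on the schema and the
listPoolsByCluster SQL quoted later in this thread):

    select id, name, pool_type, scope, cluster_id
    from cloud.storage_pool
    where cluster_id = 1            -- the host's cluster_id
      and removed is null
      and pool_type = 'NetworkFilesystem';

Zone-wide pools are stored with cluster_id = NULL, so in a zone-wide-only setup
this returns nothing and the investigator can only ever report the host as
Disconnected.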
> > > >
> > > > When you say "all in one NIC", is it an advanced network deployment where
> > > > you put all traffic in a single network, or is it basic networking that
> > > > you are doing?
> > > >
> > > > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > > wrote:
> > > >
> > > > > zone wide.
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > Sent: 07 June 2018 10:04
> > > > > To: users
> > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > > What type of storage are you using? Zone wide? Or cluster "wide"
> > > storage?
> > > > >
> > > > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <
> jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > >
> > > > > > Rafael
> > > > > >
> > > > > >
> > > > > > Here is the output as requested -
> > > > > >
> > > > > >
> > > > > >
> > > > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > > > +----+------+--------------------------------------+--------
> > > > > > -----------+------+----------------+--------+------------+--
> > > > > > ----------+----------------+--------------+-----------+-----
> > > > > > ------------+---------------------+---------+-------------+-
> > > > > > -------+-----------------------+-------+------------+-------
> > > > > > --+---------------+
> > > > > > | id | name | uuid                                 | pool_type
> > > >  |
> > > > > > port | data_center_id | pod_id | cluster_id | used_bytes |
> > > > > capacity_bytes |
> > > > > > host_address | user_info | path            | created
>  |
> > > > > removed
> > > > > > | update_time | status | storage_provider_name | scope |
> > hypervisor |
> > > > > > managed | capacity_iops |
> > > > > > +----+------+--------------------------------------+--------
> > > > > > -----------+------+----------------+--------+------------+--
> > > > > > ----------+----------------+--------------+-----------+-----
> > > > > > ------------+---------------------+---------+-------------+-
> > > > > > -------+-----------------------+-------+------------+-------
> > > > > > --+---------------+
> > > > > > |  1 | ds1  | a234224f-05fb-3f4c-9b0f-c51ebdf9a601 |
> > > > NetworkFilesystem |
> > > > > > 2049 |              1 |   NULL |       NULL | 6059720704 |
> > > > > 79133933568 |
> > > > > > 172.30.5.2   | NULL      | /export/primary | 2018-06-05 13:45:01
> |
> > > NULL
> > > > > > | NULL        | Up     | DefaultPrimary        | ZONE  | KVM
> >   |
> > > > > >  0 |          NULL |
> > > > > > +----+------+--------------------------------------+--------
> > > > > > -----------+------+----------------+--------+------------+--
> > > > > > ----------+----------------+--------------+-----------+-----
> > > > > > ------------+---------------------+---------+-------------+-
> > > > > > -------+-----------------------+-------+------------+-------
> > > > > > --+---------------+
> > > > > > 1 row in set (0.00 sec)
> > > > > >
> > > > > > mysql>
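
That single row is the crux: the pool is registered with scope ZONE and
cluster_id NULL, so any lookup that filters on a concrete cluster id, for
example

    select id, name from cloud.storage_pool
    where cluster_id = 1 and removed is null;

comes back empty, which would explain why the investigator decides the cluster
has no NFS storage.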
> > > > > >
> > > > > > Do you think this problem is related to my NIC/bridge configuration or
> > > > > > the way I am configuring the zone ?
> > > > > >
> > > > > > Jon
> > > > > > ________________________________
> > > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > > Sent: 07 June 2018 06:45
> > > > > > To: users
> > > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > > Can you also post the result of:
> > > > > > select * from cloud.storage_pool where removed is null
> > > > > >
> > > > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> > > > Dag.Sonstebo@shapeblue.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Jon,
> > > > > > >
> > > > > > > Still confused where your primary storage pools are - are you sure your
> > > > > > > hosts are in cluster 1?
> > > > > > >
> > > > > > > Quick question just to make sure - assuming management/storage is on the
> > > > > > > same NIC: when I set up basic networking, the physical network has the
> > > > > > > management and guest icons already there and I just edit the KVM labels.
> > > > > > > If I am running storage over management, do I need to drag the storage
> > > > > > > icon to the physical network and use the same KVM label (cloudbr0) as the
> > > > > > > management, or does CS automatically just use the management NIC, ie. I
> > > > > > > would only need to drag the storage icon across in basic setup if I
> > > > > > > wanted it on a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > > >
> > > > > > > >> I would do both - set up your 2/3 physical networks, the name isn't
> > > > > > > that important - but then drag the traffic types to the correct one and
> > > > > > > make sure the labels are correct.
> > > > > > > Regards,
> > > > > > > Dag Sonstebo
> > > > > > > Cloud Architect
> > > > > > > ShapeBlue
> > > > > > >
> > > > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk>
> > > wrote:
> > > > > > >
> > > > > > >     Dag
> > > > > > >
> > > > > > >
> > > > > > >     Do you mean check the pools with "Infrastructure -> Primary Storage"
> > > > > > >     and "Infrastructure -> Secondary Storage" within the UI ?
> > > > > > >
> > > > > > >     If so, Primary Storage has a state of Up; Secondary Storage does not
> > > > > > >     show a state as such, so not sure where else to check it ?
> > > > > > >
> > > > > > >
> > > > > > >     Rerun of the command -
> > > > > > >
> > > > > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > > > > >     Empty set (0.00 sec)
> > > > > > >
> > > > > > >     mysql>
> > > > > > >
> > > > > > >     I think it is something to do with my zone creation rather than the
> > > > > > >     NIC/bridge setup, although I can post those if needed.
> > > > > > >
> > > > > > >     I may try to set up just the 2 NIC solution you mentioned, although
> > > > > > >     as I said I had the same issue with that, ie. the host goes to "Alert"
> > > > > > >     state with the same error messages.  The only time I can get it to go
> > > > > > >     to "Down" state is when it is all on the single NIC.
> > > > > > >
> > > > > > >     Quick question just to make sure - assuming
> > management/storage
> > > is
> > > > > on
> > > > > > > the same NIC when I setup basic networking the physical network
> > has
> > > > the
> > > > > > > management and guest icons already there and I just edit the
> KVM
> > > > > labels.
> > > > > > If
> > > > > > > I am running storage over management do I need to drag the
> > storage
> > > > icon
> > > > > > to
> > > > > > > the physical network and use the same KVM label (cloudbr0) as
> the
> > > > > > > management or does CS automatically just use the management NIC
> > > ie. I
> > > > > > would
> > > > > > > only need to drag the storage icon across in basic setup if I
> > > wanted
> > > > it
> > > > > > on
> > > > > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > > >
> > > > > > >     On the plus side I have been at this for so long now and done so many
> > > > > > >     rebuilds I could do it in my sleep now 😊
> > > > > > >
> > > > > > >
> > > > > > >     ________________________________
> > > > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > > > >     Sent: 06 June 2018 12:28
> > > > > > >     To: users@cloudstack.apache.org
> > > > > > >     Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > > > >
> > > > > > >     Looks OK to me Jon.
> > > > > > >
> > > > > > >     The one thing that throws me is your storage pools - can you rerun
> > > > > > >     your query: select * from cloud.storage_pool where cluster_id = 1;
> > > > > > >
> > > > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > > > >
> > > > > > >     Regards,
> > > > > > >     Dag Sonstebo
> > > > > > >     Cloud Architect
> > > > > > >     ShapeBlue
> > > > > > >
> > > > > > >     On 06/06/2018, 12:08, "Jon Marshall" <
> jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > > > >
> > > > > > >         Don't know whether this helps or not, but I logged into the SSVM
> > > > > > >         and ran an ifconfig -
> > > > > > >
> > > > > > >
> > > > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> > > > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > > > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> > > > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > > > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> > > > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > > > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> > > > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > > > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >
> > > > > > >         so it has interfaces in both the management and the storage subnets
> > > > > > >         (as well as guest).
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         ________________________________
> > > > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > > > >         Sent: 06 June 2018 11:08
> > > > > > >         To: users@cloudstack.apache.org
> > > > > > >         Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > > > > >
> > > > > > >         Hi Rafael
> > > > > > >
> > > > > > >
> > > > > > >         Thanks for the help, really appreciate it.
> > > > > > >
> > > > > > >
> > > > > > >         So rerunning that command with all servers up -
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and
> > > > > > >         removed is null;
> > > > > > >         Empty set (0.00 sec)
> > > > > > >
> > > > > > >         mysql>
> > > > > > >
> > > > > > >
> > > > > > >         As for the storage IP, no, I'm not setting it to be the management
> > > > > > >         IP when I set up the zone, but the output of the SQL command
> > > > > > >         suggests that is what has happened.
> > > > > > >
> > > > > > >         As I said to Dag, I am using a different subnet for storage, ie.
> > > > > > >
> > > > > > >         172.30.3.0/26 - management subnet
> > > > > > >         172.30.4.0/25 - guest VM subnet
> > > > > > >         172.30.5.0/28 - storage
> > > > > > >
> > > > > > >         the NFS server IP is 172.30.5.2
> > > > > > >
> > > > > > >         each compute node has 3 NICs with an IP from each subnet (I am
> > > > > > >         assuming the management node only needs an IP in the management
> > > > > > >         network ?)
> > > > > > >
> > > > > > >         When I add the zone in the UI I have one physical network with
> > > > > > >         management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > > > > >         When I fill in the storage traffic page I use the range
> > > > > > >         172.30.5.10 - 14 as free IPs, as I exclude the ones already
> > > > > > >         allocated to the compute nodes and the NFS server.
> > > > > > >
> > > > > > >         I think maybe I am doing something wrong in the UI setup but it is
> > > > > > >         not obvious to me what it is.
> > > > > > >
> > > > > > >         What I might try today, unless you want me to keep the setup I have
> > > > > > >         for more outputs, is to go back to 2 NICs, one for storage/management
> > > > > > >         and one for guest VMs.
> > > > > > >
> > > > > > >         I think with the 2 NIC setup the mistake I made last time when
> > > > > > >         adding the zone was to assume storage would just run over management,
> > > > > > >         so I did not drag and drop the storage icon and assign it to cloudbr0
> > > > > > >         as with the management, which I think is what I should do ?
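
One way to see which traffic labels the zone actually recorded is to query the
traffic type table directly; a sketch, assuming the stock schema (the table
and column names here are an assumption, they are not shown elsewhere in this
thread):

    select traffic_type, kvm_network_label
    from cloud.physical_network_traffic_types;   -- table/column names assumed

If the Storage row has no KVM label while a dedicated bridge (cloudbr2) is
expected, the storage traffic was never pinned to that bridge.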
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         ________________________________
> > > > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > > > >         Sent: 06 June 2018 10:54
> > > > > > >         To: users
> > > > > > >         Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > > > > >
> > > > > > >         Jon, do not panic, we are here to help you :)
> > > > > > >         So, I might have mistyped the SQL query. If you use "select * from
> > > > > > >         cloud.storage_pool where cluster_id = 1 and removed is not null",
> > > > > > >         you are listing the removed storage pools. Therefore, the right
> > > > > > >         query would be "select * from cloud.storage_pool where
> > > > > > >         cluster_id = 1 and removed is null".
> > > > > > >
> > > > > > >         There is also something else I do not understand. You are setting
> > > > > > >         the storage IP in the management subnet? I am not sure you should be
> > > > > > >         doing it like this. Normally, I set all my storage (primary, when
> > > > > > >         working with NFS, and secondary) to IPs in the storage subnet.
> > > > > > >
> > > > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > > > > Dag.Sonstebo@shapeblue.com>
> > > > > > >         wrote:
> > > > > > >
> > > > > > >         > Hi Jon,
> > > > > > >         >
> > > > > > >         > I'm late to this thread and have possibly missed some things - but
> > > > > > >         > a couple of observations:
> > > > > > >         >
> > > > > > >         > "When I add the zone and get to the storage web page
> I
> > > > > exclude
> > > > > > > the IPs
> > > > > > >         > already used for the compute node NICs and the NFS
> > server
> > > > > > > itself. ....."
> > > > > > >         > "So the range is 172.30.5.1 -> 15 and the range I
> fill
> > in
> > > > is
> > > > > > > 172.30.5.10
> > > > > > >         > -> 172.30.5.14."
> > > > > > >         >
> > > > > > >         > I think you may have some confusion around the use of the storage
> > > > > > >         > network. The important part here is to understand this is for
> > > > > > >         > *secondary storage* use only - it has nothing to do with primary
> > > > > > >         > storage. This means this storage network needs to be accessible to
> > > > > > >         > the SSVM and to the hypervisors, and secondary storage NFS pools
> > > > > > >         > need to be accessible on this network.
> > > > > > >         >
> > > > > > >         > The important part - this also means you *cannot use the same IP
> > > > > > >         > ranges for management and storage networks* - doing so means you
> > > > > > >         > will have issues where effectively both hypervisors and SSVM can
> > > > > > >         > see the same subnet on two NICs - and you end up in a routing black
> > > > > > >         > hole.
> > > > > > >         >
> > > > > > >         > So - you need to either:
> > > > > > >         >
> > > > > > >         > 1) Use different IP subnets on management and storage, or
> > > > > > >         > 2) preferably just simplify your setup - stop using a secondary
> > > > > > >         > storage network altogether and just allow secondary storage to use
> > > > > > >         > the management network (which is the default). Unless you have a
> > > > > > >         > very high I/O environment in production you are just adding
> > > > > > >         > complexity by running separate management and storage.
> > > > > > >         >
> > > > > > >         > Regards,
> > > > > > >         > Dag Sonstebo
> > > > > > >         > Cloud Architect
> > > > > > >         > ShapeBlue
> > > > > > >         >
> > > > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> > > > jms.123@hotmail.co.uk>
> > > > > > > wrote:
> > > > > > >         >
> > > > > > >         >     I will disconnect the host this morning and test, but before I
> > > > > > >         >     do that I ran this command when all hosts are up -
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >      select * from cloud.host;
> > > > > > >         >     +----+-----------------+------
> > > > > > ------------------------------
> > > > > > >         > --+--------+------------------
> > > > --+--------------------+------
> > > > > > >         > -----------+------------------
> > > > ---+--------------------+-----
> > > > > > >         > ------------+-----------------
> > > > ----+----------------------+--
> > > > > > >         > ---------------------+--------
> > > > -----------+------------+-----
> > > > > > >         > --------------+---------------
> > > > --+--------------------+------
> > > > > > >         > ------+----------------+------
> > > > --+-------------+------+------
> > > > > > >         > -+----------------------------
> > > > ---------+---------+----------
> > > > > > >         > -------+--------------------+-
> > > > -----------+----------+-------
> > > > > > >         > ---+--------+------------+----
> > > > ----------+-------------------
> > > > > > >         > ------------------------------
> > > > --------------+-----------+---
> > > > > > >         > ----+-------------+-----------
> > > > -+----------------+-----------
> > > > > > >         > ----------+-------------------
> > > > --+---------+--------------+--
> > > > > > >         > --------------+-------+-------------+--------------+
> > > > > > >         >     | id | name            | uuid
> > > > > > >  | status
> > > > > > >         > | type               | private_ip_address |
> > > > private_netmask |
> > > > > > >         > private_mac_address | storage_ip_address |
> > > storage_netmask
> > > > |
> > > > > > >         > storage_mac_address | storage_ip_address_2 |
> > > > > > > storage_mac_address_2 |
> > > > > > >         > storage_netmask_2 | cluster_id | public_ip_address |
> > > > > > > public_netmask  |
> > > > > > >         > public_mac_address | proxy_port | data_center_id |
> > > pod_id |
> > > > > > > cpu_sockets |
> > > > > > >         > cpus | speed | url                                 |
> > > > fs_type
> > > > > |
> > > > > > >         > hypervisor_type | hypervisor_version | ram        |
> > > > resource
> > > > > |
> > > > > > > version  |
> > > > > > >         > parent | total_size | capabilities | guid
> > > > > > >         >                         | available | setup |
> > > dom0_memory |
> > > > > > > last_ping  |
> > > > > > >         > mgmt_server_id | disconnected        | created
> > > >  |
> > > > > > > removed |
> > > > > > >         > update_count | resource_state | owner | lastUpdated |
> > > > > > > engine_state |
> > > > > > >         >     +----+-----------------+------
> > > > > > ------------------------------
> > > > > > >         > --+--------+------------------
> > > > --+--------------------+------
> > > > > > >         > -----------+------------------
> > > > ---+--------------------+-----
> > > > > > >         > ------------+-----------------
> > > > ----+----------------------+--
> > > > > > >         > ---------------------+--------
> > > > -----------+------------+-----
> > > > > > >         > --------------+---------------
> > > > --+--------------------+------
> > > > > > >         > ------+----------------+------
> > > > --+-------------+------+------
> > > > > > >         > -+----------------------------
> > > > ---------+---------+----------
> > > > > > >         > -------+--------------------+-
> > > > -----------+----------+-------
> > > > > > >         > ---+--------+------------+----
> > > > ----------+-------------------
> > > > > > >         > ------------------------------
> > > > --------------+-----------+---
> > > > > > >         > ----+-------------+-----------
> > > > -+----------------+-----------
> > > > > > >         > ----------+-------------------
> > > > --+---------+--------------+--
> > > > > > >         > --------------+-------+-------------+--------------+
> > > > > > >         >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-
> > > > > > eabd60012284
> > > > > > > | Up
> > > > > > >         >  | Routing            | 172.30.3.3         |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 00:22:19:92:4e:34   | 172.30.3.3         |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 00:22:19:92:4e:34   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |          1 | 172.30.4.3        |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 00:22:19:92:4e:35  |       NULL |              1 |
> > > 1 |
> > > > > > >      1 |
> > > > > > >         >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 |
> > NULL
> > > > > |
> > > > > > > KVM
> > > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > > 4.11.0.0
> > > > > |
> > > > > > > NULL   |
> > > > > > >         >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-
> > > > > f83e7a5e8efb-
> > > > > > > LibvirtComputingResource
> > > > > > >         > |         1 |     0 |           0 | 1492390408 |
> > > > > >  146457912294 |
> > > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |
> > > > > > > 4 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-
> > > > > > 3f5e2c2215b0
> > > > > > > | Up
> > > > > > >         >  | ConsoleProxy       | 172.30.3.49        |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 1e:00:80:00:00:14   | 172.30.3.49        |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 1e:00:80:00:00:14   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |       NULL | 172.30.4.98       |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 1e:00:c9:00:00:5f  |       NULL |              1 |
> > > 1 |
> > > > > > >   NULL |
> > > > > > >         > NULL |  NULL | NoIqn                               |
> > NULL
> > > > > |
> > > > > > > NULL
> > > > > > >         >     | NULL               |          0 | NULL     |
> > > > 4.11.0.0 |
> > > > > > > NULL   |
> > > > > > >         >  NULL | NULL         | Proxy.2-ConsoleProxyResource
> > > > > > >         >           |         1 |     0 |           0 |
> > 1492390409
> > > |
> > > > > > >  146457912294 |
> > > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |
> > > > > > > 7 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-
> > > > > > ff3845bb556c
> > > > > > > | Up
> > > > > > >         >  | SecondaryStorageVM | 172.30.3.34        |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 1e:00:3b:00:00:05   | 172.30.3.34        |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 1e:00:3b:00:00:05   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |       NULL | 172.30.4.86       |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 1e:00:d9:00:00:53  |       NULL |              1 |
> > > 1 |
> > > > > > >   NULL |
> > > > > > >         > NULL |  NULL | NoIqn                               |
> > NULL
> > > > > |
> > > > > > > NULL
> > > > > > >         >     | NULL               |          0 | NULL     |
> > > > 4.11.0.0 |
> > > > > > > NULL   |
> > > > > > >         >  NULL | NULL         | s-1-VM-
> > NfsSecondaryStorageResource
> > > > > > >         >             |         1 |     0 |           0 |
> > > 1492390407
> > > > |
> > > > > > >  146457912294
> > > > > > >         > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL
>   |
> > > > > > >   7 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-
> > > > > > 1a96ffac9794
> > > > > > > | Up
> > > > > > >         >  | Routing            | 172.30.3.4         |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 00:26:b9:4a:97:7d   | 172.30.3.4         |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 00:26:b9:4a:97:7d   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |          1 | 172.30.4.4        |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 00:26:b9:4a:97:7e  |       NULL |              1 |
> > > 1 |
> > > > > > >      1 |
> > > > > > >         >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 |
> > NULL
> > > > > |
> > > > > > > KVM
> > > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > > 4.11.0.0
> > > > > |
> > > > > > > NULL   |
> > > > > > >         >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-
> > > > > 16d0f99b11c9-
> > > > > > > LibvirtComputingResource
> > > > > > >         > |         1 |     0 |           0 | 1492450882 |
> > > > > >  146457912294 |
> > > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |
> > > > > > > 8 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-
> > > > > > ee29d2b5c274
> > > > > > > | Up
> > > > > > >         >  | Routing            | 172.30.3.5         |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 00:24:e8:73:6a:b2   | 172.30.3.5         |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 00:24:e8:73:6a:b2   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |          1 | 172.30.4.5        |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 00:24:e8:73:6a:b3  |       NULL |              1 |
> > > 1 |
> > > > > > >      1 |
> > > > > > >         >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c |
> > NULL
> > > > > |
> > > > > > > KVM
> > > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > > 4.11.0.0
> > > > > |
> > > > > > > NULL   |
> > > > > > >         >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-
> > > > > 37f3eebad8fb-
> > > > > > > LibvirtComputingResource
> > > > > > >         > |         1 |     0 |           0 | 1492390408 |
> > > > > >  146457912294 |
> > > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |
> > > > > > > 6 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     +----+-----------------+------
> > > > > > ------------------------------
> > > > > > >         > --+--------+------------------
> > > > --+--------------------+------
> > > > > > >         > -----------+------------------
> > > > ---+--------------------+-----
> > > > > > >         > ------------+-----------------
> > > > ----+----------------------+--
> > > > > > >         > ---------------------+--------
> > > > -----------+------------+-----
> > > > > > >         > --------------+---------------
> > > > --+--------------------+------
> > > > > > >         > ------+----------------+------
> > > > --+-------------+------+------
> > > > > > >         > -+----------------------------
> > > > ---------+---------+----------
> > > > > > >         > -------+--------------------+-
> > > > -----------+----------+-------
> > > > > > >         > ---+--------+------------+----
> > > > ----------+-------------------
> > > > > > >         > ------------------------------
> > > > --------------+-----------+---
> > > > > > >         > ----+-------------+-----------
> > > > -+----------------+-----------
> > > > > > >         > ----------+-------------------
> > > > --+---------+--------------+--
> > > > > > >         > --------------+-------+-------------+--------------+
> > > > > > >         >     5 rows in set (0.00 sec)
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     and you can see that it says the storage IP address is the same
> > > > > > >         >     as the private IP address (the management network).
> > > > > > >         >
> > > > > > >         >     I also ran the command you provided using the Cluster ID number
> > > > > > >         >     from the table above -
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     mysql> select * from cloud.storage_pool where cluster_id = 1
> > > > > > >         >     and removed is not null;
> > > > > > >         >     Empty set (0.00 sec)
> > > > > > >         >
> > > > > > >         >     mysql>
> > > > > > >         >
> > > > > > >         >     So assuming I am reading this correctly, that seems to be the
> > > > > > >         >     issue.
> > > > > > >         >
> > > > > > >         >     I am at a loss as to why though.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I have a separate NIC for storage as described. When I add the
> > > > > > >         >     zone and get to the storage web page I exclude the IPs already
> > > > > > >         >     used for the compute node NICs and the NFS server itself. I do
> > > > > > >         >     this because initially I didn't, and the SSVM started using the
> > > > > > >         >     IP address of the NFS server.
> > > > > > >         >
> > > > > > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > > > >         >     172.30.5.10 -> 172.30.5.14.
> > > > > > >         >
> > > > > > >         >     And I used the label "cloudbr2" for storage.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I must be doing this wrong somehow.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     Any pointers would be much appreciated.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     ________________________________
> > > > > > >         >     From: Rafael Weingärtner <
> > > rafaelweingartner@gmail.com>
> > > > > > >         >     Sent: 05 June 2018 16:13
> > > > > > >         >     To: users
> > > > > > >         >     Subject: Re: advanced networking with public IPs
> > > direct
> > > > > to
> > > > > > > VMs
> > > > > > >         >
> > > > > > >         >     That is interesting. Let's see the source of all
> > > > truth...
> > > > > > >         >     This is the code that is generating that odd
> > message.
> > > > > > >         >
> > > > > > >         >     >     List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > > > > >         >     >     boolean hasNfs = false;
> > > > > > >         >     >     for (StoragePoolVO pool : clusterPools) {
> > > > > > >         >     >         if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > > > > >         >     >             hasNfs = true;
> > > > > > >         >     >             break;
> > > > > > >         >     >         }
> > > > > > >         >     >     }
> > > > > > >         >     >     if (!hasNfs) {
> > > > > > >         >     >         s_logger.warn("Agent investigation was requested on host " + agent + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > > > > >         >     >         return Status.Disconnected;
> > > > > > >         >     >     }
> > > > > > >         >
> > > > > > >         >     There are two possibilities here. You do not have any NFS
> > > > > > >         >     storage? Is that the case? Or maybe, for some reason, the call
> > > > > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> > > > > > >         >     returning any NFS storage pools. Looking at "listPoolsByCluster"
> > > > > > >         >     we will see that the following SQL is used:
> > > > > > >         >
> > > > > > >         >     Select * from storage_pool where cluster_id = <host'sClusterId>
> > > > > > >         >     and removed is not null
> > > > > > >         >
> > > > > > >         >     Can you run that SQL to see its return when your hosts are marked
> > > > > > >         >     as disconnected?
> > > > > > >         >
> > > > > > >         >
> > > > > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > > > > > > jms.123@hotmail.co.uk>
> > > > > > >         > wrote:
> > > > > > >         >
> > > > > > >         >     > I reran the tests with the 3 NIC setup. When I configured the
> > > > > > >         >     > zone through the UI I used the labels cloudbr0 for management,
> > > > > > >         >     > cloudbr1 for guest traffic and cloudbr2 for NFS, as per my
> > > > > > >         >     > original response to you.
> > > > > > >         >     >
> > > > > > >         >     > When I pull the power to the node (dcp-cscn2.local), after about
> > > > > > >         >     > 5 mins the host status goes to "Alert" but never to "Down"
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > I get this in the logs -
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > 2018-06-05 15:17:14,382 WARN [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > > > > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > > > > > >         >     > 2018-06-05 15:17:14,382 INFO [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > > > > > >         >     > 2018-06-05 15:17:14,382 WARN [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> > > > > > >         >     >
> > > > > > >         >     > I don't understand why it thinks there is no NFS storage, as
> > > > > > >         >     > each compute node has a dedicated storage NIC.
> > > > > > >         >     >
> > > > > > >         >     > I also don't understand why it thinks the host is still up, ie.
> > > > > > >         >     > what test is it doing to determine that ?
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > Am I just trying to get something working that is not
> > > > > > >         >     > supported ?
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > ________________________________
> > > > > > >         >     > From: Rafael Weingärtner <
> > > > rafaelweingartner@gmail.com>
> > > > > > >         >     > Sent: 04 June 2018 15:31
> > > > > > >         >     > To: users
> > > > > > >         >     > Subject: Re: advanced networking with public
> IPs
> > > > direct
> > > > > > to
> > > > > > > VMs
> > > > > > >         >     >
> > > > > > >         >     > What type of failover are you talking about?
> > > > > > >         >     > What ACS version are you using?
> > > > > > >         >     > What hypervisor are you using?
> > > > > > >         >     > How are you configuring your NICs in the hypervisor?
> > > > > > >         >     > How are you configuring the traffic labels in ACS?
> > > > > > >         >     >
> > > > > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > > > > > > jms.123@hotmail.co.uk
> > > > > > >         > >
> > > > > > >         >     > wrote:
> > > > > > >         >     >
> > > > > > >         >     > > Hi all
> > > > > > >         >     > >
> > > > > > >         >     > >
> > > > > > >         >     > > I am close to giving up on basic networking
> as
> > I
> > > > just
> > > > > > > cannot get
> > > > > > >         > failover
> > > > > > >         >     > > working with multiple NICs (I am not even
> sure
> > it
> > > > is
> > > > > > > supported).
> > > > > > >         >     > >
> > > > > > >         >     > >
> > > > > > >         >     > > What I would like is to use 3 NICs for
> > > management,
> > > > > > > storage and
> > > > > > >         > guest
> > > > > > >         >     > > traffic. I would like to assign public IPs
> > direct
> > > > to
> > > > > > the
> > > > > > > VMs which
> > > > > > >         > is
> > > > > > >         >     > why I
> > > > > > >         >     > > originally chose basic.
> > > > > > >         >     > >
> > > > > > >         >     > >
> > > > > > >         >     > > If I switch to advanced networking do I just
> > > > > configure
> > > > > > a
> > > > > > > guest VM
> > > > > > >         > with
> > > > > > >         >     > > public IPs on one NIC and not both with the
> > > public
> > > > > > > traffic -
> > > > > > >         >     > >
> > > > > > >         >     > >
> > > > > > >         >     > > would this work ?
> > > > > > >         >     > >
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > --
> > > > > > >         >     > Rafael Weingärtner
> > > > > > >         >     >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     --
> > > > > > >         >     Rafael Weingärtner
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >
> > > > > > >
> > > > > > >         --
> > > > > > >         Rafael Weingärtner
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rafael Weingärtner
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rafael Weingärtner
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
>
>
>
> --
> Rafael Weingärtner
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Rafael


Just to let you know I reran the 2 NIC setup and it worked fine this time so it must have been something I did in the setup.


Many thanks for all the help


Jon


________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 15 June 2018 11:40
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Did you notice any problems in the log files when you tested with 2 NICs?
When using cluster-wide NFS storage, the behavior should be the same as with
3 NICs, so there might be something in your configuration. The problem with
zone-wide storage is what we discussed before, though.

1) if I want to run the management/storage traffic over the same NIC the
NFS server needs to be in the management subnet
No. You should be able to set up different network ranges for each one of
them.

2) when I do the initial configuration I need to drag and drop the storage
icon and use the same label as the management traffic
If you are using only two NICs, you definitely need to configure the traffic
labels accordingly. I mean, if you have only two NICs, then you need to
configure the labels (cloudbr0 and cloudbr2) in that physical network tab in
the zone configuration.


On Thu, Jun 14, 2018 at 5:03 PM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Hi Rafael
>
>
> I did log a bug but when rebuilding I found some slightly different
> behaviour so have temporarily removed it.
>
>
> So using cluster NFS and 3 NICs as already described VM HA works.
>
>
> Because the recommendation for a basic network setup seems to be to run
> storage/management over the same NIC and guest on another, so 2 NICs in
> total, I set it up this way using cluster NFS, and to my surprise VM HA did
> not work, so it is obviously a bit more complicated than it first appeared.
>
>
> My NFS server is on a different subnet than the management server, and when
> I set it up in the UI, because the storage traffic runs over the management
> NIC by default, I did not assign a label to the storage traffic, ie. I only
> assigned labels to management and guest.
>
>
> So two thoughts occur which I can test unless you can see the issue -
>
>
> 1) if I want to run the management/storage traffic over the same NIC the
> NFS server needs to be in the management subnet
>
>
> or
>
>
> 2) when I do the initial configuration I need to drag and drop the storage
> icon and use the same label as the management traffic
>
>
> Personally I can't see how 2) will help ie. the only time I should need to
> assign a label to storage is if I use a different NIC.
>
>
> Apologies for bringing this up again but am happy to run any tests and
> would like to file accurate bug report.
>
>
>
>
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 11 June 2018 10:58
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Well, it seems that you have found a bug. Can you fill out an issue report
> on Github?
>
> Thanks for the hard work on debugging and testing.
>
> On Fri, Jun 8, 2018 at 2:17 PM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > So based on Erik's suggestion (thanks Erik) I rebuilt the management
> > server and set up cluster wide primary storage, as opposed to zone wide,
> > which I have been using so far.
> >
> >
> > Still using 3 NICs (management/Guest/storage) and basic networking.
> >
> >
> > And VM HA now works. In addition it failed over quicker than it did when
> > I had zone wide NFS storage on a single NIC.
> >
> >
> > Still a bit confused about this output, where it is still showing the
> > storage_ip_address values as 172.30.3.x IPs (which is the management
> > subnet), but maybe I am reading it incorrectly.
> >
> >
> >
> > mysql> select * from cloud.host;
> > +----+-----------------+------------------------------------
> > --+--------+--------------------+--------------------+------
> > -----------+---------------------+--------------------+-----
> > ------------+---------------------+----------------------+--
> > ---------------------+-------------------+------------+-----
> > --------------+-----------------+--------------------+------
> > ------+----------------+--------+-------------+------+------
> > -+-------------------------------------+---------+----------
> > -------+--------------------+------------+----------+-------
> > ---+--------+------------+--------------+-------------------
> > --------------------------------------------+-----------+---
> > ----+-------------+------------+----------------+-----------
> > ----------+---------------------+---------+--------------+--
> > --------------+-------+-------------+--------------+
> > | id | name            | uuid                                 | status |
> > type               | private_ip_address | private_netmask |
> > private_mac_address | storage_ip_address | storage_netmask |
> > storage_mac_address | storage_ip_address_2 | storage_mac_address_2 |
> > storage_netmask_2 | cluster_id | public_ip_address | public_netmask  |
> > public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets |
> > cpus | speed | url                                 | fs_type |
> > hypervisor_type | hypervisor_version | ram        | resource | version  |
> > parent | total_size | capabilities | guid
> >                         | available | setup | dom0_memory | last_ping  |
> > mgmt_server_id | disconnected        | created             | removed |
> > update_count | resource_state | owner | lastUpdated | engine_state |
> > +----+-----------------+------------------------------------
> > --+--------+--------------------+--------------------+------
> > -----------+---------------------+--------------------+-----
> > ------------+---------------------+----------------------+--
> > ---------------------+-------------------+------------+-----
> > --------------+-----------------+--------------------+------
> > ------+----------------+--------+-------------+------+------
> > -+-------------------------------------+---------+----------
> > -------+--------------------+------------+----------+-------
> > ---+--------+------------+--------------+-------------------
> > --------------------------------------------+-----------+---
> > ----+-------------+------------+----------------+-----------
> > ----------+---------------------+---------+--------------+--
> > --------------+-------+-------------+--------------+
> > |  1 | dcp-cscn1.local | 372c738c-5370-4b46-9358-14b649c73d6b | Up     |
> > Routing            | 172.30.3.3         | 255.255.255.192 |
> > 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
> > 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL
> >             |          1 | 172.30.4.3        | 255.255.255.128 |
> > 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |
> >   2 |  3000 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM
> >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
> >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-
> LibvirtComputingResource
> > |         1 |     0 |           0 | 1492635804 |   146457912294 |
> > 2018-06-08 11:57:31 | 2018-06-08 11:19:36 | NULL    |            4 |
> > Enabled        | NULL  | NULL        | Disabled     |
> > |  2 | s-1-VM          | 8f40f8a9-1e97-4fce-a143-a8c9489a6d5b | Up     |
> > SecondaryStorageVM | 172.30.3.55        | 255.255.255.192 |
> > 1e:00:ab:00:00:24   | 172.30.3.55        | 255.255.255.192 |
> > 1e:00:ab:00:00:24   | NULL                 | NULL                  | NULL
> >             |       NULL | 172.30.4.66       | 255.255.255.128 |
> > 1e:00:c4:00:00:67  |       NULL |              1 |      1 |        NULL |
> > NULL |  NULL | NoIqn                               | NULL    | NULL
> >     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
> >  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
> >             |         1 |     0 |           0 | 1492635804 |
>  146457912294
> > | 2018-06-08 11:57:31 | 2018-06-08 11:21:51 | NULL    |            7 |
> > Enabled        | NULL  | NULL        | Disabled     |
> > |  3 | v-2-VM          | fe5b4f8b-b5b0-4018-8ee2-4c5384da073b | Up     |
> > ConsoleProxy       | 172.30.3.40        | 255.255.255.192 |
> > 1e:00:96:00:00:15   | 172.30.3.40        | 255.255.255.192 |
> > 1e:00:96:00:00:15   | NULL                 | NULL                  | NULL
> >             |       NULL | 172.30.4.62       | 255.255.255.128 |
> > 1e:00:01:00:00:63  |       NULL |              1 |      1 |        NULL |
> > NULL |  NULL | NoIqn                               | NULL    | NULL
> >     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
> >  NULL | NULL         | Proxy.2-ConsoleProxyResource
> >           |         1 |     0 |           0 | 1492635804 |
>  146457912294 |
> > 2018-06-08 11:57:31 | 2018-06-08 11:22:03 | NULL    |            7 |
> > Enabled        | NULL  | NULL        | Disabled     |
> > |  4 | dcp-cscn2.local | 935260eb-a80c-4ead-85d7-3df8212e301b | Down   |
> > Routing            | 172.30.3.4         | 255.255.255.192 |
> > 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
> > 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL
> >             |          1 | 172.30.4.4        | 255.255.255.128 |
> > 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |
> >   2 |  3000 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM
> >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
> >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-
> LibvirtComputingResource
> > |         1 |     0 |           0 | 1492635804 |           NULL |
> > 2018-06-08 11:57:31 | 2018-06-08 11:35:07 | NULL    |            7 |
> > Enabled        | NULL  | NULL        | Disabled     |
> > |  5 | dcp-cscn3.local | f3cabc9e-9679-4d7e-8297-b6765eea2770 | Up     |
> > Routing            | 172.30.3.5         | 255.255.255.192 |
> > 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
> > 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL
> >             |          1 | 172.30.4.5        | 255.255.255.128 |
> > 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |
> >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM
> >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
> >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-
> LibvirtComputingResource
> > |         1 |     0 |           0 | 1492635804 |   146457912294 |
> > 2018-06-08 11:57:31 | 2018-06-08 11:36:27 | NULL    |            4 |
> > Enabled        | NULL  | NULL        | Disabled     |
> > +----+-----------------+------------------------------------
> > --+--------+--------------------+--------------------+------
> > -----------+---------------------+--------------------+-----
> > ------------+---------------------+----------------------+--
> > ---------------------+-------------------+------------+-----
> > --------------+-----------------+--------------------+------
> > ------+----------------+--------+-------------+------+------
> > -+-------------------------------------+---------+----------
> > -------+--------------------+------------+----------+-------
> > ---+--------+------------+--------------+-------------------
> > --------------------------------------------+-----------+---
> > ----+-------------+------------+----------------+-----------
> > ----------+---------------------+---------+--------------+--
> > --------------+-------+-------------+--------------+
> > 5 rows in set (0.00 sec)
> >
> > mysql>
> >
> > So some sort of bug maybe ?
> >
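> > As an aside, a narrower query keeps the interesting columns readable (the
> > column names below are taken straight from the dump above):
> >
> > mysql> select id, name, type, status, private_ip_address, storage_ip_address
> >     -> from cloud.host;
> >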
> > ________________________________
> > From: Erik Weber <te...@gmail.com>
> > Sent: 08 June 2018 10:15
> > To: users@cloudstack.apache.org
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > While someone ponders the zone wide storage question, you could try adding
> > a cluster wide NFS storage and see if the rest works in that setup.
> >
> > Erik
> >
> > On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > Yes, all basic. I read a ShapeBlue doc that recommended splitting traffic
> > > across multiple NICs even in basic networking mode, so that is what I am
> > > trying to do.
> > >
> > >
> > > With single NIC you do not get the NFS storage message.
> > >
> > >
> > > I have the entire management server logs for both scenarios after I
> > > pulled the power to one of the compute nodes, but from the single NIC
> > > setup these seem to be the relevant lines -
> > >
> > >
> > > 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl]
> > > (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on
> 4
> > > with status Down
> > > 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status]
> > > (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource
> > state
> > > = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> > > 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts]
> > > (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 |
> > > dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is
> down,
> > > name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> > > 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator]
> > > (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the
> > > agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4:
> > Host
> > > with specified id is not in the right state: Down
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> > > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5
> > > returned status:Down for the investigated host:4
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> > > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible
> > > legacy state Down for host 4
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> > > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was
> able
> > to
> > > determine host 4 is in Down
> > > 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl]
> > > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4
> > state
> > > determined is Down
> > > 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl]
> > > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down:
> > > 4-dcp-cscn2.local. Starting HA on the VMs
> > >
> > > At the moment I only need to assign public IPs direct to VMs rather than
> > > using NAT with the virtual router, but would be happy to go with advanced
> > > networking if it would make things easier :)
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:35
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Ah, so it is not an advanced setup, even when you use multiple NICs.
> > > Can you confirm that the message "Agent investigation was requested on
> > > host, but host does not support investigation because it has no NFS
> > > storage. Skipping investigation." does not appear when you use a single
> > > NIC? Can you check other log entries that might appear when the host is
> > > marked as "down"?
> > >
> > > On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > It is all basic networking at the moment for all the setups.
> > > >
> > > >
> > > > If you want me to I can set up a single NIC solution again and run any
> > > > commands you need me to do.
> > > >
> > > >
> > > > FYI, when I set up the single NIC I use the guided installation option
> > > > in the UI rather than the manual setup which I use for the multiple NIC
> > > > scenario.
> > > >
> > > >
> > > > Happy to set it up if it helps.
> > > >
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 10:23
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > Ok, so that explains the log message. This is looking like a bug to me.
> > > > It seems that with zone-wide storage the host state (when disconnected)
> > > > is not being properly identified due to this NFS check, and as a
> > > > consequence it has a side effect on VM HA.
> > > >
> > > > We would need some input from people who have advanced networking
> > > > deployments and zone-wide storage.
> > > >
> > > > I do not see how the all-in-one-NIC deployment scenario is working,
> > > > though. The method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is
> > > > dead simple: if there is no NFS in the cluster (no NFS storage pools
> > > > found for the host's cluster), KVM hosts will be detected as
> > > > "disconnected", not "down", with that warning message you noticed.
> > > >
> > > > When you say "all in one NIC", is it an advanced network deployment
> > > > where you put all traffic in a single network, or is it basic networking
> > > > that you are doing?
> > > >
> > > > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > > wrote:
> > > >
> > > > > zone wide.
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > Sent: 07 June 2018 10:04
> > > > > To: users
> > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > > What type of storage are you using? Zone wide? Or cluster "wide"
> > > storage?
> > > > >
> > > > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <
> jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > >
> > > > > > Rafael
> > > > > >
> > > > > >
> > > > > > Here is the output as requested -
> > > > > >
> > > > > >
> > > > > >
> > > > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > > > | id | name | uuid                                 | pool_type         | port | data_center_id | pod_id | cluster_id | used_bytes | capacity_bytes | host_address | user_info | path            | created             | removed | update_time | status | storage_provider_name | scope | hypervisor | managed | capacity_iops |
> > > > > > |  1 | ds1  | a234224f-05fb-3f4c-9b0f-c51ebdf9a601 | NetworkFilesystem | 2049 |              1 |   NULL |       NULL | 6059720704 |    79133933568 | 172.30.5.2   | NULL      | /export/primary | 2018-06-05 13:45:01 | NULL    | NULL        | Up     | DefaultPrimary        | ZONE  | KVM        |       0 |          NULL |
> > > > > > 1 row in set (0.00 sec)
> > > > > >
> > > > > > mysql>
> > > > > >
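> > > > > > Note the pool shows scope = ZONE with cluster_id = NULL, which would
> > > > > > line up with the per-cluster query coming back empty.
> > > > > >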
> > > > > > Do you think this problem is related to my NIC/bridge configuration
> > > > > > or the way I am configuring the zone ?
> > > > > >
> > > > > > Jon
> > > > > > ________________________________
> > > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > > Sent: 07 June 2018 06:45
> > > > > > To: users
> > > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > > Can you also post the result of:
> > > > > > select * from cloud.storage_pool where removed is null
> > > > > >
> > > > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> > > > Dag.Sonstebo@shapeblue.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Jon,
> > > > > > >
> > > > > > > Still confused where your primary storage pools are - are you sure
> > > > > > > your hosts are in cluster 1?
> > > > > > >
> > > > > > > Quick question just to make sure - assuming management/storage is
> > > > > > > on the same NIC: when I setup basic networking the physical network
> > > > > > > has the management and guest icons already there and I just edit the
> > > > > > > KVM labels. If I am running storage over management do I need to
> > > > > > > drag the storage icon to the physical network and use the same KVM
> > > > > > > label (cloudbr0) as the management, or does CS automatically just use
> > > > > > > the management NIC, ie. I would only need to drag the storage icon
> > > > > > > across in basic setup if I wanted it on a different NIC/IP subnet ?
> > > > > > > (hope that makes sense !)
> > > > > > >
> > > > > > > >> I would do both - set up your 2/3 physical networks, the name
> > > > > > > isn't that important - but then drag the traffic types to the correct
> > > > > > > one and make sure the labels are correct.
> > > > > > > Regards,
> > > > > > > Dag Sonstebo
> > > > > > > Cloud Architect
> > > > > > > ShapeBlue
> > > > > > >
> > > > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk>
> > > wrote:
> > > > > > >
> > > > > > >     Dag
> > > > > > >
> > > > > > >
> > > > > > >     Do you mean  check the pools with "Infrastructure ->
> Primary
> > > > > Storage"
> > > > > > > and "Infrastructure -> Secondary Storage" within the UI ?
> > > > > > >
> > > > > > >
> > > > > > >     If so Primary Storage has a state of UP, secondary storage
> > does
> > > > not
> > > > > > > show a state as such so not sure where else to check it ?
> > > > > > >
> > > > > > >
> > > > > > >     Rerun of the command -
> > > > > > >
> > > > > > >     mysql> select * from cloud.storage_pool where cluster_id =
> 1;
> > > > > > >     Empty set (0.00 sec)
> > > > > > >
> > > > > > >     mysql>
> > > > > > >
> > > > > > >     I think it is something to do with my zone creation rather than
> > > > > > >     the NIC/bridge setup, although I can post those if needed.
> > > > > > >
> > > > > > >     I may try to set up just the 2 NIC solution you mentioned,
> > > > > > >     although as I say I had the same issue with that, ie. the host
> > > > > > >     goes to "Alert" state with the same error messages. The only time
> > > > > > >     I can get it to go to "Down" state is when it is all on the
> > > > > > >     single NIC.
> > > > > > >
> > > > > > >     Quick question just to make sure - assuming management/storage
> > > > > > >     is on the same NIC: when I setup basic networking the physical
> > > > > > >     network has the management and guest icons already there and I
> > > > > > >     just edit the KVM labels. If I am running storage over management
> > > > > > >     do I need to drag the storage icon to the physical network and
> > > > > > >     use the same KVM label (cloudbr0) as the management, or does CS
> > > > > > >     automatically just use the management NIC, ie. I would only need
> > > > > > >     to drag the storage icon across in basic setup if I wanted it on
> > > > > > >     a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > > >
> > > > > > >     On the plus side I have been at this for so long now and done so
> > > > > > >     many rebuilds I could do it in my sleep now 😊
> > > > > > >
> > > > > > >
> > > > > > >     ________________________________
> > > > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > > > >     Sent: 06 June 2018 12:28
> > > > > > >     To: users@cloudstack.apache.org
> > > > > > >     Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > > > >
> > > > > > >     Looks OK to me Jon.
> > > > > > >
> > > > > > >     The one thing that throws me is your storage pools - can
> you
> > > > rerun
> > > > > > > your query: select * from cloud.storage_pool where cluster_id =
> > 1;
> > > > > > >
> > > > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > > > >
> > > > > > >     Regards,
> > > > > > >     Dag Sonstebo
> > > > > > >     Cloud Architect
> > > > > > >     ShapeBlue
> > > > > > >
> > > > > > >     On 06/06/2018, 12:08, "Jon Marshall" <
> jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > > > >
> > > > > > >         Don't know whether this helps or not but I logged into
> > the
> > > > SSVM
> > > > > > > and ran an ifconfig -
> > > > > > >
> > > > > > >
> > > > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> > 1500
> > > > > > >                 inet 169.254.3.35  netmask 255.255.0.0
> broadcast
> > > > > > > 169.254.255.255
> > > > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000
> > > (Ethernet)
> > > > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> > 1500
> > > > > > >                 inet 172.30.3.34  netmask 255.255.255.192
> > > broadcast
> > > > > > > 172.30.3.63
> > > > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000
> > > (Ethernet)
> > > > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> > 1500
> > > > > > >                 inet 172.30.4.86  netmask 255.255.255.128
> > > broadcast
> > > > > > > 172.30.4.127
> > > > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000
> > > (Ethernet)
> > > > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> > 1500
> > > > > > >                 inet 172.30.5.14  netmask 255.255.255.240
> > > broadcast
> > > > > > > 172.30.5.15
> > > > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000
> > > (Ethernet)
> > > > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >
> > > > > > >         so it has interfaces in both the management and the
> > storage
> > > > > > > subnets (as well as guest).
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         ________________________________
> > > > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > > > >         Sent: 06 June 2018 11:08
> > > > > > >         To: users@cloudstack.apache.org
> > > > > > >         Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > > > > >
> > > > > > >         Hi Rafael
> > > > > > >
> > > > > > >
> > > > > > >         Thanks for the help, really appreciate it.
> > > > > > >
> > > > > > >
> > > > > > >         So rerunning that command with all servers up -
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         mysql> select * from cloud.storage_pool where
> cluster_id
> > =
> > > 1
> > > > > and
> > > > > > > removed is null;
> > > > > > >         Empty set (0.00 sec)
> > > > > > >
> > > > > > >         mysql>
> > > > > > >
> > > > > > >
> > > > > > >         As for the storage IP no I'm not setting it to be the
> > > > > management
> > > > > > > IP when I setup the zone but the output of the SQL command
> > suggests
> > > > > that
> > > > > > is
> > > > > > > what has happened.
> > > > > > >
> > > > > > >         As I said to Dag I am using a different subnet for
> > storage
> > > > ie.
> > > > > > >
> > > > > > >         172.30.3.0/26  - management subnet
> > > > > > >         172.30.4.0/25 -  guest VM subnet
> > > > > > >         172.30.5.0/28 - storage
> > > > > > >
> > > > > > >         the NFS server IP is 172.30.5.2
> > > > > > >
> > > > > > >         each compute node has 3 NICs with an IP from each
> subnet
> > (i
> > > > am
> > > > > > > assuming the management node only needs an IP in the management
> > > > network
> > > > > > ?)
> > > > > > >
> > > > > > >         When I add the zone in the UI I have one physical
> network
> > > > with
> > > > > > > management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > > > > >         When I fill in the storage traffic page I use the range
> > > > > > > 172.16.5.10 - 14 as free IPs as I exclude the ones already
> > > allocated
> > > > to
> > > > > > the
> > > > > > > compute nodes and the NFS server.
> > > > > > >
> > > > > > >         I think maybe I am doing something wrong in the UI
> setup
> > > but
> > > > it
> > > > > > is
> > > > > > > not obvious to me what it is.
> > > > > > >
> > > > > > >         What I might try today unless you want me to keep the
> > > setup I
> > > > > > have
> > > > > > > for more outputs is to go back to 2 NICs, one for
> > > storage/management
> > > > > and
> > > > > > > one for guest VMs.
> > > > > > >
> > > > > > >         I think with the 2 NICs setup the mistake I made last
> > time
> > > > when
> > > > > > > adding the zone was to assume storage would just run over
> > > management
> > > > > so I
> > > > > > > did not drag and drop the storage icon and assign it to
> cloudbr0
> > as
> > > > > with
> > > > > > > the management which I think is what I should do ?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         ________________________________
> > > > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > > > >         Sent: 06 June 2018 10:54
> > > > > > >         To: users
> > > > > > >         Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > > > > >
> > > > > > >         Jon, do not panic, we are here to help you :)
> > > > > > >         So, I might have mistyped the SQL query. If you use "select *
> > > > > > >         from cloud.storage_pool where cluster_id = 1 and removed is
> > > > > > >         not null", you are listing the removed storage pools.
> > > > > > >         Therefore, the right query would be "select * from
> > > > > > >         cloud.storage_pool where cluster_id = 1 and removed is null".
> > > > > > >
> > > > > > >         There is also something else I do not understand. You are
> > > > > > >         setting the storage IP in the management subnet? I am not sure
> > > > > > >         you should be doing it like this. Normally, I set all my
> > > > > > >         storage (primary [when working with NFS] and secondary) to IPs
> > > > > > >         in the storage subnet.
> > > > > > >
> > > > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > > > > Dag.Sonstebo@shapeblue.com>
> > > > > > >         wrote:
> > > > > > >
> > > > > > >         > Hi John,
> > > > > > >         >
> > > > > > >         > I'm late to this thread and have possibly missed some
> > > > things
> > > > > -
> > > > > > > but a
> > > > > > >         > couple of observations:
> > > > > > >         >
> > > > > > >         > "When I add the zone and get to the storage web page
> I
> > > > > exclude
> > > > > > > the IPs
> > > > > > >         > already used for the compute node NICs and the NFS
> > server
> > > > > > > itself. ....."
> > > > > > >         > "So the range is 172.30.5.1 -> 15 and the range I
> fill
> > in
> > > > is
> > > > > > > 172.30.5.10
> > > > > > >         > -> 172.30.5.14."
> > > > > > >         >
> > > > > > >         > I think you may have some confusion around the use of
> > the
> > > > > > > storage network.
> > > > > > >         > The important part here is to understand this is for
> > > > > *secondary
> > > > > > > storage*
> > > > > > >         > use only - it has nothing to do with primary storage.
> > > This
> > > > > > means
> > > > > > > this
> > > > > > >         > storage network needs to be accessible to the SSVM,
> to
> > > the
> > > > > > > hypervisors, and
> > > > > > >         > secondary storage NFS pools needs to be accessible on
> > > this
> > > > > > > network.
> > > > > > >         >
> > > > > > >         > The important part - this also means you *can not use
> > the
> > > > > same
> > > > > > > IP ranges
> > > > > > >         > for management and storage networks* - doing so means
> > you
> > > > > will
> > > > > > > have issues
> > > > > > >         > where effectively both hypervisors and SSVM can see
> the
> > > > same
> > > > > > > subnet on two
> > > > > > >         > NICs - and you end up in a routing black hole.
> > > > > > >         >
> > > > > > >         > So - you need to either:
> > > > > > >         >
> > > > > > >         > 1) Use different IP subnets on management and
> storage,
> > or
> > > > > > >         > 2) preferably just simplify your setup - stop using a
> > > > > secondary
> > > > > > > storage
> > > > > > >         > network altogether and just allow secondary storage
> to
> > > use
> > > > > the
> > > > > > > management
> > > > > > >         > network (which is default). Unless you have a very
> high
> > > I/O
> > > > > > > environment in
> > > > > > >         > production you are just adding complexity by running
> > > > separate
> > > > > > > management
> > > > > > >         > and storage.
> > > > > > >         >
> > > > > > >         > Regards,
> > > > > > >         > Dag Sonstebo
> > > > > > >         > Cloud Architect
> > > > > > >         > ShapeBlue
> > > > > > >         >
> > > > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> > > > jms.123@hotmail.co.uk>
> > > > > > > wrote:
> > > > > > >         >
> > > > > > >         >     I will disconnect the host this morning and test
> > but
> > > > > before
> > > > > > > I do that
> > > > > > >         > I ran this command when all hosts are up -
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >      select * from cloud.host;
> > > > > > >         >     +----+-----------------+------
> > > > > > ------------------------------
> > > > > > >         > --+--------+------------------
> > > > --+--------------------+------
> > > > > > >         > -----------+------------------
> > > > ---+--------------------+-----
> > > > > > >         > ------------+-----------------
> > > > ----+----------------------+--
> > > > > > >         > ---------------------+--------
> > > > -----------+------------+-----
> > > > > > >         > --------------+---------------
> > > > --+--------------------+------
> > > > > > >         > ------+----------------+------
> > > > --+-------------+------+------
> > > > > > >         > -+----------------------------
> > > > ---------+---------+----------
> > > > > > >         > -------+--------------------+-
> > > > -----------+----------+-------
> > > > > > >         > ---+--------+------------+----
> > > > ----------+-------------------
> > > > > > >         > ------------------------------
> > > > --------------+-----------+---
> > > > > > >         > ----+-------------+-----------
> > > > -+----------------+-----------
> > > > > > >         > ----------+-------------------
> > > > --+---------+--------------+--
> > > > > > >         > --------------+-------+-------------+--------------+
> > > > > > >         >     | id | name            | uuid
> > > > > > >  | status
> > > > > > >         > | type               | private_ip_address |
> > > > private_netmask |
> > > > > > >         > private_mac_address | storage_ip_address |
> > > storage_netmask
> > > > |
> > > > > > >         > storage_mac_address | storage_ip_address_2 |
> > > > > > > storage_mac_address_2 |
> > > > > > >         > storage_netmask_2 | cluster_id | public_ip_address |
> > > > > > > public_netmask  |
> > > > > > >         > public_mac_address | proxy_port | data_center_id |
> > > pod_id |
> > > > > > > cpu_sockets |
> > > > > > >         > cpus | speed | url                                 |
> > > > fs_type
> > > > > |
> > > > > > >         > hypervisor_type | hypervisor_version | ram        |
> > > > resource
> > > > > |
> > > > > > > version  |
> > > > > > >         > parent | total_size | capabilities | guid
> > > > > > >         >                         | available | setup |
> > > dom0_memory |
> > > > > > > last_ping  |
> > > > > > >         > mgmt_server_id | disconnected        | created
> > > >  |
> > > > > > > removed |
> > > > > > >         > update_count | resource_state | owner | lastUpdated |
> > > > > > > engine_state |
> > > > > > >         >     +----+-----------------+------
> > > > > > ------------------------------
> > > > > > >         > --+--------+------------------
> > > > --+--------------------+------
> > > > > > >         > -----------+------------------
> > > > ---+--------------------+-----
> > > > > > >         > ------------+-----------------
> > > > ----+----------------------+--
> > > > > > >         > ---------------------+--------
> > > > -----------+------------+-----
> > > > > > >         > --------------+---------------
> > > > --+--------------------+------
> > > > > > >         > ------+----------------+------
> > > > --+-------------+------+------
> > > > > > >         > -+----------------------------
> > > > ---------+---------+----------
> > > > > > >         > -------+--------------------+-
> > > > -----------+----------+-------
> > > > > > >         > ---+--------+------------+----
> > > > ----------+-------------------
> > > > > > >         > ------------------------------
> > > > --------------+-----------+---
> > > > > > >         > ----+-------------+-----------
> > > > -+----------------+-----------
> > > > > > >         > ----------+-------------------
> > > > --+---------+--------------+--
> > > > > > >         > --------------+-------+-------------+--------------+
> > > > > > >         >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-
> > > > > > eabd60012284
> > > > > > > | Up
> > > > > > >         >  | Routing            | 172.30.3.3         |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 00:22:19:92:4e:34   | 172.30.3.3         |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 00:22:19:92:4e:34   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |          1 | 172.30.4.3        |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 00:22:19:92:4e:35  |       NULL |              1 |
> > > 1 |
> > > > > > >      1 |
> > > > > > >         >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 |
> > NULL
> > > > > |
> > > > > > > KVM
> > > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > > 4.11.0.0
> > > > > |
> > > > > > > NULL   |
> > > > > > >         >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-
> > > > > f83e7a5e8efb-
> > > > > > > LibvirtComputingResource
> > > > > > >         > |         1 |     0 |           0 | 1492390408 |
> > > > > >  146457912294 |
> > > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |
> > > > > > > 4 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-
> > > > > > 3f5e2c2215b0
> > > > > > > | Up
> > > > > > >         >  | ConsoleProxy       | 172.30.3.49        |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 1e:00:80:00:00:14   | 172.30.3.49        |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 1e:00:80:00:00:14   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |       NULL | 172.30.4.98       |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 1e:00:c9:00:00:5f  |       NULL |              1 |
> > > 1 |
> > > > > > >   NULL |
> > > > > > >         > NULL |  NULL | NoIqn                               |
> > NULL
> > > > > |
> > > > > > > NULL
> > > > > > >         >     | NULL               |          0 | NULL     |
> > > > 4.11.0.0 |
> > > > > > > NULL   |
> > > > > > >         >  NULL | NULL         | Proxy.2-ConsoleProxyResource
> > > > > > >         >           |         1 |     0 |           0 |
> > 1492390409
> > > |
> > > > > > >  146457912294 |
> > > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |
> > > > > > > 7 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-
> > > > > > ff3845bb556c
> > > > > > > | Up
> > > > > > >         >  | SecondaryStorageVM | 172.30.3.34        |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 1e:00:3b:00:00:05   | 172.30.3.34        |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 1e:00:3b:00:00:05   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |       NULL | 172.30.4.86       |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 1e:00:d9:00:00:53  |       NULL |              1 |
> > > 1 |
> > > > > > >   NULL |
> > > > > > >         > NULL |  NULL | NoIqn                               |
> > NULL
> > > > > |
> > > > > > > NULL
> > > > > > >         >     | NULL               |          0 | NULL     |
> > > > 4.11.0.0 |
> > > > > > > NULL   |
> > > > > > >         >  NULL | NULL         | s-1-VM-
> > NfsSecondaryStorageResource
> > > > > > >         >             |         1 |     0 |           0 |
> > > 1492390407
> > > > |
> > > > > > >  146457912294
> > > > > > >         > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL
>   |
> > > > > > >   7 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-
> > > > > > 1a96ffac9794
> > > > > > > | Up
> > > > > > >         >  | Routing            | 172.30.3.4         |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 00:26:b9:4a:97:7d   | 172.30.3.4         |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 00:26:b9:4a:97:7d   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |          1 | 172.30.4.4        |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 00:26:b9:4a:97:7e  |       NULL |              1 |
> > > 1 |
> > > > > > >      1 |
> > > > > > >         >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 |
> > NULL
> > > > > |
> > > > > > > KVM
> > > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > > 4.11.0.0
> > > > > |
> > > > > > > NULL   |
> > > > > > >         >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-
> > > > > 16d0f99b11c9-
> > > > > > > LibvirtComputingResource
> > > > > > >         > |         1 |     0 |           0 | 1492450882 |
> > > > > >  146457912294 |
> > > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |
> > > > > > > 8 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-
> > > > > > ee29d2b5c274
> > > > > > > | Up
> > > > > > >         >  | Routing            | 172.30.3.5         |
> > > > 255.255.255.192
> > > > > |
> > > > > > >         > 00:24:e8:73:6a:b2   | 172.30.3.5         |
> > > 255.255.255.192
> > > > |
> > > > > > >         > 00:24:e8:73:6a:b2   | NULL                 | NULL
> > > > > > >   | NULL
> > > > > > >         >             |          1 | 172.30.4.5        |
> > > > > 255.255.255.128
> > > > > > |
> > > > > > >         > 00:24:e8:73:6a:b3  |       NULL |              1 |
> > > 1 |
> > > > > > >      1 |
> > > > > > >         >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c |
> > NULL
> > > > > |
> > > > > > > KVM
> > > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > > 4.11.0.0
> > > > > |
> > > > > > > NULL   |
> > > > > > >         >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-
> > > > > 37f3eebad8fb-
> > > > > > > LibvirtComputingResource
> > > > > > >         > |         1 |     0 |           0 | 1492390408 |
> > > > > >  146457912294 |
> > > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |
> > > > > > > 6 |
> > > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > > >         >     +----+-----------------+------
> > > > > > ------------------------------
> > > > > > >         > --+--------+------------------
> > > > --+--------------------+------
> > > > > > >         > -----------+------------------
> > > > ---+--------------------+-----
> > > > > > >         > ------------+-----------------
> > > > ----+----------------------+--
> > > > > > >         > ---------------------+--------
> > > > -----------+------------+-----
> > > > > > >         > --------------+---------------
> > > > --+--------------------+------
> > > > > > >         > ------+----------------+------
> > > > --+-------------+------+------
> > > > > > >         > -+----------------------------
> > > > ---------+---------+----------
> > > > > > >         > -------+--------------------+-
> > > > -----------+----------+-------
> > > > > > >         > ---+--------+------------+----
> > > > ----------+-------------------
> > > > > > >         > ------------------------------
> > > > --------------+-----------+---
> > > > > > >         > ----+-------------+-----------
> > > > -+----------------+-----------
> > > > > > >         > ----------+-------------------
> > > > --+---------+--------------+--
> > > > > > >         > --------------+-------+-------------+--------------+
> > > > > > >         >     5 rows in set (0.00 sec)
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     and you can see that it says the storage IP
> address
> > > is
> > > > > the
> > > > > > > same as the
> > > > > > >         > private IP address (the management network).
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I also ran the command you provided using the
> > Cluster
> > > > ID
> > > > > > > number from
> > > > > > >         > the table above -
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     mysql> select * from cloud.storage_pool where
> > > > cluster_id
> > > > > =
> > > > > > 1
> > > > > > > and
> > > > > > >         > removed is not null;
> > > > > > >         >     Empty set (0.00 sec)
> > > > > > >         >
> > > > > > >         >     mysql>
> > > > > > >         >
> > > > > > >         >     So assuming I am reading this correctly that
> seems
> > to
> > > > be
> > > > > > the
> > > > > > > issue.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I am at a loss as to why though.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I have a separate NIC for storage as described.
> > When
> > > I
> > > > > add
> > > > > > > the zone
> > > > > > >         > and get to the storage web page I exclude the IPs
> > already
> > > > > used
> > > > > > > for the
> > > > > > >         > compute node NICs and the NFS server itself. I do
> this
> > > > > because
> > > > > > > initially I
> > > > > > >         > didn't and the SSVM started using the IP address of
> the
> > > NFS
> > > > > > > server.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     So the range is 172.30.5.1 -> 15 and the range I
> > fill
> > > > in
> > > > > is
> > > > > > >         > 172.30.5.10 -> 172.30.5.14.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     And I used the label "cloudbr2" for storage.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I must be doing this wrong somehow.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     Any pointers would be much appreciated.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     ________________________________
> > > > > > >         >     From: Rafael Weingärtner <
> > > rafaelweingartner@gmail.com>
> > > > > > >         >     Sent: 05 June 2018 16:13
> > > > > > >         >     To: users
> > > > > > >         >     Subject: Re: advanced networking with public IPs
> > > direct
> > > > > to
> > > > > > > VMs
> > > > > > >         >
> > > > > > >         >     That is interesting. Let's see the source of all
> > > > truth...
> > > > > > >         >     This is the code that is generating that odd
> > message.
> > > > > > >         >
> > > > > > >         >     > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > > > > >         >     > boolean hasNfs = false;
> > > > > > >         >     > for (StoragePoolVO pool : clusterPools) {
> > > > > > >         >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > > > > >         >     >         hasNfs = true;
> > > > > > >         >     >         break;
> > > > > > >         >     >     }
> > > > > > >         >     > }
> > > > > > >         >     > if (!hasNfs) {
> > > > > > >         >     >     s_logger.warn("Agent investigation was requested on host " + agent + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > > > > >         >     >     return Status.Disconnected;
> > > > > > >         >     > }
> > > > > > >         >     >
> > > > > > >         >
> > > > > > >         >     There are two possibilities here. You do not have any
> > > > > > >         >     NFS storage? Is that the case? Or maybe, for some
> > > > > > >         >     reason, the call
> > > > > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())"
> > > > > > >         >     is not returning any NFS storage pools. Looking at
> > > > > > >         >     "listPoolsByCluster" we will see that the following SQL
> > > > > > >         >     is used:
> > > > > > >         >
> > > > > > >         >     Select * from storage_pool where cluster_id =
> > > > > > >         >     <host'sClusterId> and removed is not null
> > > > > > >         >
> > > > > > >         >     Can you run that SQL to see its return when your hosts
> > > > > > >         >     are marked as disconnected?
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jms.123@hotmail.co.uk> wrote:
>
> > I reran the tests with the 3 NIC setup. When I configured the zone
> > through the UI I used the labels cloudbr0 for management, cloudbr1 for
> > guest traffic and cloudbr2 for NFS, as per my original response to you.
> >
> > When I pull the power to the node (dcp-cscn2.local), after about 5 mins
> > the host status goes to "Alert" but never to "Down".
> >
> > I get this in the logs -
> >
> > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> >
> > I don't understand why it thinks there is no NFS storage, as each compute
> > node has a dedicated storage NIC.
> >
> > I also don't understand why it thinks the host is still up, i.e. what
> > test is it doing to determine that?
> >
> > Am I just trying to get something working that is not supported?
> >
> > ________________________________
> > From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > Sent: 04 June 2018 15:31
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > What type of failover are you talking about?
> > What ACS version are you using?
> > What hypervisor are you using?
> > How are you configuring your NICs in the hypervisor?
> > How are you configuring the traffic labels in ACS?
>
> --
> Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
Did you notice some problems in the log files when you tested with 2 NICs?
When using NFS cluster wide storage, the behavior should be the same as
with 3 NICs. There might be something in your configuration. The problem
for zone wide storage is what we discussed before though.
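
If in doubt about which scope a pool ended up with, the same
cloud.storage_pool table you have been querying shows it directly:

select id, name, pool_type, scope, cluster_id from cloud.storage_pool where removed is null;

A cluster wide pool carries a real cluster_id, while a zone wide pool has
scope ZONE and a NULL cluster_id.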

1) if I want to run the management/storage traffic over the same NIC the
NFS server needs to be in the management subnet
No. You should be able to set up different network ranges for each one of
them.

2) when I do the initial configuration I need to drag and drop the storage
icon and use the same label as the management traffic
If you are using only two NICs, you certainly need to configure the
traffic labels accordingly. I mean, you have only two NICs, so you need to
configure the labels (cloudbr0 and cloudbr2) in the physical network tab
in the zone configuration.
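
To sanity-check what actually got registered, the traffic labels live in
the physical_network_traffic_types table (assuming the standard cloud
schema, where the KVM label column is kvm_network_label):

select traffic_type, kvm_network_label from cloud.physical_network_traffic_types;

Each traffic type should map to the bridge you expect (cloudbr0, cloudbr2,
and so on).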


On Thu, Jun 14, 2018 at 5:03 PM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Hi Rafael
>
>
> I did log a bug, but when rebuilding I found some slightly different
> behaviour, so I have temporarily removed it.
>
>
> So using cluster NFS and 3 NICs as already described, VM HA works.
>
>
> Because the recommendation for basic networking seems to be to run
> storage/management over the same NIC and guest traffic on another (2 NICs
> in total), I set it up this way using cluster NFS, and to my surprise VM
> HA did not work - so it is obviously a bit more complicated than it first
> appeared.
>
>
> My NFS server is on a different subnet than the management server, and
> because the storage traffic runs over the management NIC by default, when
> I set it up in the UI I did not assign a label to the storage traffic,
> i.e. I only assigned labels to management and guest.
>
>
> So two thoughts occur which I can test unless you can see the issue -
>
>
> 1) if I want to run the management/storage traffic over the same NIC the
> NFS server needs to be in the management subnet
>
>
> or
>
>
> 2) when I do the initial configuration I need to drag and drop the storage
> icon and use the same label as the management traffic
>
>
> Personally I can't see how 2) will help, i.e. the only time I should need
> to assign a label to storage is if I use a different NIC.
>
>
> Apologies for bringing this up again, but I am happy to run any tests and
> would like to file an accurate bug report.
>
>
>
>
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 11 June 2018 10:58
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Well, it seems that you have found a bug. Can you fill out an issue report
> on Github?
>
> Thanks for the hard work on debugging and testing.
>
> On Fri, Jun 8, 2018 at 2:17 PM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > So based on Erik's suggestion (thanks Erik) I rebuilt the management
> > server and set up cluster wide primary storage, as opposed to the zone
> > wide storage which I have been using so far.
> >
> > Still using 3 NICs (management/guest/storage) and basic networking.
> >
> > And VM HA now works. In addition, it failed over quicker than it did
> > when I had zone wide NFS storage on a single NIC.
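> >
> > For reference, rerunning the earlier query should now show the
> > difference (column names as in the cloud.storage_pool output further
> > down this thread):
> >
> > select id, name, pool_type, scope, cluster_id from cloud.storage_pool where removed is null;
> >
> > The cluster wide pool should come back with scope CLUSTER and a non-NULL
> > cluster_id, which is presumably why KVMInvestigator can now find it.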
> >
> >
> > Still a bit confused about this output, where it is still showing the
> > storage_ip_address as 172.30.3.x IPs, which is the management subnet -
> > but maybe I am reading it incorrectly.
> >
> >
> >
> > mysql> select * from cloud.host;
> > (columns trimmed for readability)
> > +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id | public_ip_address |
> > +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 | 172.30.4.3        |
> > |  2 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.55        | 172.30.3.55        |       NULL | 172.30.4.66       |
> > |  3 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.40        | 172.30.3.40        |       NULL | 172.30.4.62       |
> > |  4 | dcp-cscn2.local | Down   | Routing            | 172.30.3.4         | 172.30.3.4         |          1 | 172.30.4.4        |
> > |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 | 172.30.4.5        |
> > +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > 5 rows in set (0.00 sec)
> >
> > mysql>
> >
> > So some sort of bug, maybe?
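> >
> > A quick way to see the overlap directly (assuming I am reading the
> > schema right) is:
> >
> > select id, name, private_ip_address, storage_ip_address from cloud.host where removed is null;
> >
> > Every host and system VM comes back with the same address in both
> > columns.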
> >
> > ________________________________
> > From: Erik Weber <te...@gmail.com>
> > Sent: 08 June 2018 10:15
> > To: users@cloudstack.apache.org
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > While someone ponders the zone wide storage question, you could try
> > adding a cluster wide NFS storage pool and see if the rest works in that
> > setup.
> > Erik
> >
> > On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > Yes, all basic. I read a ShapeBlue doc that recommended splitting
> > > traffic across multiple NICs even in basic networking mode, so that is
> > > what I am trying to do.
> > >
> > > With a single NIC you do not get the NFS storage message.
> > >
> > > I have the entire management server logs for both scenarios after I
> > > pulled the power to one of the compute nodes; from the single NIC setup
> > > these seem to be the relevant lines -
> > >
> > >
> > > 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4 with status Down
> > > 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> > > 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down, name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> > > 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator] (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host with specified id is not in the right state: Down
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5 returned status:Down for the investigated host:4
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible legacy state Down for host 4
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to determine host 4 is in Down
> > > 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state determined is Down
> > > 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down: 4-dcp-cscn2.local. Starting HA on the VMs
> > >
> > > At the moment I only need to assign public IPs direct to VMs rather than
> > > using NAT with the virtual router, but would be happy to go with
> > > advanced networking if it would make things easier :)
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:35
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Ah, so it is not an advanced setup, even when you use multiple NICs.
> > > Can you confirm that the message ""Agent investigation was requested on
> > > host, but host does not support investigation because it has no NFS
> > > storage. Skipping investigation." does not appear when you use a single
> > > NIC? Can you check other log entries that might appear when the host is
> > > marked as "down"?
> > >
> > > On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > It is all basic networking at the moment for all the setups.
> > > >
> > > > If you want me to, I can set up a single NIC solution again and run
> > > > any commands you need me to.
> > > >
> > > > FYI, when I set up a single NIC I use the guided installation option
> > > > in the UI rather than the manual setup which I do for the multiple
> > > > NIC scenario.
> > > >
> > > > Happy to set it up if it helps.
> > > >
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 10:23
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > Ok, so that explains the log message. This is looking like a bug to
> > > > me. It seems that with Zone wide storage the host state (when
> > > > disconnected) is not being properly identified due to this NFS check,
> > > > and as a consequence it has a side effect on VM HA.
> > > >
> > > > We would need some input from guys that have advanced networking
> > > > deployments and Zone wide storage.
> > > >
> > > > I do not see how the all in one NIC deployment scenario is working
> > > > though. The method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)"
> > > > is dead simple: if there is no NFS in the cluster (no NFS storage
> > > > pools found for the host's cluster), KVM hosts will be detected as
> > > > "disconnected" and not down, with that warning message you noticed.
> > > >
> > > > When you say "all in one NIC", is it an advanced network deployment
> > > > where you put all traffic in a single network, or is it basic
> > > > networking that you are doing?
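> > > >
> > > > If I recall the DAO correctly (worth double-checking in the source),
> > > > that check boils down to a query of roughly this shape:
> > > >
> > > > select * from storage_pool where cluster_id = <host's cluster id> and removed is null;
> > > >
> > > > A zone wide pool has a NULL cluster_id, so it would never match here -
> > > > which would explain why the NFS check fails for you even though the
> > > > pool exists.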
> > > >
> > > > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > > wrote:
> > > >
> > > > > zone wide.
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > Sent: 07 June 2018 10:04
> > > > > To: users
> > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > > What type of storage are you using? Zone wide? Or cluster "wide"
> > > storage?
> > > > >
> > > > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <
> jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > >
> > > > > > Rafael
> > > > > >
> > > > > >
> > > > > > Here is the output as requested -
> > > > > >
> > > > > >
> > > > > >
> > > > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > > > (columns trimmed for readability)
> > > > > > +----+------+-------------------+--------------+-----------------+------------+--------+-------+------------+
> > > > > > | id | name | pool_type         | host_address | path            | cluster_id | status | scope | hypervisor |
> > > > > > +----+------+-------------------+--------------+-----------------+------------+--------+-------+------------+
> > > > > > |  1 | ds1  | NetworkFilesystem | 172.30.5.2   | /export/primary |       NULL | Up     | ZONE  | KVM        |
> > > > > > +----+------+-------------------+--------------+-----------------+------------+--------+-------+------------+
> > > > > > 1 row in set (0.00 sec)
> > > > > >
> > > > > > mysql>
> > > > > >
> > > > > > Do you think this problem is related to my NIC/bridge configuration
> > > > > > or the way I am configuring the zone?
> > > > > >
> > > > > > Jon
> > > > > > ________________________________
> > > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > > Sent: 07 June 2018 06:45
> > > > > > To: users
> > > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > > Can you also post the result of:
> > > > > > select * from cloud.storage_pool where removed is null
> > > > > >
> > > > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> > > > Dag.Sonstebo@shapeblue.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Jon,
> > > > > > >
> > > > > > > Still confused where your primary storage pools are – are you sure your
> > > > > > > hosts are in cluster 1?
> > > > > > >
> > > > > > > > Quick question just to make sure - assuming management/storage is on
> > > > > > > > the same NIC: when I set up basic networking the physical network has
> > > > > > > > the management and guest icons already there and I just edit the KVM
> > > > > > > > labels. If I am running storage over management do I need to drag the
> > > > > > > > storage icon to the physical network and use the same KVM label
> > > > > > > > (cloudbr0) as the management, or does CS automatically just use the
> > > > > > > > management NIC, i.e. I would only need to drag the storage icon across
> > > > > > > > in basic setup if I wanted it on a different NIC/IP subnet? (hope that
> > > > > > > > makes sense!)
> > > > > > >
> > > > > > > I would do both – set up your 2/3 physical networks, the name isn't that
> > > > > > > important – but then drag the traffic types to the correct one and make
> > > > > > > sure the labels are correct.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Dag Sonstebo
> > > > > > > Cloud Architect
> > > > > > > ShapeBlue
> > > > > > >
> > > > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk>
> > > wrote:
> > > > > > >
> > > > > > >     Dag
> > > > > > >
> > > > > > >     Do you mean check the pools with "Infrastructure -> Primary Storage"
> > > > > > >     and "Infrastructure -> Secondary Storage" within the UI?
> > > > > > >
> > > > > > >     If so, Primary Storage has a state of Up; Secondary Storage does not
> > > > > > >     show a state as such, so I am not sure where else to check it.
> > > > > > >
> > > > > > >     Rerun of the command -
> > > > > > >
> > > > > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > > > > >     Empty set (0.00 sec)
> > > > > > >
> > > > > > >     mysql>
> > > > > > >
> > > > > > >     I think it is something to do with my zone creation rather than the
> > > > > > >     NIC/bridge setup, although I can post those if needed.
> > > > > > >
> > > > > > >     I may try to set up just the 2 NIC solution you mentioned, although
> > > > > > >     as I say I had the same issue with that, i.e. the host goes to
> > > > > > >     "Alert" state with the same error messages. The only time I can get
> > > > > > >     it to go to "Down" state is when it is all on the single NIC.
> > > > > > >
> > > > > > >     Quick question just to make sure - assuming management/storage is on
> > > > > > >     the same NIC: when I set up basic networking the physical network has
> > > > > > >     the management and guest icons already there and I just edit the KVM
> > > > > > >     labels. If I am running storage over management do I need to drag the
> > > > > > >     storage icon to the physical network and use the same KVM label
> > > > > > >     (cloudbr0) as the management, or does CS automatically just use the
> > > > > > >     management NIC, i.e. I would only need to drag the storage icon
> > > > > > >     across in basic setup if I wanted it on a different NIC/IP subnet?
> > > > > > >     (hope that makes sense!)
> > > > > > >
> > > > > > >     On the plus side I have been at this for so long now and done so many
> > > > > > >     rebuilds I could do it in my sleep now :)
> > > > > > >
> > > > > > >
> > > > > > >     ________________________________
> > > > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > > > >     Sent: 06 June 2018 12:28
> > > > > > >     To: users@cloudstack.apache.org
> > > > > > >     Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > > > >
> > > > > > >     Looks OK to me Jon.
> > > > > > >
> > > > > > >     The one thing that throws me is your storage pools – can you rerun
> > > > > > >     your query: select * from cloud.storage_pool where cluster_id = 1;
> > > > > > >
> > > > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > > > >
> > > > > > >     Regards,
> > > > > > >     Dag Sonstebo
> > > > > > >     Cloud Architect
> > > > > > >     ShapeBlue
> > > > > > >
> > > > > > >     On 06/06/2018, 12:08, "Jon Marshall" <
> jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > > > >
> > > > > > >         Don't know whether this helps or not but I logged into the SSVM
> > > > > > >         and ran an ifconfig -
> > > > > > >
> > > > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> > > > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > > > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> > > > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > > > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> > > > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > > > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> > > > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > > > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > > >
> > > > > > >         so it has interfaces in both the management and the storage
> > > > > > >         subnets (as well as guest).
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         ________________________________
> > > > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > > > >         Sent: 06 June 2018 11:08
> > > > > > >         To: users@cloudstack.apache.org
> > > > > > >         Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > > > > >
> > > > > > >         Hi Rafael
> > > > > > >
> > > > > > >         Thanks for the help, really appreciate it.
> > > > > > >
> > > > > > >         So rerunning that command with all servers up -
> > > > > > >
> > > > > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > > > > >         Empty set (0.00 sec)
> > > > > > >
> > > > > > >         mysql>
> > > > > > >
> > > > > > >         As for the storage IP, no, I'm not setting it to be the management
> > > > > > >         IP when I set up the zone, but the output of the SQL command
> > > > > > >         suggests that is what has happened.
> > > > > > >
> > > > > > >         As I said to Dag, I am using a different subnet for storage, i.e.
> > > > > > >
> > > > > > >         172.30.3.0/26 - management subnet
> > > > > > >         172.30.4.0/25 - guest VM subnet
> > > > > > >         172.30.5.0/28 - storage
> > > > > > >
> > > > > > >         the NFS server IP is 172.30.5.2
> > > > > > >
> > > > > > >         each compute node has 3 NICs with an IP from each subnet (I am
> > > > > > >         assuming the management node only needs an IP in the management
> > > > > > >         network?)
> > > > > > >
> > > > > > >         When I add the zone in the UI I have one physical network with
> > > > > > >         management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > > > > >         When I fill in the storage traffic page I use the range
> > > > > > >         172.30.5.10 - 14 as free IPs, as I exclude the ones already
> > > > > > >         allocated to the compute nodes and the NFS server.
> > > > > > >
> > > > > > >         I think maybe I am doing something wrong in the UI setup but it
> > > > > > >         is not obvious to me what it is.
> > > > > > >
> > > > > > >         What I might try today, unless you want me to keep the setup I
> > > > > > >         have for more outputs, is to go back to 2 NICs, one for
> > > > > > >         storage/management and one for guest VMs.
> > > > > > >
> > > > > > >         I think with the 2 NIC setup the mistake I made last time when
> > > > > > >         adding the zone was to assume storage would just run over
> > > > > > >         management, so I did not drag and drop the storage icon and
> > > > > > >         assign it to cloudbr0 as with the management, which I think is
> > > > > > >         what I should do?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         ________________________________
> > > > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > > > >         Sent: 06 June 2018 10:54
> > > > > > >         To: users
> > > > > > >         Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > > > > >
> > > > > > >         Jon, do not panic, we are here to help you :)
> > > > > > >         So, I might have mistyped the SQL query. If you use "select * from
> > > > > > >         cloud.storage_pool where cluster_id = 1 and removed is not null",
> > > > > > >         you are listing the storage pools that have been removed. Therefore,
> > > > > > >         the right query would be "select * from cloud.storage_pool where
> > > > > > >         cluster_id = 1 and removed is null".
> > > > > > >
> > > > > > >         There is also something else I do not understand. You are setting
> > > > > > >         the storage IP in the management subnet? I am not sure if you
> > > > > > >         should be doing it like this. Normally, I set all my storages
> > > > > > >         (primary [when working with NFS] and secondary) to IPs in the
> > > > > > >         storage subnet.
> > > > > > >
> > > > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > > > > Dag.Sonstebo@shapeblue.com>
> > > > > > >         wrote:
> > > > > > >
> > > > > > >         > Hi Jon,
> > > > > > >         >
> > > > > > >         > I'm late to this thread and have possibly missed some things – but a
> > > > > > >         > couple of observations:
> > > > > > >         >
> > > > > > >         > "When I add the zone and get to the storage web page I exclude the
> > > > > > >         > IPs already used for the compute node NICs and the NFS server
> > > > > > >         > itself. ….."
> > > > > > >         > "So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > > > >         > 172.30.5.10 -> 172.30.5.14."
> > > > > > >         >
> > > > > > >         > I think you may have some confusion around the use of the storage
> > > > > > >         > network. The important part here is to understand this is for
> > > > > > >         > *secondary storage* use only – it has nothing to do with primary
> > > > > > >         > storage. This means this storage network needs to be accessible to
> > > > > > >         > the SSVM and to the hypervisors, and secondary storage NFS pools
> > > > > > >         > need to be accessible on this network.
> > > > > > >         >
> > > > > > >         > The important part – this also means you *can not use the same IP
> > > > > > >         > ranges for management and storage networks* - doing so means you
> > > > > > >         > will have issues where effectively both hypervisors and SSVM can
> > > > > > >         > see the same subnet on two NICs – and you end up in a routing black
> > > > > > >         > hole.
> > > > > > >         >
> > > > > > >         > So – you need to either:
> > > > > > >         >
> > > > > > >         > 1) Use different IP subnets on management and storage, or
> > > > > > >         > 2) preferably just simplify your setup – stop using a secondary
> > > > > > >         > storage network altogether and just allow secondary storage to use
> > > > > > >         > the management network (which is default). Unless you have a very
> > > > > > >         > high I/O environment in production you are just adding complexity
> > > > > > >         > by running separate management and storage (a quick query to check
> > > > > > >         > this is sketched below).
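> > > > > > >         >
> > > > > > >         > To see where secondary storage is actually pointing, something like
> > > > > > >         > this should do (assuming the standard 4.11 cloud schema, where
> > > > > > >         > secondary storage lives in the image_store table):
> > > > > > >         >
> > > > > > >         > select id, name, protocol, url, role from cloud.image_store where removed is null;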
> > > > > > >         >
> > > > > > >         > Regards,
> > > > > > >         > Dag Sonstebo
> > > > > > >         > Cloud Architect
> > > > > > >         > ShapeBlue
> > > > > > >         >
> > > > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> > > > jms.123@hotmail.co.uk>
> > > > > > > wrote:
> > > > > > >         >
> > > > > > >         >     I will disconnect the host this morning and test, but before I
> > > > > > >         >     do that I ran this command when all hosts are up -
> > > > > > >         >
> > > > > > >         >     select * from cloud.host;
> > > > > > >         >     (columns trimmed for readability)
> > > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > > > > >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id | public_ip_address |
> > > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > > > > >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 | 172.30.4.3        |
> > > > > > >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        |       NULL | 172.30.4.98       |
> > > > > > >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        |       NULL | 172.30.4.86       |
> > > > > > >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         |          1 | 172.30.4.4        |
> > > > > > >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 | 172.30.4.5        |
> > > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > > > > >         >     5 rows in set (0.00 sec)
> > > > > > >         >
> > > > > > >         >     and you can see that it says the storage IP address is the same
> > > > > > >         >     as the private IP address (the management network).
> > > > > > >         >
> > > > > > >         >     I also ran the command you provided using the Cluster ID number
> > > > > > >         >     from the table above -
> > > > > > >         >
> > > > > > >         >     mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
> > > > > > >         >     Empty set (0.00 sec)
> > > > > > >         >
> > > > > > >         >     mysql>
> > > > > > >         >
> > > > > > >         >     So assuming I am reading this correctly, that seems to be the
> > > > > > >         >     issue.
> > > > > > >         >
> > > > > > >         >     I am at a loss as to why though.
> > > > > > >         >
> > > > > > >         >     I have a separate NIC for storage as described. When I add the
> > > > > > >         >     zone and get to the storage web page I exclude the IPs already
> > > > > > >         >     used for the compute node NICs and the NFS server itself. I do
> > > > > > >         >     this because initially I didn't, and the SSVM started using the
> > > > > > >         >     IP address of the NFS server.
> > > > > > >         >
> > > > > > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > > > >         >     172.30.5.10 -> 172.30.5.14.
> > > > > > >         >
> > > > > > >         >     And I used the label "cloudbr2" for storage.
> > > > > > >         >
> > > > > > >         >     I must be doing this wrong somehow.
> > > > > > >         >
> > > > > > >         >     Any pointers would be much appreciated.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     ________________________________
> > > > > > >         >     From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > > > > > >         >     Sent: 05 June 2018 16:13
> > > > > > >         >     To: users
> > > > > > >         >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > > > >         >
> > > > > > >         >     That is interesting. Let's see the source of all truth...
> > > > > > >         >     This is the code that is generating that odd message:
> > > > > > >         >
> > > > > > >         >     > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > > > > >         >     > boolean hasNfs = false;
> > > > > > >         >     > for (StoragePoolVO pool : clusterPools) {
> > > > > > >         >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > > > > >         >     >         hasNfs = true;
> > > > > > >         >     >         break;
> > > > > > >         >     >     }
> > > > > > >         >     > }
> > > > > > >         >     > if (!hasNfs) {
> > > > > > >         >     >     s_logger.warn("Agent investigation was requested on host " + agent
> > > > > > >         >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > > > > >         >     >     return Status.Disconnected;
> > > > > > >         >     > }
> > > > > > >         >
> > > > > > >         >     There are two possibilities here. You do not have any NFS storage? Is
> > > > > > >         >     that the case? Or maybe, for some reason, the call
> > > > > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> > > > > > >         >     returning any NFS storage pools. Looking at the "listPoolsByCluster"
> > > > > > >         >     we will see that the following SQL is used:
> > > > > > >         >
> > > > > > >         >     > Select * from storage_pool where cluster_id = <host'sClusterId> and removed is not null
> > > > > > >         >     >
> > > > > > >         >
> > > > > > >         >     Can you run that SQL to see its return when your hosts are marked as
> > > > > > >         >     disconnected?
> > > > > > >         >
> > > > > > >         >
> > > > > > >         > Dag.Sonstebo@shapeblue.com
> > > > > > >         > www.shapeblue.com<http://www.shapeblue.com>
> > > > > > >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > > >         > @shapeblue
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > > > > > > jms.123@hotmail.co.uk>
> > > > > > >         > wrote:
> > > > > > >         >
> > > > > > >         >     > I reran the tests with the 3 NIC setup. When I configured the zone
> > > > > > >         >     > through the UI I used the labels cloudbr0 for management, cloudbr1
> > > > > > >         >     > for guest traffic and cloudbr2 for NFS as per my original response
> > > > > > >         >     > to you.
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > When I pull the power to the node (dcp-cscn2.local) after about 5
> > > > > > >         >     > mins the host status goes to "Alert" but never to "Down"
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > I get this in the logs -
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > > > > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > > > > > >         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > > > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> > > > > > >         >     >
> > > > > > >         >     > I don't understand why it thinks there is no NFS storage as each
> > > > > > >         >     > compute node has a dedicated storage NIC.
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > I also don't understand why it thinks the host is still up ie. what
> > > > > > >         >     > test is it doing to determine that ?
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > Am I just trying to get something working that is not supported ?
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > ________________________________
> > > > > > >         >     > From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > > > > > >         >     > Sent: 04 June 2018 15:31
> > > > > > >         >     > To: users
> > > > > > >         >     > Subject: Re: advanced networking with public IPs direct to VMs
> > > > > > >         >     >
> > > > > > >         >     > What type of failover are you talking about?
> > > > > > >         >     > What ACS version are you using?
> > > > > > >         >     > What hypervisor are you using?
> > > > > > >         >     > How are you configuring your NICs in the hypervisor?
> > > > > > >         >     > How are you configuring the traffic labels in ACS?
> > > > > > >         >     >
> > > > > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk>
> > > > > > >         >     > wrote:
> > > > > > >         >     >
> > > > > > >         >     > > Hi all
> > > > > > >         >     > >
> > > > > > >         >     > >
> > > > > > >         >     > > I am close to giving up on basic networking as I just cannot get
> > > > > > >         >     > > failover working with multiple NICs (I am not even sure it is
> > > > > > >         >     > > supported).
> > > > > > >         >     > >
> > > > > > >         >     > >
> > > > > > >         >     > > What I would like is to use 3 NICs for management, storage and
> > > > > > >         >     > > guest traffic. I would like to assign public IPs direct to the
> > > > > > >         >     > > VMs which is why I originally chose basic.
> > > > > > >         >     > >
> > > > > > >         >     > >
> > > > > > >         >     > > If I switch to advanced networking do I just configure a guest VM
> > > > > > >         >     > > with public IPs on one NIC and not bother with the public traffic -
> > > > > > >         >     > >
> > > > > > >         >     > >
> > > > > > >         >     > > would this work ?
> > > > > > >         >     > >
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > --
> > > > > > >         >     > Rafael Weingärtner
> > > > > > >         >     >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     --
> > > > > > >         >     Rafael Weingärtner
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >
> > > > > > >
> > > > > > >         --
> > > > > > >         Rafael Weingärtner
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >     Dag.Sonstebo@shapeblue.com
> > > > > > >     www.shapeblue.com<http://www.shapeblue.com>
> > > > > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > > >     @shapeblue
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rafael Weingärtner
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rafael Weingärtner
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
>
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Rafael


I did log a bug but when rebuilding I found some slightly different behaviour so have temporarily removed it.


So using cluster NFS and 3 NICs as already described VM HA works.


Because the recommendation for a basic network setup seems to be to run storage/management over the same NIC and guest traffic on another (2 NICs in total), I set it up this way using cluster NFS and, to my surprise, VM HA did not work, so it is obviously a bit more complicated than it first appeared.


My NFS server is on a different subnet from the management server, and when I set the zone up in the UI I did not assign a label to the storage traffic (because the storage traffic runs over the management NIC by default), ie. I only assigned labels to management and guest.


So two thoughts occur which I can test unless you can see the issue -


1) if I want to run the management/storage traffic over the same NIC the NFS server needs to be in the management subnet


or


2) when I do the initial configuration I need to drag and drop the storage icon and use the same label as the management traffic


Personally I can't see how 2) will help, ie. the only time I should need to assign a label to storage is if I use a different NIC.


Apologies for bringing this up again but I am happy to run any tests and would like to file an accurate bug report.
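
One more check I can run for the bug report - a minimal sketch, assuming the UI traffic labels end up in the physical_network_traffic_types table (table and column names are my assumption from the 4.11 schema):

-- which KVM label each traffic type was actually stored with
select physical_network_id, traffic_type, kvm_network_label
from cloud.physical_network_traffic_types;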






________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 11 June 2018 10:58
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Well, it seems that you have found a bug. Can you fill out an issue report
on Github?

Thanks for the hard work on debugging and testing.

On Fri, Jun 8, 2018 at 2:17 PM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> So based on Erik's suggestion (thanks Erik) I rebuilt the management
> server and setup cluster wide primary storage as opposed to zone wide which
> I have been using so far.
>
>
> Still using 3 NICs (management/Guest/storage) and basic networking.
>
>
> And VM HA now works. In addition it failed over quicker than it did when I
> had zone wide NFS storage on a single NIC.
>
>
> Still a bit confused about this output where it is still showing the
> storage_ip_addresses as 172.30.3.x IPs which is the management subnet but
> maybe I am reading it incorrectly.
>
>
>
> mysql> select * from cloud.host;
> +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
> | id | name            | uuid                                 | status |
> type               | private_ip_address | private_netmask |
> private_mac_address | storage_ip_address | storage_netmask |
> storage_mac_address | storage_ip_address_2 | storage_mac_address_2 |
> storage_netmask_2 | cluster_id | public_ip_address | public_netmask  |
> public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets |
> cpus | speed | url                                 | fs_type |
> hypervisor_type | hypervisor_version | ram        | resource | version  |
> parent | total_size | capabilities | guid
>                         | available | setup | dom0_memory | last_ping  |
> mgmt_server_id | disconnected        | created             | removed |
> update_count | resource_state | owner | lastUpdated | engine_state |
> +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
> |  1 | dcp-cscn1.local | 372c738c-5370-4b46-9358-14b649c73d6b | Up     |
> Routing            | 172.30.3.3         | 255.255.255.192 |
> 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
> 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.3        | 255.255.255.128 |
> 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |
>   2 |  3000 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
> |         1 |     0 |           0 | 1492635804 |   146457912294 |
> 2018-06-08 11:57:31 | 2018-06-08 11:19:36 | NULL    |            4 |
> Enabled        | NULL  | NULL        | Disabled     |
> |  2 | s-1-VM          | 8f40f8a9-1e97-4fce-a143-a8c9489a6d5b | Up     |
> SecondaryStorageVM | 172.30.3.55        | 255.255.255.192 |
> 1e:00:ab:00:00:24   | 172.30.3.55        | 255.255.255.192 |
> 1e:00:ab:00:00:24   | NULL                 | NULL                  | NULL
>             |       NULL | 172.30.4.66       | 255.255.255.128 |
> 1e:00:c4:00:00:67  |       NULL |              1 |      1 |        NULL |
> NULL |  NULL | NoIqn                               | NULL    | NULL
>     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
>  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
>             |         1 |     0 |           0 | 1492635804 |   146457912294
> | 2018-06-08 11:57:31 | 2018-06-08 11:21:51 | NULL    |            7 |
> Enabled        | NULL  | NULL        | Disabled     |
> |  3 | v-2-VM          | fe5b4f8b-b5b0-4018-8ee2-4c5384da073b | Up     |
> ConsoleProxy       | 172.30.3.40        | 255.255.255.192 |
> 1e:00:96:00:00:15   | 172.30.3.40        | 255.255.255.192 |
> 1e:00:96:00:00:15   | NULL                 | NULL                  | NULL
>             |       NULL | 172.30.4.62       | 255.255.255.128 |
> 1e:00:01:00:00:63  |       NULL |              1 |      1 |        NULL |
> NULL |  NULL | NoIqn                               | NULL    | NULL
>     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
>  NULL | NULL         | Proxy.2-ConsoleProxyResource
>           |         1 |     0 |           0 | 1492635804 |   146457912294 |
> 2018-06-08 11:57:31 | 2018-06-08 11:22:03 | NULL    |            7 |
> Enabled        | NULL  | NULL        | Disabled     |
> |  4 | dcp-cscn2.local | 935260eb-a80c-4ead-85d7-3df8212e301b | Down   |
> Routing            | 172.30.3.4         | 255.255.255.192 |
> 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
> 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.4        | 255.255.255.128 |
> 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |
>   2 |  3000 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
> |         1 |     0 |           0 | 1492635804 |           NULL |
> 2018-06-08 11:57:31 | 2018-06-08 11:35:07 | NULL    |            7 |
> Enabled        | NULL  | NULL        | Disabled     |
> |  5 | dcp-cscn3.local | f3cabc9e-9679-4d7e-8297-b6765eea2770 | Up     |
> Routing            | 172.30.3.5         | 255.255.255.192 |
> 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
> 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.5        | 255.255.255.128 |
> 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |
>   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
> |         1 |     0 |           0 | 1492635804 |   146457912294 |
> 2018-06-08 11:57:31 | 2018-06-08 11:36:27 | NULL    |            4 |
> Enabled        | NULL  | NULL        | Disabled     |
> +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
> 5 rows in set (0.00 sec)
>
> mysql>
>
> So some sort of bug maybe ?
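>
> For what it's worth, a narrower projection makes the mismatch easier to see
> than the full dump - a quick sketch using only columns from the output above:
>
> select id, name, type, private_ip_address, storage_ip_address
> from cloud.host
> where removed is null;
>
> storage_ip_address matches private_ip_address (172.30.3.x) on every row even
> though storage runs on 172.30.5.x.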
>
> ________________________________
> From: Erik Weber <te...@gmail.com>
> Sent: 08 June 2018 10:15
> To: users@cloudstack.apache.org
> Subject: Re: advanced networking with public IPs direct to VMs
>
> While someone ponders about the zone wide storage, you could try adding a
> cluster wide nfs storage and see if the rest works in that setup.
>
> Erik
>
> On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Yes, all basic. I read a Shapeblue doc that recommended splitting traffic
> > across multiple NICs even in basic networking mode so that is what I am
> > trying to do.
> >
> >
> > With single NIC you do not get the NFS storage message.
> >
> >
> > I have the entire management server logs for both scenarios after I pulled
> > the power to one of the compute nodes but from the single NIC setup these
> > seem to be the relevant lines -
> >
> >
> > 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl]
> > (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4
> > with status Down
> > 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status]
> > (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource
> state
> > = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> > 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts]
> > (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 |
> > dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down,
> > name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> > 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator]
> > (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the
> > agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host
> > with specified id is not in the right state: Down
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5
> > returned status:Down for the investigated host:4
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible
> > legacy state Down for host 4
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to
> > determine host 4 is in Down
> > 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state
> > determined is Down
> > 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down:
> > 4-dcp-cscn2.local. Starting HA on the VMs
> >
> > At the moment I only need to assign public IPs direct to VMs rather than
> > using NAT with the virtual router but would be happy to go with advanced
> > networking if it would make things easier :)
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 10:35
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > Ah so, it is not an advanced setup; even when you use multiple NICs.
> > Can you confirm that the message ""Agent investigation was requested on
> > host, but host does not support investigation because it has no NFS
> > storage. Skipping investigation." does not appear when you use a single
> > NIC? Can you check other log entries that might appear when the host is
> > marked as "down"?
> >
> > On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > It is all basic networking at the moment for all the setups.
> > >
> > >
> > > If you want me to I can setup a single NIC solution again and run any
> > > commands you need me to do.
> > >
> > >
> > > FYI when I setup single NIC I use the guided installation option in the UI
> > > rather than manual setup which I do for the multiple NIC scenario.
> > >
> > >
> > > Happy to set it up if it helps.
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:23
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Ok, so that explains the log message. This is looking like a bug to me. It
> > > seems that in Zone wide the host state (when disconnected) is not being
> > > properly identified due to this NFS thing, and as a consequence it has a
> > > side effect in VM HA.
> > >
> > > We would need some inputs from guys that have advanced networking
> > > deployments and Zone wide storage.
> > >
> > > I do not see how the all in one NIC deployment scenario is working though.
> > > This method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
> > > simple: if there are no NFS storage pools found for a host's cluster, KVM
> > > hosts will be detected as "disconnected" and not down, with that warning
> > > message you noticed.
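> > >
> > > In SQL terms the check boils down to something like this (a sketch of the
> > > lookup, not the exact generated query):
> > >
> > > select count(*) from cloud.storage_pool
> > > where cluster_id = <host'sClusterId>
> > >   and pool_type = 'NetworkFilesystem'
> > >   and removed is null;
> > >
> > > If that count is zero the investigator gives up and returns Disconnected.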
> > >
> > > When you say "all in one NIC", is it an advanced network deployment where
> > > you put all traffic in a single network, or is it a basic networking that
> > > you are doing?
> > >
> > > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > zone wide.
> > > >
> > > >
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 10:04
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > What type of storage are you using? Zone wide? Or cluster "wide" storage?
> > > >
> > > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > > wrote:
> > > >
> > > > > Rafael
> > > > >
> > > > >
> > > > > Here is the output as requested -
> > > > >
> > > > >
> > > > >
> > > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > > +----+------+--------------------------------------+--------
> > > > > -----------+------+----------------+--------+------------+--
> > > > > ----------+----------------+--------------+-----------+-----
> > > > > ------------+---------------------+---------+-------------+-
> > > > > -------+-----------------------+-------+------------+-------
> > > > > --+---------------+
> > > > > | id | name | uuid                                 | pool_type
> > >  |
> > > > > port | data_center_id | pod_id | cluster_id | used_bytes |
> > > > capacity_bytes |
> > > > > host_address | user_info | path            | created             |
> > > > removed
> > > > > | update_time | status | storage_provider_name | scope |
> hypervisor |
> > > > > managed | capacity_iops |
> > > > > +----+------+--------------------------------------+--------
> > > > > -----------+------+----------------+--------+------------+--
> > > > > ----------+----------------+--------------+-----------+-----
> > > > > ------------+---------------------+---------+-------------+-
> > > > > -------+-----------------------+-------+------------+-------
> > > > > --+---------------+
> > > > > |  1 | ds1  | a234224f-05fb-3f4c-9b0f-c51ebdf9a601 |
> > > NetworkFilesystem |
> > > > > 2049 |              1 |   NULL |       NULL | 6059720704 |
> > > > 79133933568 |
> > > > > 172.30.5.2   | NULL      | /export/primary | 2018-06-05 13:45:01 |
> > NULL
> > > > > | NULL        | Up     | DefaultPrimary        | ZONE  | KVM
>   |
> > > > >  0 |          NULL |
> > > > > +----+------+--------------------------------------+--------
> > > > > -----------+------+----------------+--------+------------+--
> > > > > ----------+----------------+--------------+-----------+-----
> > > > > ------------+---------------------+---------+-------------+-
> > > > > -------+-----------------------+-------+------------+-------
> > > > > --+---------------+
> > > > > 1 row in set (0.00 sec)
> > > > >
> > > > > mysql>
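> > > > >
> > > > > Note the pool row above has scope ZONE and cluster_id NULL, so presumably
> > > > > the per-cluster lookup the investigator runs finds nothing:
> > > > >
> > > > > mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > > > Empty set (0.00 sec)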
> > > > >
> > > > > Do you think this problem is related to my NIC/bridge configuration or
> > > > > the way I am configuring the zone ?
> > > > >
> > > > > Jon
> > > > > ________________________________
> > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > Sent: 07 June 2018 06:45
> > > > > To: users
> > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > > Can you also post the result of:
> > > > > select * from cloud.storage_pool where removed is null
> > > > >
> > > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Dag.Sonstebo@shapeblue.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Jon,
> > > > > >
> > > > > > Still confused where your primary storage pools are – are you sure your
> > > > > > hosts are in cluster 1?
> > > > > >
> > > > > > Quick question just to make sure - assuming management/storage is on
> > > > > > the same NIC when I setup basic networking the physical network has the
> > > > > > management and guest icons already there and I just edit the KVM labels. If
> > > > > > I am running storage over management do I need to drag the storage icon to
> > > > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > > > management or does CS automatically just use the management NIC ie. I would
> > > > > > only need to drag the storage icon across in basic setup if I wanted it on
> > > > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > >
> > > > > > >> I would do both – set up your 2/3 physical networks, name isn’t that
> > > > > > important – but then drag the traffic types to the correct one and make
> > > > > > sure the labels are correct.
> > > > > > Regards,
> > > > > > Dag Sonstebo
> > > > > > Cloud Architect
> > > > > > ShapeBlue
> > > > > >
> > > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > > > > >
> > > > > >     Dag
> > > > > >
> > > > > >
> > > > > >     Do you mean check the pools with "Infrastructure -> Primary Storage"
> > > > > > and "Infrastructure -> Secondary Storage" within the UI ?
> > > > > >
> > > > > >
> > > > > >     If so Primary Storage has a state of UP, secondary storage does not
> > > > > > show a state as such so not sure where else to check it ?
> > > > > > show a state as such so not sure where else to check it ?
> > > > > >
> > > > > >
> > > > > >     Rerun of the command -
> > > > > >
> > > > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > > > >     Empty set (0.00 sec)
> > > > > >
> > > > > >     mysql>
> > > > > >
> > > > > >     I think it is something to do with my zone creation rather
> than
> > > the
> > > > > > NIC, bridge setup although I can post those if needed.
> > > > > >
> > > > > >     I may try to setup just the 2 NIC solution you mentioned
> > although
> > > > as
> > > > > I
> > > > > > say I had the same issue with that ie. host goes to "Altert"
> state
> > > and
> > > > > same
> > > > > > error messages.  The only time I can get it to go to "Down" state
> > is
> > > > when
> > > > > > it is all on the single NIC.
> > > > > >
> > > > > >     Quick question just to make sure - assuming management/storage is on
> > > > > > the same NIC when I setup basic networking the physical network has the
> > > > > > management and guest icons already there and I just edit the KVM labels. If
> > > > > > I am running storage over management do I need to drag the storage icon to
> > > > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > > > management or does CS automatically just use the management NIC ie. I would
> > > > > > only need to drag the storage icon across in basic setup if I wanted it on
> > > > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > >
> > > > > >     On the plus side I have been at this for so long now and done so many
> > > > > > rebuilds I could do it in my sleep now 😊
> > > > > >
> > > > > >
> > > > > >     ________________________________
> > > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > > >     Sent: 06 June 2018 12:28
> > > > > >     To: users@cloudstack.apache.org
> > > > > >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > >     Looks OK to me Jon.
> > > > > >
> > > > > >     The one thing that throws me is your storage pools – can you rerun
> > > > > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > > > > >
> > > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > > >
> > > > > >     Regards,
> > > > > >     Dag Sonstebo
> > > > > >     Cloud Architect
> > > > > >     ShapeBlue
> > > > > >
> > > > > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > > > > >
> > > > > >         Don't know whether this helps or not but I logged into the SSVM
> > > > > > and ran an ifconfig -
> > > > > >
> > > > > >
> > > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> 1500
> > > > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast
> > > > > > 169.254.255.255
> > > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000
> > (Ethernet)
> > > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > collisions
> > > > > 0
> > > > > >
> > > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> 1500
> > > > > >                 inet 172.30.3.34  netmask 255.255.255.192
> > broadcast
> > > > > > 172.30.3.63
> > > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000
> > (Ethernet)
> > > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > collisions
> > > > > 0
> > > > > >
> > > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> 1500
> > > > > >                 inet 172.30.4.86  netmask 255.255.255.128
> > broadcast
> > > > > > 172.30.4.127
> > > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000
> > (Ethernet)
> > > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > collisions
> > > > > 0
> > > > > >
> > > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> 1500
> > > > > >                 inet 172.30.5.14  netmask 255.255.255.240
> > broadcast
> > > > > > 172.30.5.15
> > > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000
> > (Ethernet)
> > > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > collisions
> > > > > 0
> > > > > >
> > > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > collisions
> > > > > 0
> > > > > >
> > > > > >
> > > > > >         so it has interfaces in both the management and the storage
> > > > > > subnets (as well as guest).
> > > > > >
> > > > > >
> > > > > >
> > > > > >         ________________________________
> > > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > > >         Sent: 06 June 2018 11:08
> > > > > >         To: users@cloudstack.apache.org
> > > > > >         Subject: Re: advanced networking with public IPs direct
> to
> > > VMs
> > > > > >
> > > > > >         Hi Rafael
> > > > > >
> > > > > >
> > > > > >         Thanks for the help, really appreciate it.
> > > > > >
> > > > > >
> > > > > >         So rerunning that command with all servers up -
> > > > > >
> > > > > >
> > > > > >
> > > > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > > > >         Empty set (0.00 sec)
> > > > > >
> > > > > >         mysql>
> > > > > >
> > > > > >
> > > > > >         As for the storage IP no I'm not setting it to be the management
> > > > > >         IP when I setup the zone but the output of the SQL command suggests that
> > > > > >         is what has happened.
> > > > > >
> > > > > >         As I said to Dag I am using a different subnet for storage ie.
> > > > > >
> > > > > >         172.30.3.0/26  - management subnet
> > > > > >         172.30.4.0/25 -  guest VM subnet
> > > > > >         172.30.5.0/28 - storage
> > > > > >
> > > > > >         the NFS server IP is 172.30.5.2
> > > > > >
> > > > > >         each compute node has 3 NICs with an IP from each subnet (I am
> > > > > >         assuming the management node only needs an IP in the management network ?)
> > > > > ?)
> > > > > >
> > > > > >         When I add the zone in the UI I have one physical network with
> > > > > >         management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > > > >         When I fill in the storage traffic page I use the range
> > > > > >         172.30.5.10 - 14 as free IPs as I exclude the ones already allocated
> > > > > >         to the compute nodes and the NFS server.
> > > > > >
> > > > > >         I think maybe I am doing something wrong in the UI setup but it is
> > > > > >         not obvious to me what it is.
> > > > > >
> > > > > >         What I might try today unless you want me to keep the setup I have
> > > > > >         for more outputs is to go back to 2 NICs, one for storage/management
> > > > > >         and one for guest VMs.
> > > > > >
> > > > > >         I think with the 2 NICs setup the mistake I made last time when
> > > > > >         adding the zone was to assume storage would just run over management so
> > > > > >         I did not drag and drop the storage icon and assign it to cloudbr0 as
> > > > > >         with the management which I think is what I should do ?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >         ________________________________
> > > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > > >         Sent: 06 June 2018 10:54
> > > > > >         To: users
> > > > > >         Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > >         Jon, do not panic we are here to help you :)
> > > > > >         So, I might have mistyped the SQL query. When you use "select * from
> > > > > >         cloud.storage_pool where cluster_id = 1 and removed is not null", you are
> > > > > >         listing the storage pools removed. Therefore, the right query would be
> > > > > >         "select * from cloud.storage_pool where cluster_id = 1 and removed is null"
> > > > > >
> > > > > >         There is also something else I do not understand. You are setting the
> > > > > >         storage IP in the management subnet? I am not sure if you should be doing
> > > > > >         it like this. Normally, I set all my storages (primary [when working with
> > > > > >         NFS] and secondary) to IPs in the storage subnet.
> > > > > >
> > > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Dag.Sonstebo@shapeblue.com>
> > > > > >         wrote:
> > > > > >
> > > > > >         > Hi John,
> > > > > >         >
> > > > > >         > I’m late to this thread and have possibly missed some things – but a
> > > > > >         > couple of observations:
> > > > > >         >
> > > > > >         > “When I add the zone and get to the storage web page I exclude the IPs
> > > > > >         > already used for the compute node NICs and the NFS server itself. …..”
> > > > > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > > >         > 172.30.5.10 -> 172.30.5.14.”
> > > > > >         >
> > > > > >         > I think you may have some confusion around the use of the storage network.
> > > > > >         > The important part here is to understand this is for *secondary storage*
> > > > > >         > use only – it has nothing to do with primary storage. This means this
> > > > > >         > storage network needs to be accessible to the SSVM, to the hypervisors,
> > > > > >         > and secondary storage NFS pools need to be accessible on this network.
> > > > > >         >
> > > > > >         > The important part – this also means you *can not use the same IP ranges
> > > > > >         > for management and storage networks* - doing so means you will have issues
> > > > > >         > where effectively both hypervisors and SSVM can see the same subnet on two
> > > > > >         > NICs – and you end up in a routing black hole.
> > > > > >         >
> > > > > >         > So – you need to either:
> > > > > >         >
> > > > > >         > 1) Use different IP subnets on management and storage, or
> > > > > >         > 2) preferably just simplify your setup – stop using a secondary storage
> > > > > >         > network altogether and just allow secondary storage to use the management
> > > > > >         > network (which is default). Unless you have a very high I/O environment in
> > > > > >         > production you are just adding complexity by running separate management
> > > > > >         > and storage.
> > > > > >         >
> > > > > >         > Regards,
> > > > > >         > Dag Sonstebo
> > > > > >         > Cloud Architect
> > > > > >         > ShapeBlue
> > > > > >         >
> > > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <jms.123@hotmail.co.uk> wrote:
> > > > > >         >
> > > > > >         >     I will disconnect the host this morning and test but before I do
> > > > > >         > that I ran this command when all hosts are up -
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >      select * from cloud.host;
> > > > > >         >     +----+-----------------+------
> > > > > ------------------------------
> > > > > >         > --+--------+------------------
> > > --+--------------------+------
> > > > > >         > -----------+------------------
> > > ---+--------------------+-----
> > > > > >         > ------------+-----------------
> > > ----+----------------------+--
> > > > > >         > ---------------------+--------
> > > -----------+------------+-----
> > > > > >         > --------------+---------------
> > > --+--------------------+------
> > > > > >         > ------+----------------+------
> > > --+-------------+------+------
> > > > > >         > -+----------------------------
> > > ---------+---------+----------
> > > > > >         > -------+--------------------+-
> > > -----------+----------+-------
> > > > > >         > ---+--------+------------+----
> > > ----------+-------------------
> > > > > >         > ------------------------------
> > > --------------+-----------+---
> > > > > >         > ----+-------------+-----------
> > > -+----------------+-----------
> > > > > >         > ----------+-------------------
> > > --+---------+--------------+--
> > > > > >         > --------------+-------+-------------+--------------+
> > > > > >         >     | id | name            | uuid
> > > > > >  | status
> > > > > >         > | type               | private_ip_address |
> > > private_netmask |
> > > > > >         > private_mac_address | storage_ip_address |
> > storage_netmask
> > > |
> > > > > >         > storage_mac_address | storage_ip_address_2 |
> > > > > > storage_mac_address_2 |
> > > > > >         > storage_netmask_2 | cluster_id | public_ip_address |
> > > > > > public_netmask  |
> > > > > >         > public_mac_address | proxy_port | data_center_id |
> > pod_id |
> > > > > > cpu_sockets |
> > > > > >         > cpus | speed | url                                 |
> > > fs_type
> > > > |
> > > > > >         > hypervisor_type | hypervisor_version | ram        |
> > > resource
> > > > |
> > > > > > version  |
> > > > > >         > parent | total_size | capabilities | guid
> > > > > >         >                         | available | setup |
> > dom0_memory |
> > > > > > last_ping  |
> > > > > >         > mgmt_server_id | disconnected        | created
> > >  |
> > > > > > removed |
> > > > > >         > update_count | resource_state | owner | lastUpdated |
> > > > > > engine_state |
> > > > > >         >     +----+-----------------+------
> > > > > ------------------------------
> > > > > >         > --+--------+------------------
> > > --+--------------------+------
> > > > > >         > -----------+------------------
> > > ---+--------------------+-----
> > > > > >         > ------------+-----------------
> > > ----+----------------------+--
> > > > > >         > ---------------------+--------
> > > -----------+------------+-----
> > > > > >         > --------------+---------------
> > > --+--------------------+------
> > > > > >         > ------+----------------+------
> > > --+-------------+------+------
> > > > > >         > -+----------------------------
> > > ---------+---------+----------
> > > > > >         > -------+--------------------+-
> > > -----------+----------+-------
> > > > > >         > ---+--------+------------+----
> > > ----------+-------------------
> > > > > >         > ------------------------------
> > > --------------+-----------+---
> > > > > >         > ----+-------------+-----------
> > > -+----------------+-----------
> > > > > >         > ----------+-------------------
> > > --+---------+--------------+--
> > > > > >         > --------------+-------+-------------+--------------+
> > > > > >         >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-
> > > > > eabd60012284
> > > > > > | Up
> > > > > >         >  | Routing            | 172.30.3.3         |
> > > 255.255.255.192
> > > > |
> > > > > >         > 00:22:19:92:4e:34   | 172.30.3.3         |
> > 255.255.255.192
> > > |
> > > > > >         > 00:22:19:92:4e:34   | NULL                 | NULL
> > > > > >   | NULL
> > > > > >         >             |          1 | 172.30.4.3        |
> > > > 255.255.255.128
> > > > > |
> > > > > >         > 00:22:19:92:4e:35  |       NULL |              1 |
> > 1 |
> > > > > >      1 |
> > > > > >         >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 |
> NULL
> > > > |
> > > > > > KVM
> > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > 4.11.0.0
> > > > |
> > > > > > NULL   |
> > > > > >         >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-
> > > > f83e7a5e8efb-
> > > > > > LibvirtComputingResource
> > > > > >         > |         1 |     0 |           0 | 1492390408 |
> > > > >  146457912294 |
> > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |
> > > > > > 4 |
> > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > >         >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-
> > > > > 3f5e2c2215b0
> > > > > > | Up
> > > > > >         >  | ConsoleProxy       | 172.30.3.49        |
> > > 255.255.255.192
> > > > |
> > > > > >         > 1e:00:80:00:00:14   | 172.30.3.49        |
> > 255.255.255.192
> > > |
> > > > > >         > 1e:00:80:00:00:14   | NULL                 | NULL
> > > > > >   | NULL
> > > > > >         >             |       NULL | 172.30.4.98       |
> > > > 255.255.255.128
> > > > > |
> > > > > >         > 1e:00:c9:00:00:5f  |       NULL |              1 |
> > 1 |
> > > > > >   NULL |
> > > > > >         > NULL |  NULL | NoIqn                               |
> NULL
> > > > |
> > > > > > NULL
> > > > > >         >     | NULL               |          0 | NULL     |
> > > 4.11.0.0 |
> > > > > > NULL   |
> > > > > >         >  NULL | NULL         | Proxy.2-ConsoleProxyResource
> > > > > >         >           |         1 |     0 |           0 |
> 1492390409
> > |
> > > > > >  146457912294 |
> > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |
> > > > > > 7 |
> > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > >         >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-
> > > > > ff3845bb556c
> > > > > > | Up
> > > > > >         >  | SecondaryStorageVM | 172.30.3.34        |
> > > 255.255.255.192
> > > > |
> > > > > >         > 1e:00:3b:00:00:05   | 172.30.3.34        |
> > 255.255.255.192
> > > |
> > > > > >         > 1e:00:3b:00:00:05   | NULL                 | NULL
> > > > > >   | NULL
> > > > > >         >             |       NULL | 172.30.4.86       |
> > > > 255.255.255.128
> > > > > |
> > > > > >         > 1e:00:d9:00:00:53  |       NULL |              1 |
> > 1 |
> > > > > >   NULL |
> > > > > >         > NULL |  NULL | NoIqn                               |
> NULL
> > > > |
> > > > > > NULL
> > > > > >         >     | NULL               |          0 | NULL     |
> > > 4.11.0.0 |
> > > > > > NULL   |
> > > > > >         >  NULL | NULL         | s-1-VM-
> NfsSecondaryStorageResource
> > > > > >         >             |         1 |     0 |           0 |
> > 1492390407
> > > |
> > > > > >  146457912294
> > > > > >         > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |
> > > > > >   7 |
> > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > >         >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-
> > > > > 1a96ffac9794
> > > > > > | Up
> > > > > >         >  | Routing            | 172.30.3.4         |
> > > 255.255.255.192
> > > > |
> > > > > >         > 00:26:b9:4a:97:7d   | 172.30.3.4         |
> > 255.255.255.192
> > > |
> > > > > >         > 00:26:b9:4a:97:7d   | NULL                 | NULL
> > > > > >   | NULL
> > > > > >         >             |          1 | 172.30.4.4        |
> > > > 255.255.255.128
> > > > > |
> > > > > >         > 00:26:b9:4a:97:7e  |       NULL |              1 |
> > 1 |
> > > > > >      1 |
> > > > > >         >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 |
> NULL
> > > > |
> > > > > > KVM
> > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > 4.11.0.0
> > > > |
> > > > > > NULL   |
> > > > > >         >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-
> > > > 16d0f99b11c9-
> > > > > > LibvirtComputingResource
> > > > > >         > |         1 |     0 |           0 | 1492450882 |
> > > > >  146457912294 |
> > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |
> > > > > > 8 |
> > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > >         >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-
> > > > > ee29d2b5c274
> > > > > > | Up
> > > > > >         >  | Routing            | 172.30.3.5         |
> > > 255.255.255.192
> > > > |
> > > > > >         > 00:24:e8:73:6a:b2   | 172.30.3.5         |
> > 255.255.255.192
> > > |
> > > > > >         > 00:24:e8:73:6a:b2   | NULL                 | NULL
> > > > > >   | NULL
> > > > > >         >             |          1 | 172.30.4.5        |
> > > > 255.255.255.128
> > > > > |
> > > > > >         > 00:24:e8:73:6a:b3  |       NULL |              1 |
> > 1 |
> > > > > >      1 |
> > > > > >         >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c |
> NULL
> > > > |
> > > > > > KVM
> > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > 4.11.0.0
> > > > |
> > > > > > NULL   |
> > > > > >         >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-
> > > > 37f3eebad8fb-
> > > > > > LibvirtComputingResource
> > > > > >         > |         1 |     0 |           0 | 1492390408 |
> > > > >  146457912294 |
> > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |
> > > > > > 6 |
> > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > >         >     +----+-----------------+------
> > > > > ------------------------------
> > > > > >         > --+--------+------------------
> > > --+--------------------+------
> > > > > >         > -----------+------------------
> > > ---+--------------------+-----
> > > > > >         > ------------+-----------------
> > > ----+----------------------+--
> > > > > >         > ---------------------+--------
> > > -----------+------------+-----
> > > > > >         > --------------+---------------
> > > --+--------------------+------
> > > > > >         > ------+----------------+------
> > > --+-------------+------+------
> > > > > >         > -+----------------------------
> > > ---------+---------+----------
> > > > > >         > -------+--------------------+-
> > > -----------+----------+-------
> > > > > >         > ---+--------+------------+----
> > > ----------+-------------------
> > > > > >         > ------------------------------
> > > --------------+-----------+---
> > > > > >         > ----+-------------+-----------
> > > -+----------------+-----------
> > > > > >         > ----------+-------------------
> > > --+---------+--------------+--
> > > > > >         > --------------+-------+-------------+--------------+
> > > > > >         >     5 rows in set (0.00 sec)
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     and you can see that it says the storage IP address is the
> > > > > >         > same as the private IP address (the management network).
> > > > > >         >
> > > > > >         >
> > > > > >         >     I also ran the command you provided using the Cluster ID
> > > > > >         > number from the table above -
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
> > > > > >         >     Empty set (0.00 sec)
> > > > > >         >
> > > > > >         >     mysql>
> > > > > >         >
> > > > > >         >     So assuming I am reading this correctly that seems to be the issue.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I am at a loss as to why though.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I have a separate NIC for storage as described. When I add the zone
> > > > > >         > and get to the storage web page I exclude the IPs already used for the
> > > > > >         > compute node NICs and the NFS server itself. I do this because initially I
> > > > > >         > didn't and the SSVM started using the IP address of the NFS server.
> > > > > >         >
> > > > > >         >
> > > > > >         >     So the range is 172.30.5.1 -> 15 and the range I
> > > > > >         >     fill in is 172.30.5.10 -> 172.30.5.14.
> > > > > >         >
> > > > > >         >
> > > > > >         >     And I used the label "cloudbr2" for storage.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I must be doing this wrong somehow.
> > > > > >         >
> > > > > >         >
> > > > > >         >     Any pointers would be much appreciated.
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     ________________________________
> > > > > >         >     From: Rafael Weingärtner <
> > rafaelweingartner@gmail.com>
> > > > > >         >     Sent: 05 June 2018 16:13
> > > > > >         >     To: users
> > > > > >         >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >         >
> > > > > >         >     That is interesting. Let's see the source of all
> > > > > >         >     truth... This is the code that is generating that
> > > > > >         >     odd message.
> > > > > >         >
> > > > > >         >     >     List<StoragePoolVO> clusterPools =
> > > > > >         >     >             _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > > > >         >     >     boolean hasNfs = false;
> > > > > >         >     >     for (StoragePoolVO pool : clusterPools) {
> > > > > >         >     >         if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > > > >         >     >             hasNfs = true;
> > > > > >         >     >             break;
> > > > > >         >     >         }
> > > > > >         >     >     }
> > > > > >         >     >     if (!hasNfs) {
> > > > > >         >     >         s_logger.warn("Agent investigation was requested on host " + agent
> > > > > >         >     >                 + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > > > >         >     >         return Status.Disconnected;
> > > > > >         >     >     }
> > > > > >         >     >
> > > > > >         >
> > > > > >         >     There are two possibilities here. You do not
> > > > > >         >     have any NFS storage - is that the case? Or
> > > > > >         >     maybe, for some reason, the call
> > > > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())"
> > > > > >         >     is not returning any NFS storage pools. Looking
> > > > > >         >     at "listPoolsByCluster" we will see that the
> > > > > >         >     following SQL is used:
> > > > > >         >
> > > > > >         >     > Select * from storage_pool where cluster_id =
> > > > > >         >     > <host'sClusterId> and removed is not null
> > > > > >         >
> > > > > >         >     Can you run that SQL to see its return when your
> > > > > >         >     hosts are marked as disconnected?
> > > > > >         >
> > > > > >         >
> > > > > >         > Dag.Sonstebo@shapeblue.com
> > > > > >         > www.shapeblue.com<http://www.shapeblue.com>
> > > > > >
> > > > > >
> > > > > >
> > > > > >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > >         > @shapeblue
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > > > > > jms.123@hotmail.co.uk>
> > > > > >         > wrote:
> > > > > >         >
> > > > > >         >     > I reran the tests with the 3 NIC setup. When I
> > > > > >         >     > configured the zone through the UI I used the
> > > > > >         >     > labels cloudbr0 for management, cloudbr1 for
> > > > > >         >     > guest traffic and cloudbr2 for NFS, as per my
> > > > > >         >     > original response to you.
> > > > > >         >     >
> > > > > >         >     > When I pull the power to the node
> > > > > >         >     > (dcp-cscn2.local), after about 5 mins the host
> > > > > >         >     > status goes to "Alert" but never to "Down".
> > > > > >         >     >
> > > > > >         >     > I get this in the logs -
> > > > > >         >     >
> > > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > > > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > > > > >         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> > > > > >         >     >
> > > > > >         >     > I don't understand why it thinks there is no
> > > > > >         >     > NFS storage, as each compute node has a
> > > > > >         >     > dedicated storage NIC.
> > > > > >         >     >
> > > > > >         >     > I also don't understand why it thinks the host
> > > > > >         >     > is still up, i.e. what test is it doing to
> > > > > >         >     > determine that ?
> > > > > >         >     >
> > > > > >         >     > Am I just trying to get something working that
> > > > > >         >     > is not supported ?
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > ________________________________
> > > > > >         >     > From: Rafael Weingärtner <
> > > rafaelweingartner@gmail.com>
> > > > > >         >     > Sent: 04 June 2018 15:31
> > > > > >         >     > To: users
> > > > > >         >     > Subject: Re: advanced networking with public IPs
> > > direct
> > > > > to
> > > > > > VMs
> > > > > >         >     >
> > > > > >         >     > What type of failover are you talking about?
> > > > > >         >     > What ACS version are you using?
> > > > > >         >     > What hypervisor are you using?
> > > > > >         >     > How are you configuring your NICs in the hypervisor?
> > > > > >         >     > How are you configuring the traffic labels in ACS?
> > > > > >         >     >
> > > > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > > > > > jms.123@hotmail.co.uk
> > > > > >         > >
> > > > > >         >     > wrote:
> > > > > >         >     >
> > > > > >         >     > > Hi all
> > > > > >         >     > >
> > > > > >         >     > > I am close to giving up on basic networking
> > > > > >         >     > > as I just cannot get failover working with
> > > > > >         >     > > multiple NICs (I am not even sure it is
> > > > > >         >     > > supported).
> > > > > >         >     > >
> > > > > >         >     > > What I would like is to use 3 NICs for
> > > > > >         >     > > management, storage and guest traffic. I
> > > > > >         >     > > would like to assign public IPs direct to
> > > > > >         >     > > the VMs, which is why I originally chose
> > > > > >         >     > > basic.
> > > > > >         >     > >
> > > > > >         >     > > If I switch to advanced networking do I just
> > > > > >         >     > > configure a guest VM with public IPs on one
> > > > > >         >     > > NIC and not bother with the public traffic -
> > > > > >         >     > >
> > > > > >         >     > > would this work ?
> > > > > >         >     > >
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > --
> > > > > >         >     > Rafael Weingärtner
> > > > > >         >     >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     --
> > > > > >         >     Rafael Weingärtner
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >
> > > > > >
> > > > > >         --
> > > > > >         Rafael Weingärtner
> > > > > >
> > > > > >
> > > > > >
> > > > > >     Dag.Sonstebo@shapeblue.com
> > > > > >     www.shapeblue.com<http://www.shapeblue.com>
> > > > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > >     @shapeblue
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rafael Weingärtner
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
 I don't have a github account but can set one up and file a report
sometime this week if that is okay?
Of course, feel free to do it when you have time.

1) if I run management and storage over the same interface, the NFS server
can still be on a different subnet than the management subnet, i.e. the NFS
server does not have to have an IP from the management subnet?
Yes, as long as your KVM hosts are able to access the storage IP you
should be fine.

2) If I add another cluster can I just create a different NFS share from
the same server?
Sure, you can do that.
Then you will need to introduce this new folder/storage to ACS.
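
A quick way to sanity check afterwards (just a sketch - "/export/primary2"
is an assumed name for the new share):

    select id, name, scope, cluster_id, host_address, path
    from cloud.storage_pool
    where removed is null;
    -- expect one active row per share, e.g. /export/primary for the
    -- existing cluster and /export/primary2 for the new one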


On Mon, Jun 11, 2018 at 12:24 PM, Jon Marshall <jm...@hotmail.co.uk>
wrote:

> Hi Rafael
>
>
> I don't have a github account but can set one up and file a report
> sometime this week if that is okay?
>
>
> No problem with the testing and thanks for the help.
>
>
> Before I leave this, a couple of questions if I use NFS in cluster mode -
>
>
> 1) if I run management and storage over the same interface, the NFS server
> can still be on a different subnet than the management subnet, i.e. the NFS
> server does not have to have an IP from the management subnet ?
>
>
> 2) If I add another cluster can I just create a different NFS share from
> the same server ?
>
>
> Finally many thanks to you and the others for the help provided.
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 11 June 2018 10:58
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Well, it seems that you have found a bug. Can you fill out an issue report
> on Github?
>
> Thanks for the hard work on debugging and testing.
>
> On Fri, Jun 8, 2018 at 2:17 PM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > So based on Erik's suggestion (thanks Erik) I rebuilt the management
> > server and set up cluster-wide primary storage, as opposed to the
> > zone-wide storage I have been using so far.
> >
> >
> > Still using 3 NICs (management/Guest/storage) and basic networking.
> >
> >
> > And VM HA now works. In addition it failed over quicker than it did
> > when I had zone-wide NFS storage on a single NIC.
> >
> >
> > Still a bit confused by this output, which still shows the
> > storage_ip_address as 172.30.3.x (the management subnet) - but maybe
> > I am reading it incorrectly.
> >
> >
> >
> > mysql> select * from cloud.host;
> >
> > (trimmed to the columns relevant here - the full rows also carry MACs,
> > public IPs, iqn, capacity, version and timestamp columns)
> >
> > +----+-----------------+--------+--------------------+--------------------+--------------------+------------+
> > | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id |
> > +----+-----------------+--------+--------------------+--------------------+--------------------+------------+
> > |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 |
> > |  2 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.55        | 172.30.3.55        |       NULL |
> > |  3 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.40        | 172.30.3.40        |       NULL |
> > |  4 | dcp-cscn2.local | Down   | Routing            | 172.30.3.4         | 172.30.3.4         |          1 |
> > |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 |
> > +----+-----------------+--------+--------------------+--------------------+--------------------+------------+
> > 5 rows in set (0.00 sec)
> >
> > mysql>
> >
> > So some sort of bug, maybe?
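> >
> > A narrower query makes the comparison easier to eyeball (a sketch over
> > the same data):
> >
> >     select name, private_ip_address, storage_ip_address, cluster_id
> >     from cloud.host where type = 'Routing';
> >
> > For KVM hosts the agent seems to report a single address that ends up
> > in both columns, so this may be cosmetic rather than the actual bug.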
> >
> > ________________________________
> > From: Erik Weber <te...@gmail.com>
> > Sent: 08 June 2018 10:15
> > To: users@cloudstack.apache.org
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > While someone ponders the zone-wide storage question, you could try
> > adding a cluster-wide NFS storage and see if the rest works in that
> > setup.
> >
> > Erik
> >
> > On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > Yes, all basic. I read a Shapeblue doc that recommended splitting
> > > traffic across multiple NICs even in basic networking mode, so that
> > > is what I am trying to do.
> > >
> > >
> > > With a single NIC you do not get the NFS storage message.
> > >
> > >
> > > I have the entire management server logs for both scenarios after I
> > > pulled the power to one of the compute nodes, but from the single NIC
> > > setup these seem to be the relevant lines -
> > >
> > >
> > > 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4 with status Down
> > > 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> > > 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down, name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> > > 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator] (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host with specified id is not in the right state: Down
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5 returned status:Down for the investigated host:4
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible legacy state Down for host 4
> > > 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to determine host 4 is in Down
> > > 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state determined is Down
> > > 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down: 4-dcp-cscn2.local. Starting HA on the VMs
> > >
> > > At the moment I only need to assign public IPs direct to VMs rather
> > > than using NAT with the virtual router, but would be happy to go with
> > > advanced networking if it would make things easier :)
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:35
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Ah so, it is not an advanced setup; even when you use multiple NICs.
> > > Can you confirm that the message "Agent investigation was requested
> > > on host, but host does not support investigation because it has no
> > > NFS storage. Skipping investigation." does not appear when you use a
> > > single NIC? Can you check other log entries that might appear when
> > > the host is marked as "down"?
> > >
> > > On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > It is all basic networking at the moment for all the setups.
> > > >
> > > >
> > > > If you want me to I can set up a single NIC solution again and run
> > > > any commands you need me to.
> > > >
> > > >
> > > > FYI, when I set up a single NIC I use the guided installation
> > > > option in the UI rather than the manual setup I use for the
> > > > multiple NIC scenario.
> > > >
> > > >
> > > > Happy to set it up if it helps.
> > > >
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 10:23
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > Ok, so that explains the log message. This is looking like a bug
> > > > to me. It seems that with zone-wide storage the host state (when
> > > > disconnected) is not being properly identified due to this NFS
> > > > thing, and as a consequence it has a side effect on VM HA.
> > > >
> > > > We would need some input from people who have advanced networking
> > > > deployments and zone-wide storage.
> > > >
> > > > I do not see how the all-in-one-NIC deployment scenario is
> > > > working, though. The method
> > > > "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead simple:
> > > > if no NFS storage pools are found for a host's cluster, KVM hosts
> > > > will be detected as "disconnected", not "down", with that warning
> > > > message you noticed.
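> > > >
> > > > In SQL terms, that check boils down to (a sketch):
> > > >
> > > >     select * from cloud.storage_pool
> > > >     where cluster_id = <host's cluster id> and removed is null;
> > > >
> > > > A zone-wide pool is stored with cluster_id = NULL, so for your
> > > > setup this comes back empty and the investigator can only ever
> > > > answer "Disconnected", never "Down".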
> > > >
> > > > When you say "all in one NIC", is it an advanced network
> > > > deployment where you put all traffic in a single network, or is it
> > > > basic networking that you are doing?
> > > >
> > > > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > > wrote:
> > > >
> > > > > zone wide.
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > Sent: 07 June 2018 10:04
> > > > > To: users
> > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > > What type of storage are you using? Zone-wide? Or cluster-wide
> > > > > storage?
> > > > >
> > > > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <
> jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > >
> > > > > > Rafael
> > > > > >
> > > > > >
> > > > > > Here is the output as requested -
> > > > > >
> > > > > >
> > > > > >
> > > > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > > >
> > > > > > (single row, shown vertically for readability)
> > > > > >
> > > > > >                    id: 1
> > > > > >                  name: ds1
> > > > > >                  uuid: a234224f-05fb-3f4c-9b0f-c51ebdf9a601
> > > > > >             pool_type: NetworkFilesystem
> > > > > >                  port: 2049
> > > > > >        data_center_id: 1
> > > > > >                pod_id: NULL
> > > > > >            cluster_id: NULL
> > > > > >            used_bytes: 6059720704
> > > > > >        capacity_bytes: 79133933568
> > > > > >          host_address: 172.30.5.2
> > > > > >                  path: /export/primary
> > > > > >               created: 2018-06-05 13:45:01
> > > > > >                status: Up
> > > > > > storage_provider_name: DefaultPrimary
> > > > > >                 scope: ZONE
> > > > > >            hypervisor: KVM
> > > > > >
> > > > > > 1 row in set (0.00 sec)
> > > > > >
> > > > > > mysql>
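> > > > > >
> > > > > > Narrowed to the two columns that stand out (sketch):
> > > > > >
> > > > > >     select scope, cluster_id from cloud.storage_pool
> > > > > >     where removed is null;
> > > > > >     -- returns ZONE, NULL here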
> > > > > >
> > > > > > Do you think this problem is related to my NIC/bridge
> > > > > > configuration or the way I am configuring the zone ?
> > > > > >
> > > > > > Jon
> > > > > > ________________________________
> > > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > > Sent: 07 June 2018 06:45
> > > > > > To: users
> > > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > > Can you also post the result of:
> > > > > > select * from cloud.storage_pool where removed is null
> > > > > >
> > > > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> > > > Dag.Sonstebo@shapeblue.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Jon,
> > > > > > >
> > > > > > > Still confused where your primary storage pools are – are
> > > > > > > you sure your hosts are in cluster 1?
> > > > > > >
> > > > > > > Quick question just to make sure - assuming
> > > > > > > management/storage is on the same NIC: when I set up basic
> > > > > > > networking the physical network has the management and guest
> > > > > > > icons already there and I just edit the KVM labels. If I am
> > > > > > > running storage over management do I need to drag the
> > > > > > > storage icon to the physical network and use the same KVM
> > > > > > > label (cloudbr0) as the management, or does CS automatically
> > > > > > > just use the management NIC, i.e. I would only need to drag
> > > > > > > the storage icon across in a basic setup if I wanted it on a
> > > > > > > different NIC/IP subnet ? (hope that makes sense !)
> > > > > > >
> > > > > > > >> I would do both – set up your 2/3 physical networks, the
> > > > > > > name isn’t that important – but then drag the traffic types
> > > > > > > to the correct one and make sure the labels are correct.
> > > > > > > Regards,
> > > > > > > Dag Sonstebo
> > > > > > > Cloud Architect
> > > > > > > ShapeBlue
> > > > > > >
> > > > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk>
> > > wrote:
> > > > > > >
> > > > > > >     Dag
> > > > > > >
> > > > > > >
> > > > > > >     Do you mean check the pools with "Infrastructure ->
> > > > > > >     Primary Storage" and "Infrastructure -> Secondary
> > > > > > >     Storage" within the UI ?
> > > > > > >
> > > > > > >
> > > > > > >     If so, Primary Storage has a state of Up; secondary
> > > > > > >     storage does not show a state as such, so I am not sure
> > > > > > >     where else to check it ?
> > > > > > >
> > > > > > >
> > > > > > >     Rerun of the command -
> > > > > > >
> > > > > > >     mysql> select * from cloud.storage_pool where cluster_id =
> 1;
> > > > > > >     Empty set (0.00 sec)
> > > > > > >
> > > > > > >     mysql>
> > > > > > >
> > > > > > >     I think it is something to do with my zone creation
> > > > > > >     rather than the NIC/bridge setup, although I can post
> > > > > > >     those if needed.
> > > > > > >
> > > > > > >     I may try to set up just the 2 NIC solution you
> > > > > > >     mentioned, although as I say I had the same issue with
> > > > > > >     that, i.e. the host goes to "Alert" state with the same
> > > > > > >     error messages. The only time I can get it to go to the
> > > > > > >     "Down" state is when it is all on the single NIC.
> > > > > > >
> > > > > > >     Quick question just to make sure - assuming
> > > > > > >     management/storage is on the same NIC: when I set up
> > > > > > >     basic networking the physical network has the management
> > > > > > >     and guest icons already there and I just edit the KVM
> > > > > > >     labels. If I am running storage over management do I
> > > > > > >     need to drag the storage icon to the physical network
> > > > > > >     and use the same KVM label (cloudbr0) as the management,
> > > > > > >     or does CS automatically just use the management NIC,
> > > > > > >     i.e. I would only need to drag the storage icon across
> > > > > > >     in a basic setup if I wanted it on a different NIC/IP
> > > > > > >     subnet ?  (hope that makes sense !)
> > > > > > >
> > > > > > >     On the plus side, I have been at this for so long now
> > > > > > >     and done so many rebuilds that I could do it in my sleep
> > > > > > >     now 😊
> > > > > > >
> > > > > > >
> > > > > > >     ________________________________
> > > > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > > > >     Sent: 06 June 2018 12:28
> > > > > > >     To: users@cloudstack.apache.org
> > > > > > >     Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > > > >
> > > > > > >     Looks OK to me Jon.
> > > > > > >
> > > > > > >     The one thing that throws me is your storage pools –
> > > > > > >     can you rerun your query: select * from
> > > > > > >     cloud.storage_pool where cluster_id = 1;
> > > > > > >
> > > > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > > > >
> > > > > > >     Regards,
> > > > > > >     Dag Sonstebo
> > > > > > >     Cloud Architect
> > > > > > >     ShapeBlue
> > > > > > >
> > > > > > >     On 06/06/2018, 12:08, "Jon Marshall" <
> jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > > > >
> > > > > > >         Don't know whether this helps or not, but I logged
> > > > > > >         into the SSVM and ran an ifconfig -
> > > > > > >
> > > > > > >
> > > > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> > 1500
> > > > > > >                 inet 169.254.3.35  netmask 255.255.0.0
> broadcast
> > > > > > > 169.254.255.255
> > > > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000
> > > (Ethernet)
> > > > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> > 1500
> > > > > > >                 inet 172.30.3.34  netmask 255.255.255.192
> > > broadcast
> > > > > > > 172.30.3.63
> > > > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000
> > > (Ethernet)
> > > > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> > 1500
> > > > > > >                 inet 172.30.4.86  netmask 255.255.255.128
> > > broadcast
> > > > > > > 172.30.4.127
> > > > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000
> > > (Ethernet)
> > > > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu
> > 1500
> > > > > > >                 inet 172.30.5.14  netmask 255.255.255.240
> > > broadcast
> > > > > > > 172.30.5.15
> > > > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000
> > > (Ethernet)
> > > > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > > > collisions
> > > > > > 0
> > > > > > >
> > > > > > >
> > > > > > >         so it has interfaces in both the management and the
> > > > > > >         storage subnets (as well as guest).
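> > > > > > >
> > > > > > >         If you want to cross-check those against what ACS
> > > > > > >         allocated, something along these lines should line
> > > > > > >         up (a sketch - substitute the SSVM's id from
> > > > > > >         cloud.vm_instance):
> > > > > > >
> > > > > > >         select ip4_address, netmask, mac_address
> > > > > > >         from cloud.nics
> > > > > > >         where instance_id = <ssvm id> and removed is null;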
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         ________________________________
> > > > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > > > >         Sent: 06 June 2018 11:08
> > > > > > >         To: users@cloudstack.apache.org
> > > > > > >         Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > > > > >
> > > > > > >         Hi Rafael
> > > > > > >
> > > > > > >
> > > > > > >         Thanks for the help, really appreciate it.
> > > > > > >
> > > > > > >
> > > > > > >         So rerunning that command with all servers up -
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > > > > >         Empty set (0.00 sec)
> > > > > > >
> > > > > > >         mysql>
> > > > > > >
> > > > > > >
> > > > > > >         As for the storage IP - no, I'm not setting it to
> > > > > > >         be the management IP when I set up the zone, but the
> > > > > > >         output of the SQL command suggests that is what has
> > > > > > >         happened.
> > > > > > >
> > > > > > >         As I said to Dag, I am using a different subnet for
> > > > > > >         storage, i.e.
> > > > > > >
> > > > > > >         172.30.3.0/26  - management subnet
> > > > > > >         172.30.4.0/25 -  guest VM subnet
> > > > > > >         172.30.5.0/28 - storage
> > > > > > >
> > > > > > >         the NFS server IP is 172.30.5.2
> > > > > > >
> > > > > > >         each compute node has 3 NICs with an IP from each
> > > > > > >         subnet (I am assuming the management node only needs
> > > > > > >         an IP in the management network?)
> > > > > > >
> > > > > > >         When I add the zone in the UI I have one physical
> > > > > > >         network with management (cloudbr0), guest (cloudbr1)
> > > > > > >         and storage (cloudbr2).
> > > > > > >         When I fill in the storage traffic page I use the
> > > > > > >         range 172.30.5.10 - 14 as free IPs, as I exclude the
> > > > > > >         ones already allocated to the compute nodes and the
> > > > > > >         NFS server.
> > > > > > >
> > > > > > >         I think maybe I am doing something wrong in the UI
> > > > > > >         setup, but it is not obvious to me what it is.
> > > > > > >
> > > > > > >         What I might try today, unless you want me to keep
> > > > > > >         the setup I have for more outputs, is to go back to
> > > > > > >         2 NICs: one for storage/management and one for guest
> > > > > > >         VMs.
> > > > > > >
> > > > > > >         I think with the 2 NIC setup the mistake I made last
> > > > > > >         time when adding the zone was to assume storage
> > > > > > >         would just run over management, so I did not drag
> > > > > > >         and drop the storage icon and assign it to cloudbr0
> > > > > > >         as with the management - which I think is what I
> > > > > > >         should do ?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >         ________________________________
> > > > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > > > >         Sent: 06 June 2018 10:54
> > > > > > >         To: users
> > > > > > >         Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > > > > >
> > > > > > >         Jon, do not panic we are here to help you :)
> > > > > > >         So, I might have mistyped the SQL query. If you use
> > > > > > >         "select * from cloud.storage_pool where cluster_id
> > > > > > >         = 1 and removed is not null", you are listing the
> > > > > > >         storage pools that have been removed. Therefore, the
> > > > > > >         right query would be "select * from
> > > > > > >         cloud.storage_pool where cluster_id = 1 and removed
> > > > > > >         is null".
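> > > > > > >
> > > > > > >         Side by side (just to spell the predicates out):
> > > > > > >
> > > > > > >         -- removed IS NOT NULL -> only pools that were deleted
> > > > > > >         -- removed IS NULL     -> pools that are still active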
> > > > > > >
> > > > > > >         There is also something else I do not understand.
> > > > > > >         You are setting the storage IP in the management
> > > > > > >         subnet? I am not sure you should be doing it like
> > > > > > >         this. Normally, I set all my storages (primary [when
> > > > > > >         working with NFS] and secondary) to IPs in the
> > > > > > >         storage subnet.
> > > > > > >
> > > > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > > > > Dag.Sonstebo@shapeblue.com>
> > > > > > >         wrote:
> > > > > > >
> > > > > > >         > Hi Jon,
> > > > > > >         >
> > > > > > >         > I’m late to this thread and have possibly missed
> > > > > > >         > some things – but a couple of observations:
> > > > > > >         >
> > > > > > >         > “When I add the zone and get to the storage web
> > > > > > >         > page I exclude the IPs already used for the
> > > > > > >         > compute node NICs and the NFS server itself. …..”
> > > > > > >         > “So the range is 172.30.5.1 -> 15 and the range I
> > > > > > >         > fill in is 172.30.5.10 -> 172.30.5.14.”
> > > > > > >         >
> > > > > > >         > I think you may have some confusion around the
> > > > > > >         > use of the storage network. The important part
> > > > > > >         > here is to understand this is for *secondary
> > > > > > >         > storage* use only – it has nothing to do with
> > > > > > >         > primary storage. This means this storage network
> > > > > > >         > needs to be accessible to the SSVM and to the
> > > > > > >         > hypervisors, and secondary storage NFS pools need
> > > > > > >         > to be accessible on this network.
> > > > > > >         >
> > > > > > >         > The important part – this also means you *cannot
> > > > > > >         > use the same IP ranges for management and storage
> > > > > > >         > networks* - doing so means you will have issues
> > > > > > >         > where effectively both the hypervisors and the
> > > > > > >         > SSVM can see the same subnet on two NICs – and you
> > > > > > >         > end up in a routing black hole.
> > > > > > >         >
> > > > > > >         > So – you need to either:
> > > > > > >         >
> > > > > > >         > 1) Use different IP subnets on management and
> > > > > > >         > storage, or
> > > > > > >         > 2) preferably just simplify your setup – stop
> > > > > > >         > using a secondary storage network altogether and
> > > > > > >         > just allow secondary storage to use the management
> > > > > > >         > network (which is the default). Unless you have a
> > > > > > >         > very high I/O environment in production you are
> > > > > > >         > just adding complexity by running separate
> > > > > > >         > management and storage networks.
> > > > > > >         >
> > > > > > >         > Regards,
> > > > > > >         > Dag Sonstebo
> > > > > > >         > Cloud Architect
> > > > > > >         > ShapeBlue
> > > > > > >         >
> > > > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> > > > jms.123@hotmail.co.uk>
> > > > > > > wrote:
> > > > > > >         >
> > > > > > >         >     I will disconnect the host this morning and
> > > > > > >         >     test, but before I do that I ran this command
> > > > > > >         >     when all hosts are up -
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     select * from cloud.host;
> > > > > > >         >
> > > > > > >         >     (trimmed to the columns relevant here - the
> > > > > > >         >     full rows also carry MACs, public IPs, iqn,
> > > > > > >         >     capacity, version and timestamp columns)
> > > > > > >         >
> > > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+
> > > > > > >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id |
> > > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+
> > > > > > >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 |
> > > > > > >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        |       NULL |
> > > > > > >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        |       NULL |
> > > > > > >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         |          1 |
> > > > > > >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 |
> > > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+
> > > > > > >         >     5 rows in set (0.00 sec)
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     and you can see that it says the storage IP address is the same as the
> > > > > > >         >     private IP address (the management network).
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I also ran the command you provided using the Cluster ID number from
> > > > > > >         >     the table above -
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
> > > > > > >         >     Empty set (0.00 sec)
> > > > > > >         >
> > > > > > >         >     mysql>
> > > > > > >         >
> > > > > > >         >     So assuming I am reading this correctly that seems to be the issue.
> > > > > > >         >
> > > > > > >         >     I am at a loss as to why though.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I have a separate NIC for storage as described. When I add the zone
> > > > > > >         >     and get to the storage web page I exclude the IPs already used for the
> > > > > > >         >     compute node NICs and the NFS server itself. I do this because initially I
> > > > > > >         >     didn't and the SSVM started using the IP address of the NFS server.
> > > > > > >         >
> > > > > > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > > > >         >     172.30.5.10 -> 172.30.5.14.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     And I used the label "cloudbr2" for storage.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     I must be doing this wrong somehow.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     Any pointers would be much appreciated.
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     ________________________________
> > > > > > >         >     From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > > > > > >         >     Sent: 05 June 2018 16:13
> > > > > > >         >     To: users
> > > > > > >         >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > > > >         >
> > > > > > >         >     That is interesting. Let's see the source of all truth...
> > > > > > >         >     This is the code that is generating that odd message.
> > > > > > >         >
> > > > > > >         >     > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > > > > >         >     > boolean hasNfs = false;
> > > > > > >         >     > for (StoragePoolVO pool : clusterPools) {
> > > > > > >         >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > > > > >         >     >         hasNfs = true;
> > > > > > >         >     >         break;
> > > > > > >         >     >     }
> > > > > > >         >     > }
> > > > > > >         >     > if (!hasNfs) {
> > > > > > >         >     >     s_logger.warn("Agent investigation was requested on host " + agent
> > > > > > >         >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > > > > >         >     >     return Status.Disconnected;
> > > > > > >         >     > }
> > > > > > >         >     >
> > > > > > >         >
> > > > > > >         >     There are two possibilities here. You do not have any NFS storage? Is
> > > > > > >         >     that the case? Or maybe, for some reason, the call
> > > > > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> > > > > > >         >     returning any NFS storage pools. Looking at "listPoolsByCluster" we will
> > > > > > >         >     see that the following SQL is used:
> > > > > > >         >
> > > > > > >         >     > Select * from storage_pool where cluster_id = <host'sClusterId> and removed is not null
> > > > > > >         >
> > > > > > >         >     Can you run that SQL to see its return when your hosts are marked as
> > > > > > >         >     disconnected?
> > > > > > >         >
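
As Rafael points out further down the thread, the predicate in the quoted query is inverted: live pools are the ones where removed is null. The corrected lookup, exactly as he later gives it, is:

    select * from cloud.storage_pool
    where cluster_id = 1
      and removed is null;
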
> > > > > > >         >
> > > > > > >         > Dag.Sonstebo@shapeblue.com
> > > > > > >         > www.shapeblue.com
> > > > > > >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > > >         > @shapeblue
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jms.123@hotmail.co.uk> wrote:
> > > > > > >         >
> > > > > > >         >     > I reran the tests with the 3 NIC setup. When I configured the zone
> > > > > > >         >     > through the UI I used the labels cloudbr0 for management, cloudbr1 for
> > > > > > >         >     > guest traffic and cloudbr2 for NFS as per my original response to you.
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > When I pull the power to the node (dcp-cscn2.local) after about 5 mins
> > > > > > >         >     > the host status goes to "Alert" but never to "Down"
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > I get this in the logs -
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > > > > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > > > > > >         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > > > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> > > > > > >         >     >
> > > > > > >         >     > I don't understand why it thinks there is no NFS storage as each
> > > > > > >         >     > compute node has a dedicated storage NIC.
> > > > > > >         >     >
> > > > > > >         >     > I also don't understand why it thinks the host is still up ie. what
> > > > > > >         >     > test is it doing to determine that ?
> > > > > > >         >     >
> > > > > > >         >     > Am I just trying to get something working that is not supported ?
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > ________________________________
> > > > > > >         >     > From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > > > > > >         >     > Sent: 04 June 2018 15:31
> > > > > > >         >     > To: users
> > > > > > >         >     > Subject: Re: advanced networking with public IPs direct to VMs
> > > > > > >         >     >
> > > > > > >         >     > What type of failover are you talking about?
> > > > > > >         >     > What ACS version are you using?
> > > > > > >         >     > What hypervisor are you using?
> > > > > > >         >     > How are you configuring your NICs in the hypervisor?
> > > > > > >         >     > How are you configuring the traffic labels in ACS?
> > > > > > >         >     >
> > > > > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk> wrote:
> > > > > > >         >     >
> > > > > > >         >     > > Hi all
> > > > > > >         >     > >
> > > > > > >         >     > > I am close to giving up on basic networking as I just cannot get failover
> > > > > > >         >     > > working with multiple NICs (I am not even sure it is supported).
> > > > > > >         >     > >
> > > > > > >         >     > > What I would like is to use 3 NICs for management, storage and guest
> > > > > > >         >     > > traffic. I would like to assign public IPs direct to the VMs which is why I
> > > > > > >         >     > > originally chose basic.
> > > > > > >         >     > >
> > > > > > >         >     > > If I switch to advanced networking do I just configure a guest VM with
> > > > > > >         >     > > public IPs on one NIC and not bother with the public traffic -
> > > > > > >         >     > >
> > > > > > >         >     > > would this work ?
> > > > > > >         >     > >
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     >
> > > > > > >         >     > --
> > > > > > >         >     > Rafael Weingärtner
> > > > > > >         >     >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >     --
> > > > > > >         >     Rafael Weingärtner
> > > > > > >         >
> > > > > > >         >
> > > > > > >         >
> > > > > > >
> > > > > > >
> > > > > > >         --
> > > > > > >         Rafael Weingärtner
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >     Dag.Sonstebo@shapeblue.com
> > > > > > >     www.shapeblue.com
> > > > > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > > >     @shapeblue
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Rafael Weingärtner
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rafael Weingärtner
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
>
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Rafael


I don't have a GitHub account but can set one up and do a report sometime this week if that is okay ?


No problem with the testing and thanks for the help.


Before I leave this, if I use NFS cluster mode, a couple of questions -


1) if I run management and storage over the same interface the NFS server can still be on a different subnet than the management subnet ie. the NFS server does not have to have an IP from the management subnet ?


2) If I add another cluster can I just create a different NFS share from the same server ?
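
On question 2, a sketch of what two cluster-scoped pools served from one NFS server would look like in the database, assuming the cloud.storage_pool columns shown earlier in the thread (the share paths /export/primary1 and /export/primary2 are made-up names, purely for illustration):

    select id, name, pool_type, host_address, path, cluster_id
    from cloud.storage_pool
    where removed is null;
    -- expected: two rows with the same host_address (the NFS server),
    -- different path values (/export/primary1, /export/primary2),
    -- and one cluster_id per cluster
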


Finally many thanks to you and the others for the help provided.

________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 11 June 2018 10:58
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Well, it seems that you have found a bug. Can you fill out an issue report
on GitHub?

Thanks for the hard work on debugging and testing.

On Fri, Jun 8, 2018 at 2:17 PM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> So based on Erik's suggestion (thanks Erik) I rebuilt the management
> server and setup cluster wide primary storage as opposed to zone wide which
> I have been using so far.
>
>
> Still using 3 NICs (management/Guest/storage) and basic networking.
>
>
> And VM HA now works. In addition it failed over quicker than it did when I
> had zone wide NFS storage on a single NIC.
>
>
> Still a bit confused about this output where it is still showing the
> storage_ip_addresses as 172.30.3.x IPs which is the management subnet but
> maybe I am reading it incorrectly.
>
>
>
> mysql> select * from cloud.host;
> [output trimmed to the columns relevant to this thread:]
> | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id |
> |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 |
> |  2 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.55        | 172.30.3.55        |       NULL |
> |  3 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.40        | 172.30.3.40        |       NULL |
> |  4 | dcp-cscn2.local | Down   | Routing            | 172.30.3.4         | 172.30.3.4         |          1 |
> |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 |
> 5 rows in set (0.00 sec)
>
> mysql>
>
> So some sort of bug maybe ?
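
A compact way to check the mapping Jon is describing, using only columns present in the cloud.host output above (an illustrative query, not one actually run in the thread):

    select id, name, type, status, private_ip_address, storage_ip_address
    from cloud.host
    where removed is null;
    -- storage_ip_address equal to private_ip_address on every row means the
    -- dedicated storage NIC's 172.30.5.x address was never recorded
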
>
> ________________________________
> From: Erik Weber <te...@gmail.com>
> Sent: 08 June 2018 10:15
> To: users@cloudstack.apache.org
> Subject: Re: advanced networking with public IPs direct to VMs
>
> While someone ponders about the zone wide storage, you could try adding a
> cluster wide NFS storage and see if the rest works in that setup.
>
> Erik
>
> On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Yes, all basic. I read a ShapeBlue doc that recommended splitting traffic
> > across multiple NICs even in basic networking mode so that is what I am
> > trying to do.
> >
> >
> > With single NIC you do not get the NFS storage message.
> >
> >
> > I have the entire management server logs for both scenarios after I
> pulled
> > the power to one of the compute nodes but from the single NIC setup these
> > seem to be the relevant lines -
> >
> >
> > 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4 with status Down
> > 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> > 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down, name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> > 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator] (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host with specified id is not in the right state: Down
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5 returned status:Down for the investigated host:4
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible legacy state Down for host 4
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to determine host 4 is in Down
> > 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state determined is Down
> > 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down: 4-dcp-cscn2.local. Starting HA on the VMs
> >
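
If you want to watch the HA kick-off from the database side after that last ERROR line, the HA queue can be inspected directly. A sketch, assuming the cloud.op_ha_work table of the 4.x schema (illustrative, not a query from the thread):

    select id, instance_id, type, state, tried, taken
    from cloud.op_ha_work
    order by id desc
    limit 10;
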
> > At the moment I only need to assign public IPs direct to VMs rather than
> > using NAT with the virtual router but would be happy to go with advanced
> > networking if it would make things easier :)
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 10:35
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > Ah so, it is not an advanced setup; even when you use multiple NICs.
> > Can you confirm that the message "Agent investigation was requested on
> > host, but host does not support investigation because it has no NFS
> > storage. Skipping investigation." does not appear when you use a single
> > NIC? Can you check other log entries that might appear when the host is
> > marked as "down"?
> >
> > On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > It is all basic networking at the moment for all the setups.
> > >
> > >
> > > If you want me to I can set up a single NIC solution again and run any
> > > commands you need me to do.
> > >
> > >
> > > FYI when I set up a single NIC I use the guided installation option in the
> > > UI rather than manual setup which I do for the multiple NIC scenario.
> > >
> > >
> > > Happy to set it up if it helps.
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:23
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Ok, so that explains the log message. This is looking like a bug to me. It
> > > seems that in Zone wide the host state (when disconnected) is not being
> > > properly identified due to this NFS thing, and as a consequence it has a
> > > side effect in VM HA.
> > >
> > > We would need some inputs from guys that have advanced networking
> > > deployments and Zone wide storage.
> > >
> > > I do not see how the all in one NIC deployment scenario is working
> > though.
> > > This method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
> > > simple, if there is no NFS in the cluster (NFS storage pools found for
> a
> > > host's cluster), KVM hosts will be detected as "disconnected" and not
> > down
> > > with that warning message you noticed.
> > >
> > > When you say "all in one NIC", is it an advanced network deployment
> where
> > > you put all traffic in a single network, or is it a basic networking
> that
> > > you are doing?
> > >
> > > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > zone wide.
> > > >
> > > >
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 10:04
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > What type of storage are you using? Zone wide? Or cluster "wide"
> > storage?
> > > >
> > > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > > wrote:
> > > >
> > > > > Rafael
> > > > >
> > > > >
> > > > > Here is the output as requested -
> > > > >
> > > > >
> > > > >
> > > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > > [output trimmed to the relevant columns:]
> > > > > | id | name | pool_type         | port | cluster_id | host_address | path            | status | scope | hypervisor |
> > > > > |  1 | ds1  | NetworkFilesystem | 2049 |       NULL | 172.30.5.2   | /export/primary | Up     | ZONE  | KVM        |
> > > > > 1 row in set (0.00 sec)
> > > > >
> > > > > mysql>
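
The row above is the crux of the issue: the pool is zone-wide, so its cluster_id is NULL and the per-cluster lookup in KVMInvestigator comes back empty. This can be seen using only the columns from the output above:

    select id, name, scope, cluster_id
    from cloud.storage_pool
    where removed is null;
    -- ds1 comes back with scope = 'ZONE' and cluster_id = NULL, so any
    -- filter on cluster_id = 1 can never match this pool
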
> > > > >
> > > > > Do you think this problem is related to my NIC/bridge configuration or
> > > > > the way I am configuring the zone ?
> > > > >
> > > > > Jon
> > > > > ________________________________
> > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > Sent: 07 June 2018 06:45
> > > > > To: users
> > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > > Can you also post the result of:
> > > > > select * from cloud.storage_pool where removed is null
> > > > >
> > > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Dag.Sonstebo@shapeblue.com> wrote:
> > > > >
> > > > > > Hi Jon,
> > > > > >
> > > > > > Still confused where your primary storage pools are – are you sure your
> > > > > > hosts are in cluster 1?
> > > > > >
> > > > > > Quick question just to make sure - assuming management/storage is on the
> > > > > > same NIC when I setup basic networking the physical network has the
> > > > > > management and guest icons already there and I just edit the KVM labels. If
> > > > > > I am running storage over management do I need to drag the storage icon to
> > > > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > > > management or does CS automatically just use the management NIC ie. I would
> > > > > > only need to drag the storage icon across in basic setup if I wanted it on
> > > > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > >
> > > > > > >> I would do both – set up your 2/3 physical networks, name isn't that
> > > > > > important – but then drag the traffic types to the correct one and make
> > > > > > sure the labels are correct.
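
A quick way to verify which traffic type ended up with which label after zone creation, assuming the cloud.physical_network_traffic_types table of this schema version (an illustrative query, not one from the thread):

    select traffic_type, kvm_network_label
    from cloud.physical_network_traffic_types;
    -- for the setups discussed here you would expect Guest on cloudbr1,
    -- Management on cloudbr0, and Storage on cloudbr2 (three NICs) or
    -- cloudbr0 (two NICs)
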
> > > > > > Regards,
> > > > > > Dag Sonstebo
> > > > > > Cloud Architect
> > > > > > ShapeBlue
> > > > > >
> > > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > > > > >
> > > > > >     Dag
> > > > > >
> > > > > >
> > > > > >     Do you mean check the pools with "Infrastructure -> Primary Storage"
> > > > > >     and "Infrastructure -> Secondary Storage" within the UI ?
> > > > > >
> > > > > >
> > > > > >     If so Primary Storage has a state of UP, secondary storage does not
> > > > > >     show a state as such so not sure where else to check it ?
> > > > > >
> > > > > >
> > > > > >     Rerun of the command -
> > > > > >
> > > > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > > > >     Empty set (0.00 sec)
> > > > > >
> > > > > >     mysql>
> > > > > >
> > > > > >     I think it is something to do with my zone creation rather than the
> > > > > >     NIC, bridge setup although I can post those if needed.
> > > > > >
> > > > > >     I may try to set up just the 2 NIC solution you mentioned although as I
> > > > > >     say I had the same issue with that ie. host goes to "Alert" state and same
> > > > > >     error messages. The only time I can get it to go to "Down" state is when
> > > > > >     it is all on the single NIC.
> > > > > >
> > > > > >     Quick question just to make sure - assuming management/storage is on
> > > > > >     the same NIC when I setup basic networking the physical network has the
> > > > > >     management and guest icons already there and I just edit the KVM labels. If
> > > > > >     I am running storage over management do I need to drag the storage icon to
> > > > > >     the physical network and use the same KVM label (cloudbr0) as the
> > > > > >     management or does CS automatically just use the management NIC ie. I would
> > > > > >     only need to drag the storage icon across in basic setup if I wanted it on
> > > > > >     a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > >
> > > > > >     On the plus side I have been at this for so long now and done so many
> > > > > >     rebuilds I could do it in my sleep now 😊
> > > > > >
> > > > > >
> > > > > >     ________________________________
> > > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > > >     Sent: 06 June 2018 12:28
> > > > > >     To: users@cloudstack.apache.org
> > > > > >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > >     Looks OK to me Jon.
> > > > > >
> > > > > >     The one thing that throws me is your storage pools – can you rerun
> > > > > >     your query: select * from cloud.storage_pool where cluster_id = 1;
> > > > > >
> > > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > > >
> > > > > >     Regards,
> > > > > >     Dag Sonstebo
> > > > > >     Cloud Architect
> > > > > >     ShapeBlue
> > > > > >
> > > > > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk>
> > > > wrote:
> > > > > >
> > > > > >         Don't know whether this helps or not but I logged into the SSVM
> > > > > >         and ran an ifconfig -
> > > > > >
> > > > > >
> > > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> > > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> > > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> > > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> > > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >
> > > > > >         so it has interfaces in both the management and the storage
> > > > > >         subnets (as well as guest).
> > > > > >
> > > > > >
> > > > > >
> > > > > >         ________________________________
> > > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > > >         Sent: 06 June 2018 11:08
> > > > > >         To: users@cloudstack.apache.org
> > > > > >         Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > >         Hi Rafael
> > > > > >
> > > > > >
> > > > > >         Thanks for the help, really appreciate it.
> > > > > >
> > > > > >
> > > > > >         So rerunning that command with all servers up -
> > > > > >
> > > > > >
> > > > > >
> > > > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > > > >         Empty set (0.00 sec)
> > > > > >
> > > > > >         mysql>
> > > > > >
> > > > > >
> > > > > >         As for the storage IP no I'm not setting it to be the management
> > > > > >         IP when I set up the zone but the output of the SQL command suggests
> > > > > >         that is what has happened.
> > > > > >
> > > > > >         As I said to Dag I am using a different subnet for storage ie.
> > > > > >
> > > > > >         172.30.3.0/26  - management subnet
> > > > > >         172.30.4.0/25 -  guest VM subnet
> > > > > >         172.30.5.0/28 - storage
> > > > > >
> > > > > >         the NFS server IP is 172.30.5.2
> > > > > >
> > > > > >         each compute node has 3 NICs with an IP from each subnet (I am
> > > > > >         assuming the management node only needs an IP in the management network ?)
> > > > > >
> > > > > >         When I add the zone in the UI I have one physical network with
> > > > > >         management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > > > >         When I fill in the storage traffic page I use the range
> > > > > >         172.30.5.10 - 14 as free IPs as I exclude the ones already allocated
> > > > > >         to the compute nodes and the NFS server.
> > > > > >
> > > > > >         I think maybe I am doing something wrong in the UI setup but it is
> > > > > >         not obvious to me what it is.
> > > > > >
> > > > > >         What I might try today unless you want me to keep the setup I have
> > > > > >         for more outputs is to go back to 2 NICs, one for storage/management
> > > > > >         and one for guest VMs.
> > > > > >
> > > > > >         I think with the 2 NICs setup the mistake I made last time when
> > > > > >         adding the zone was to assume storage would just run over management
> > > > > >         so I did not drag and drop the storage icon and assign it to cloudbr0
> > > > > >         as with the management which I think is what I should do ?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >         ________________________________
> > > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > > >         Sent: 06 June 2018 10:54
> > > > > >         To: users
> > > > > >         Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > >         Jon, do not panic we are here to help you :)
> > > > > >         So, I might have mistyped the SQL query. When you use "select * from
> > > > > >         cloud.storage_pool where cluster_id = 1 and removed is not null", you
> > > > > >         are listing the storage pools already removed. Therefore, the right
> > > > > >         query would be "select * from cloud.storage_pool where cluster_id = 1
> > > > > >         and removed is null"
> > > > > >
> > > > > >         There is also something else I do not understand. You are setting the
> > > > > >         storage IP in the management subnet? I am not sure if you should be
> > > > > >         doing it like this. Normally, I set all my storages (primary [when
> > > > > >         working with NFS] and secondary) to IPs in the storage subnet.
> > > > > >
> > > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Dag.Sonstebo@shapeblue.com> wrote:
> > > > > >
> > > > > >         > Hi Jon,
> > > > > >         >
> > > > > >         > I’m late to this thread and have possibly missed some
> > > things
> > > > –
> > > > > > but a
> > > > > >         > couple of observations:
> > > > > >         >
> > > > > >         > “When I add the zone and get to the storage web page I
> > > > exclude
> > > > > > the IPs
> > > > > >         > already used for the compute node NICs and the NFS
> server
> > > > > > itself. …..”
> > > > > >         > “So the range is 172.30.5.1 -> 15 and the range I fill
> in
> > > is
> > > > > > 172.30.5.10
> > > > > >         > -> 172.30.5.14.”
> > > > > >         >
> > > > > >         > I think you may have some confusion around the use of the storage
> > > > > >         > network. The important part here is to understand this is for
> > > > > >         > *secondary storage* use only – it has nothing to do with primary
> > > > > >         > storage. This means this storage network needs to be accessible to
> > > > > >         > the SSVM, to the hypervisors, and secondary storage NFS pools need
> > > > > >         > to be accessible on this network.
> > > > > >         >
> > > > > >         > The important part – this also means you *can not use the same IP
> > > > > >         > ranges for management and storage networks* - doing so means you
> > > > > >         > will have issues where effectively both hypervisors and SSVM can
> > > > > >         > see the same subnet on two NICs – and you end up in a routing black hole.
> > > > > >         >
> > > > > >         > So – you need to either:
> > > > > >         >
> > > > > >         > 1) Use different IP subnets on management and storage, or
> > > > > >         > 2) preferably just simplify your setup – stop using a secondary
> > > > > >         > storage network altogether and just allow secondary storage to use
> > > > > >         > the management network (which is default). Unless you have a very
> > > > > >         > high I/O environment in production you are just adding complexity
> > > > > >         > by running separate management and storage.
> > > > > >         >
> > > > > >         > Regards,
> > > > > >         > Dag Sonstebo
> > > > > >         > Cloud Architect
> > > > > >         > ShapeBlue
> > > > > >         >
> > > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <jms.123@hotmail.co.uk> wrote:
> > > > > >         >
> > > > > >         >     I will disconnect the host this morning and test but before I
> > > > > >         >     do that I ran this command when all hosts are up -
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >      select * from cloud.host;
> > > > > >         >     [output trimmed to the columns relevant to this thread; rows 1-4:]
> > > > > >         >     |  1 | dcp-cscn1.local | Up | Routing            | private_ip 172.30.3.3  | storage_ip 172.30.3.3  | cluster_id 1    |
> > > > > >         >     |  2 | v-2-VM          | Up | ConsoleProxy       | private_ip 172.30.3.49 | storage_ip 172.30.3.49 | cluster_id NULL |
> > > > > >         >     |  3 | s-1-VM          | Up | SecondaryStorageVM | private_ip 172.30.3.34 | storage_ip 172.30.3.34 | cluster_id NULL |
> > > > > >         >     |  4 | dcp-cscn2.local | Up | Routing            | private_ip 172.30.3.4  | storage_ip 172.30.3.4  | cluster_id 1    |
> > > > > >         >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-
> > > > > ee29d2b5c274
> > > > > > | Up
> > > > > >         >  | Routing            | 172.30.3.5         |
> > > 255.255.255.192
> > > > |
> > > > > >         > 00:24:e8:73:6a:b2   | 172.30.3.5         |
> > 255.255.255.192
> > > |
> > > > > >         > 00:24:e8:73:6a:b2   | NULL                 | NULL
> > > > > >   | NULL
> > > > > >         >             |          1 | 172.30.4.5        |
> > > > 255.255.255.128
> > > > > |
> > > > > >         > 00:24:e8:73:6a:b3  |       NULL |              1 |
> > 1 |
> > > > > >      1 |
> > > > > >         >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c |
> NULL
> > > > |
> > > > > > KVM
> > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > 4.11.0.0
> > > > |
> > > > > > NULL   |
> > > > > >         >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-
> > > > 37f3eebad8fb-
> > > > > > LibvirtComputingResource
> > > > > >         > |         1 |     0 |           0 | 1492390408 |
> > > > >  146457912294 |
> > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |
> > > > > > 6 |
> > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > >         >     +----+-----------------+------
> > > > > ------------------------------
> > > > > >         > --+--------+------------------
> > > --+--------------------+------
> > > > > >         > -----------+------------------
> > > ---+--------------------+-----
> > > > > >         > ------------+-----------------
> > > ----+----------------------+--
> > > > > >         > ---------------------+--------
> > > -----------+------------+-----
> > > > > >         > --------------+---------------
> > > --+--------------------+------
> > > > > >         > ------+----------------+------
> > > --+-------------+------+------
> > > > > >         > -+----------------------------
> > > ---------+---------+----------
> > > > > >         > -------+--------------------+-
> > > -----------+----------+-------
> > > > > >         > ---+--------+------------+----
> > > ----------+-------------------
> > > > > >         > ------------------------------
> > > --------------+-----------+---
> > > > > >         > ----+-------------+-----------
> > > -+----------------+-----------
> > > > > >         > ----------+-------------------
> > > --+---------+--------------+--
> > > > > >         > --------------+-------+-------------+--------------+
> > > > > >         >     5 rows in set (0.00 sec)
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     and you can see that it says the storage IP address is the same as the
> > > > > >         >     private IP address (the management network).
> > > > > >         >
> > > > > >         >
> > > > > >         >     I also ran the command you provided using the cluster ID number from
> > > > > >         >     the table above -
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
> > > > > >         >     Empty set (0.00 sec)
> > > > > >         >
> > > > > >         >     mysql>
> > > > > >         >
> > > > > >         >     So assuming I am reading this correctly that seems to be the issue.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I am at a loss as to why though.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I have a separate NIC for storage as described. When I add the zone
> > > > > >         >     and get to the storage web page I exclude the IPs already used for the
> > > > > >         >     compute node NICs and the NFS server itself. I do this because initially I
> > > > > >         >     didn't and the SSVM started using the IP address of the NFS server.
> > > > > >         >
> > > > > >         >
> > > > > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > > >         >     172.30.5.10 -> 172.30.5.14.
> > > > > >         >
> > > > > >         >
> > > > > >         >     And I used the label "cloudbr2" for storage.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I must be doing this wrong somehow.
> > > > > >         >
> > > > > >         >
> > > > > >         >     Any pointers would be much appreciated.
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     ________________________________
> > > > > >         >     From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > > > > >         >     Sent: 05 June 2018 16:13
> > > > > >         >     To: users
> > > > > >         >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >         >
> > > > > >         >     That is interesting. Let's see the source of all truth...
> > > > > >         >     This is the code that is generating that odd message.
> > > > > >         >
> > > > > >         >     > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > > > >         >     > boolean hasNfs = false;
> > > > > >         >     > for (StoragePoolVO pool : clusterPools) {
> > > > > >         >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > > > >         >     >         hasNfs = true;
> > > > > >         >     >         break;
> > > > > >         >     >     }
> > > > > >         >     > }
> > > > > >         >     > if (!hasNfs) {
> > > > > >         >     >     s_logger.warn("Agent investigation was requested on host " + agent
> > > > > >         >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > > > >         >     >     return Status.Disconnected;
> > > > > >         >     > }
> > > > > >         >
> > > > > >         >     There are two possibilities here. You do not have any NFS storage? Is
> > > > > >         >     that the case? Or maybe, for some reason, the call
> > > > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> > > > > >         >     returning any NFS storage pools. Looking at "listPoolsByCluster" we will
> > > > > >         >     see that the following SQL is used:
> > > > > >         >
> > > > > >         >     > Select * from storage_pool where cluster_id = <host'sClusterId> and removed is not null
> > > > > >         >
> > > > > >         >     Can you run that SQL to see its return when your hosts are marked as
> > > > > >         >     disconnected?
> > > > > >         >
> > > > > >         >
> > > > > >         > Dag.Sonstebo@shapeblue.com
> > > > > >         > www.shapeblue.com<http://www.shapeblue.com>
> > > > > >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > >         > @shapeblue
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jms.123@hotmail.co.uk> wrote:
> > > > > >         >
> > > > > >         >     > I reran the tests with the 3 NIC setup. When I configured the zone
> > > > > >         >     > through the UI I used the labels cloudbr0 for management, cloudbr1
> > > > > >         >     > for guest traffic and cloudbr2 for NFS, as per my original response
> > > > > >         >     > to you.
> > > > > >         >     >
> > > > > >         >     > When I pull the power to the node (dcp-cscn2.local), after about 5
> > > > > >         >     > mins the host status goes to "Alert" but never to "Down".
> > > > > >         >     >
> > > > > >         >     > I get this in the logs -
> > > > > >         >     >
> > > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > > > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > > > > >         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> > > > > >         >     >
> > > > > >         >     > I don't understand why it thinks there is no NFS storage, as each
> > > > > >         >     > compute node has a dedicated storage NIC.
> > > > > >         >     >
> > > > > >         >     > I also don't understand why it thinks the host is still up, ie. what
> > > > > >         >     > test is it doing to determine that ?
> > > > > >         >     >
> > > > > >         >     > Am I just trying to get something working that is not supported ?
> > > > > >         >     >
> > > > > >         >     > ________________________________
> > > > > >         >     > From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > > > > >         >     > Sent: 04 June 2018 15:31
> > > > > >         >     > To: users
> > > > > >         >     > Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >         >     >
> > > > > >         >     > What type of failover are you talking about?
> > > > > >         >     > What ACS version are you using?
> > > > > >         >     > What hypervisor are you using?
> > > > > >         >     > How are you configuring your NICs in the hypervisor?
> > > > > >         >     > How are you configuring the traffic labels in ACS?
> > > > > >         >     >
> > > > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk> wrote:
> > > > > >         >     >
> > > > > >         >     > > Hi all
> > > > > >         >     > >
> > > > > >         >     > > I am close to giving up on basic networking as I just cannot get
> > > > > >         >     > > failover working with multiple NICs (I am not even sure it is
> > > > > >         >     > > supported).
> > > > > >         >     > >
> > > > > >         >     > > What I would like is to use 3 NICs for management, storage and
> > > > > >         >     > > guest traffic. I would like to assign public IPs direct to the VMs,
> > > > > >         >     > > which is why I originally chose basic.
> > > > > >         >     > >
> > > > > >         >     > > If I switch to advanced networking do I just configure a guest VM
> > > > > >         >     > > with public IPs on one NIC and not bother with the public traffic -
> > > > > >         >     > >
> > > > > >         >     > > would this work ?
> > > > > >         >     > >
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > --
> > > > > >         >     > Rafael Weingärtner
> > > > > >         >     >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     --
> > > > > >         >     Rafael Weingärtner
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >
> > > > > >
> > > > > >         --
> > > > > >         Rafael Weingärtner
> > > > > >
> > > > > >
> > > > > >
> > > > > >     Dag.Sonstebo@shapeblue.com
> > > > > >     www.shapeblue.com<http://www.shapeblue.com>
> > > > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > >     @shapeblue
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Dag.Sonstebo@shapeblue.com
> > > > > > www.shapeblue.com<http://www.shapeblue.com>
> > > > > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > > @shapeblue
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rafael Weingärtner
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
Well, it seems that you have found a bug. Can you fill out an issue report
on GitHub?

Thanks for the hard work on debugging and testing.

On Fri, Jun 8, 2018 at 2:17 PM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> So based on Erik's suggestion (thanks Erik) I rebuilt the management
> server and setup cluster wide primary storage as opposed to zone wide which
> I have been using so far.
>
>
> Still using 3 NICs (management/Guest/storage) and basic networking.
>
>
> And VM HA now works. In addition it failed over quicker than it did when I
> had zone wide NFS storage on a single NIC.
>
>
> Still a bit confused about this output: it is still showing the
> storage_ip_address values as 172.30.3.x IPs, which is the management subnet,
> but maybe I am reading it incorrectly.
>
>
>
> mysql> select * from cloud.host;
> [output condensed to the columns relevant here]
> +----+-----------------+--------+--------------------+--------------------+--------------------+-------------------+------------+
> | id | name            | status | type               | private_ip_address | storage_ip_address | public_ip_address | cluster_id |
> +----+-----------------+--------+--------------------+--------------------+--------------------+-------------------+------------+
> |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         | 172.30.4.3        |          1 |
> |  2 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.55        | 172.30.3.55        | 172.30.4.66       |       NULL |
> |  3 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.40        | 172.30.3.40        | 172.30.4.62       |       NULL |
> |  4 | dcp-cscn2.local | Down   | Routing            | 172.30.3.4         | 172.30.3.4         | 172.30.4.4        |          1 |
> |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         | 172.30.4.5        |          1 |
> +----+-----------------+--------+--------------------+--------------------+--------------------+-------------------+------------+
> 5 rows in set (0.00 sec)
>
> mysql>
>
> So some sort of bug maybe ?
>
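(A toy illustration of the suspected mechanism, in plain Java rather than CloudStack code, based on the KVMInvestigator snippet quoted earlier in the thread. The premise: a zone-wide pool is stored with cluster_id = NULL, so a cluster-scoped pool lookup never sees it, the investigator reports Disconnected instead of Down, and VM HA never starts. Class and method names below are illustrative assumptions, not CloudStack's.)

    import java.util.List;

    // Toy model of the hasNfs gate in the quoted KVMInvestigator snippet.
    // Pool rows mirror cloud.storage_pool: a zone-wide pool has clusterId == null.
    public class InvestigatorDemo {
        record Pool(Long clusterId, String poolType) {}

        static String investigate(List<Pool> allPools, long hostClusterId) {
            boolean hasNfs = false;
            for (Pool p : allPools) {
                // mirrors listPoolsByCluster(): only rows carrying this cluster's id match
                if (p.clusterId() != null && p.clusterId() == hostClusterId
                        && "NetworkFilesystem".equals(p.poolType())) {
                    hasNfs = true;
                    break;
                }
            }
            // no NFS visible -> the investigator gives up: the host stays "Disconnected",
            // never reaches "Down", and VM HA is never started
            return hasNfs ? "check neighbours -> Down -> start HA" : "Disconnected";
        }

        public static void main(String[] args) {
            List<Pool> zoneWide = List.of(new Pool(null, "NetworkFilesystem"));
            List<Pool> clusterWide = List.of(new Pool(1L, "NetworkFilesystem"));
            System.out.println(investigate(zoneWide, 1));    // Disconnected
            System.out.println(investigate(clusterWide, 1)); // check neighbours -> Down -> start HA
        }
    }

(With a cluster-wide pool the row carries the cluster's id, hasNfs becomes true, and the investigator goes on to ask the neighbouring hosts - which matches the "Neighbouring host:5 returned status:Down ... Starting HA on the VMs" lines in the single-NIC logs quoted below.)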
> ________________________________
> From: Erik Weber <te...@gmail.com>
> Sent: 08 June 2018 10:15
> To: users@cloudstack.apache.org
> Subject: Re: advanced networking with public IPs direct to VMs
>
> While someone ponders about the zone wide storage, you could try adding a
> cluster wide NFS storage and see if the rest works in that setup.
>
> Erik
>
> On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Yes, all basic. I read a Shapeblue doc that recommended splitting traffic
> > across multiple NICs even in basic networking mode so that is what I am
> > trying to do.
> >
> >
> > With single NIC you do not get the NFS storage message.
> >
> >
> > I have the entire management server logs for both scenarios after I pulled
> > the power to one of the compute nodes but from the single NIC setup these
> > seem to be the relevant lines -
> >
> >
> > 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4 with status Down
> > 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> > 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down, name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> > 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator] (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host with specified id is not in the right state: Down
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5 returned status:Down for the investigated host:4
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible legacy state Down for host 4
> > 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to determine host 4 is in Down
> > 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state determined is Down
> > 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down: 4-dcp-cscn2.local. Starting HA on the VMs
> >
> > At the moment I only need to assign public IPs direct to VMs rather than
> > using NAT with the virtual router but would be happy to go with advanced
> > networking if it would make things easier :)
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 10:35
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > Ah so, it is not an advanced setup; even when you use multiple NICs.
> > Can you confirm that the message "Agent investigation was requested on
> > host, but host does not support investigation because it has no NFS
> > storage. Skipping investigation." does not appear when you use a single
> > NIC? Can you check other log entries that might appear when the host is
> > marked as "down"?
> >
> > On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > It is all basic networking at the moment for all the setups.
> > >
> > >
> > > If you want me to I can setup a single NIC solution again and run any
> > > commands you need me to do.
> > >
> > >
> > > FYI when I setup single NIC I use the guided installation option in the
> > > UI rather than manual setup which I do for the multiple NIC scenario.
> > >
> > >
> > > Happy to set it up if it helps.
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:23
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Ok, so that explains the log message. This is looking like a bug to me. It
> > > seems that with zone-wide storage the host state (when disconnected) is not
> > > being properly identified due to this NFS check, and as a consequence it has
> > > a side effect on VM HA.
> > >
> > > We would need some inputs from guys that have advanced networking
> > > deployments and Zone wide storage.
> > >
> > > I do not see how the all-in-one-NIC deployment scenario is working though.
> > > This method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
> > > simple: if there is no NFS in the cluster (no NFS storage pools found for
> > > the host's cluster), KVM hosts will be detected as "disconnected" and not
> > > "down", with that warning message you noticed.
> > >
> > > When you say "all in one NIC", is it an advanced network deployment where
> > > you put all traffic in a single network, or is it basic networking that
> > > you are doing?
> > >
> > > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > zone wide.
> > > >
> > > >
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 10:04
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > What type of storage are you using? Zone wide? Or cluster "wide"
> > storage?
> > > >
> > > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > > wrote:
> > > >
> > > > > Rafael
> > > > >
> > > > >
> > > > > Here is the output as requested -
> > > > >
> > > > >
> > > > >
> > > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > > [output condensed to the columns relevant here]
> > > > > +----+------+-------------------+----------------+------------+--------------+-----------------+--------+-----------------------+-------+------------+
> > > > > | id | name | pool_type         | data_center_id | cluster_id | host_address | path            | status | storage_provider_name | scope | hypervisor |
> > > > > +----+------+-------------------+----------------+------------+--------------+-----------------+--------+-----------------------+-------+------------+
> > > > > |  1 | ds1  | NetworkFilesystem |              1 |       NULL | 172.30.5.2   | /export/primary | Up     | DefaultPrimary        | ZONE  | KVM        |
> > > > > +----+------+-------------------+----------------+------------+--------------+-----------------+--------+-----------------------+-------+------------+
> > > > > 1 row in set (0.00 sec)
> > > > >
> > > > > mysql>
> > > > >
> > > > > Do you think this problem is related to my NIC/bridge configuration or
> > > > > the way I am configuring the zone ?
> > > > >
> > > > > Jon
> > > > > ________________________________
> > > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > > Sent: 07 June 2018 06:45
> > > > > To: users
> > > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > > Can you also post the result of:
> > > > > select * from cloud.storage_pool where removed is null
> > > > >
> > > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Dag.Sonstebo@shapeblue.com> wrote:
> > > > >
> > > > > > Hi Jon,
> > > > > >
> > > > > > Still confused where your primary storage pools are – are you sure
> > > > > > your hosts are in cluster 1?
> > > > > >
> > > > > > Quick question just to make sure - assuming management/storage is on
> > > > > > the same NIC when I setup basic networking the physical network has the
> > > > > > management and guest icons already there and I just edit the KVM labels.
> > > > > > If I am running storage over management do I need to drag the storage
> > > > > > icon to the physical network and use the same KVM label (cloudbr0) as the
> > > > > > management or does CS automatically just use the management NIC ie. I
> > > > > > would only need to drag the storage icon across in basic setup if I
> > > > > > wanted it on a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > >
> > > > > > >> I would do both – set up your 2/3 physical networks, name isn’t that
> > > > > > important – but then drag the traffic types to the correct one and make
> > > > > > sure the labels are correct.
> > > > > > Regards,
> > > > > > Dag Sonstebo
> > > > > > Cloud Architect
> > > > > > ShapeBlue
> > > > > >
> > > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > > > > >
> > > > > >     Dag
> > > > > >
> > > > > >
> > > > > >     Do you mean check the pools with "Infrastructure -> Primary Storage"
> > > > > > and "Infrastructure -> Secondary Storage" within the UI ?
> > > > > >
> > > > > >
> > > > > >     If so Primary Storage has a state of Up, secondary storage does not
> > > > > > show a state as such so not sure where else to check it ?
> > > > > >
> > > > > >
> > > > > >     Rerun of the command -
> > > > > >
> > > > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > > > >     Empty set (0.00 sec)
> > > > > >
> > > > > >     mysql>
> > > > > >
> > > > > >     I think it is something to do with my zone creation rather than the
> > > > > > NIC/bridge setup although I can post those if needed.
> > > > > >
> > > > > >     I may try to setup just the 2 NIC solution you mentioned although as
> > > > > > I say I had the same issue with that, ie. host goes to "Alert" state and
> > > > > > same error messages. The only time I can get it to go to "Down" state is
> > > > > > when it is all on the single NIC.
> > > > > >
> > > > > >     Quick question just to make sure - assuming management/storage is on
> > > > > > the same NIC when I setup basic networking the physical network has the
> > > > > > management and guest icons already there and I just edit the KVM labels.
> > > > > > If I am running storage over management do I need to drag the storage
> > > > > > icon to the physical network and use the same KVM label (cloudbr0) as the
> > > > > > management or does CS automatically just use the management NIC ie. I
> > > > > > would only need to drag the storage icon across in basic setup if I
> > > > > > wanted it on a different NIC/IP subnet ?  (hope that makes sense !)
> > > > > >
> > > > > >     On the plus side I have been at this for so long now and done so many
> > > > > > rebuilds I could do it in my sleep now 😊
> > > > > >
> > > > > >
> > > > > >     ________________________________
> > > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > > >     Sent: 06 June 2018 12:28
> > > > > >     To: users@cloudstack.apache.org
> > > > > >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > >     Looks OK to me Jon.
> > > > > >
> > > > > >     The one thing that throws me is your storage pools – can you rerun
> > > > > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > > > > >
> > > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > > >
> > > > > >     Regards,
> > > > > >     Dag Sonstebo
> > > > > >     Cloud Architect
> > > > > >     ShapeBlue
> > > > > >
> > > > > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > > > > >
> > > > > >         Don't know whether this helps or not but I logged into the SSVM
> > > > > > and ran an ifconfig -
> > > > > >
> > > > > >
> > > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> > > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> > > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> > > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> > > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > > > >
> > > > > >
> > > > > >         so it has interfaces in both the management and the storage
> > > > > > subnets (as well as guest).
> > > > > >
> > > > > >
> > > > > >
> > > > > >         ________________________________
> > > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > > >         Sent: 06 June 2018 11:08
> > > > > >         To: users@cloudstack.apache.org
> > > > > >         Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > >         Hi Rafael
> > > > > >
> > > > > >
> > > > > >         Thanks for the help, really appreciate it.
> > > > > >
> > > > > >
> > > > > >         So rerunning that command with all servers up -
> > > > > >
> > > > > >
> > > > > >
> > > > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > > > >         Empty set (0.00 sec)
> > > > > >
> > > > > >         mysql>
> > > > > >
> > > > > >
> > > > > >         As for the storage IP: no, I'm not setting it to be the management
> > > > > > IP when I setup the zone, but the output of the SQL command suggests that
> > > > > > is what has happened.
> > > > > >
> > > > > >         As I said to Dag I am using a different subnet for storage ie.
> > > > > >
> > > > > >         172.30.3.0/26  - management subnet
> > > > > >         172.30.4.0/25 -  guest VM subnet
> > > > > >         172.30.5.0/28 - storage
> > > > > >
> > > > > >         the NFS server IP is 172.30.5.2
> > > > > >
> > > > > >         each compute node has 3 NICs with an IP from each subnet (I am
> > > > > > assuming the management node only needs an IP in the management network ?)
> > > > > >
> > > > > >         When I add the zone in the UI I have one physical network with
> > > > > > management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > > > >         When I fill in the storage traffic page I use the range
> > > > > > 172.30.5.10 - 14 as free IPs, as I exclude the ones already allocated to
> > > > > > the compute nodes and the NFS server.
> > > > > >
> > > > > >         I think maybe I am doing something wrong in the UI setup but it
> > > > > > is not obvious to me what it is.
> > > > > >
> > > > > >         What I might try today unless you want me to keep the setup I have
> > > > > > for more outputs is to go back to 2 NICs, one for storage/management and
> > > > > > one for guest VMs.
> > > > > >
> > > > > >         I think with the 2 NICs setup the mistake I made last time when
> > > > > > adding the zone was to assume storage would just run over management so I
> > > > > > did not drag and drop the storage icon and assign it to cloudbr0 as with
> > > > > > the management which I think is what I should do ?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >         ________________________________
> > > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > > >         Sent: 06 June 2018 10:54
> > > > > >         To: users
> > > > > >         Subject: Re: advanced networking with public IPs direct to VMs
> > > > > >
> > > > > >         Jon, do not panic, we are here to help you :)
> > > > > >         So, I might have mistyped the SQL query. When you use "select * from
> > > > > >         cloud.storage_pool where cluster_id = 1 and removed is not null",
> > > > > >         you are listing the storage pools that were removed. Therefore, the
> > > > > >         right query would be "select * from cloud.storage_pool where
> > > > > >         cluster_id = 1 and removed is null".
> > > > > >
> > > > > >         There is also something else I do not understand. You are setting
> > > > > >         the storage IP in the management subnet? I am not sure if you should
> > > > > >         be doing it like this. Normally, I set all my storages (primary [when
> > > > > >         working with NFS] and secondary) to IPs in the storage subnet.
> > > > > >
> > > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Dag.Sonstebo@shapeblue.com> wrote:
> > > > > >
> > > > > >         > Hi Jon,
> > > > > >         >
> > > > > >         > I’m late to this thread and have possibly missed some things –
> > > > > >         > but a couple of observations:
> > > > > >         >
> > > > > >         > “When I add the zone and get to the storage web page I exclude
> > > > > >         > the IPs already used for the compute node NICs and the NFS server
> > > > > >         > itself. …..”
> > > > > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > > >         > 172.30.5.10 -> 172.30.5.14.”
> > > > > >         >
> > > > > >         > I think you may have some confusion around the use of the storage
> > > > > >         > network. The important part here is to understand this is for
> > > > > >         > *secondary storage* use only – it has nothing to do with primary
> > > > > >         > storage. This means this storage network needs to be accessible to
> > > > > >         > the SSVM and to the hypervisors, and secondary storage NFS pools
> > > > > >         > need to be accessible on this network.
> > > > > >         >
> > > > > >         > The important part – this also means you *can not use the same
> > > > > >         > IP ranges for management and storage networks* - doing so means
> > > > > >         > you will have issues where effectively both hypervisors and SSVM
> > > > > >         > can see the same subnet on two NICs – and you end up in a routing
> > > > > >         > black hole.
> > > > > >         >
> > > > > >         > So – you need to either:
> > > > > >         >
> > > > > >         > 1) Use different IP subnets on management and storage, or
> > > > > >         > 2) preferably just simplify your setup – stop using a secondary
> > > > > >         > storage network altogether and just allow secondary storage to use
> > > > > >         > the management network (which is default). Unless you have a very
> > > > > >         > high I/O environment in production you are just adding complexity
> > > > > >         > by running separate management and storage.
> > > > > >         >
> > > > > >         > Regards,
> > > > > >         > Dag Sonstebo
> > > > > >         > Cloud Architect
> > > > > >         > ShapeBlue
> > > > > >         >
> > > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <jms.123@hotmail.co.uk> wrote:
> > > > > >         >
> > > > > >         >     I will disconnect the host this morning and test but before
> > > > > >         > I do that I ran this command when all hosts are up -
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     select * from cloud.host;
> > > > > >         >     [output condensed to the columns relevant here; same hosts as the table earlier in the thread]
> > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+-------------------+------------+
> > > > > >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | public_ip_address | cluster_id |
> > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+-------------------+------------+
> > > > > >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         | 172.30.4.3        |          1 |
> > > > > >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        | 172.30.4.98       |       NULL |
> > > > > >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        | 172.30.4.86       |       NULL |
> > > > > >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         | 172.30.4.4        |          1 |
> > > > > >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         | 172.30.4.5        |          1 |
> > > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+-------------------+------------+
> > > > > >   | NULL
> > > > > >         >             |          1 | 172.30.4.5        |
> > > > 255.255.255.128
> > > > > |
> > > > > >         > 00:24:e8:73:6a:b3  |       NULL |              1 |
> > 1 |
> > > > > >      1 |
> > > > > >         >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c |
> NULL
> > > > |
> > > > > > KVM
> > > > > >         >      | NULL               | 7510159360 | NULL     |
> > > 4.11.0.0
> > > > |
> > > > > > NULL   |
> > > > > >         >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-
> > > > 37f3eebad8fb-
> > > > > > LibvirtComputingResource
> > > > > >         > |         1 |     0 |           0 | 1492390408 |
> > > > >  146457912294 |
> > > > > >         > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |
> > > > > > 6 |
> > > > > >         > Enabled        | NULL  | NULL        | Disabled     |
> > > > > >         >     +----+-----------------+------
> > > > > ------------------------------
> > > > > >         > --+--------+------------------
> > > --+--------------------+------
> > > > > >         > -----------+------------------
> > > ---+--------------------+-----
> > > > > >         > ------------+-----------------
> > > ----+----------------------+--
> > > > > >         > ---------------------+--------
> > > -----------+------------+-----
> > > > > >         > --------------+---------------
> > > --+--------------------+------
> > > > > >         > ------+----------------+------
> > > --+-------------+------+------
> > > > > >         > -+----------------------------
> > > ---------+---------+----------
> > > > > >         > -------+--------------------+-
> > > -----------+----------+-------
> > > > > >         > ---+--------+------------+----
> > > ----------+-------------------
> > > > > >         > ------------------------------
> > > --------------+-----------+---
> > > > > >         > ----+-------------+-----------
> > > -+----------------+-----------
> > > > > >         > ----------+-------------------
> > > --+---------+--------------+--
> > > > > >         > --------------+-------+-------------+--------------+
> > > > > >         >     5 rows in set (0.00 sec)
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     and you can see that it says the storage IP address
> > is
> > > > the
> > > > > > same as the
> > > > > >         > private IP address (the management network).
> > > > > >         >
> > > > > >         >
> > > > > >         >     I also ran the command you provided using the
> Cluster
> > > ID
> > > > > > number from
> > > > > >         > the table above -
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     mysql> select * from cloud.storage_pool where
> > > cluster_id
> > > > =
> > > > > 1
> > > > > > and
> > > > > >         > removed is not null;
> > > > > >         >     Empty set (0.00 sec)
> > > > > >         >
> > > > > >         >     mysql>
> > > > > >         >
> > > > > >         >     So assuming I am reading this correctly that seems
> to
> > > be
> > > > > the
> > > > > > issue.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I am at a loss as to why though.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I have a separate NIC for storage as described.
> When
> > I
> > > > add
> > > > > > the zone
> > > > > >         > and get to the storage web page I exclude the IPs
> already
> > > > used
> > > > > > for the
> > > > > >         > compute node NICs and the NFS server itself. I do this
> > > > because
> > > > > > initially I
> > > > > >         > didn't and the SSVM started using the IP address of the
> > NFS
> > > > > > server.
> > > > > >         >
> > > > > >         >
> > > > > >         >     So the range is 172.30.5.1 -> 15 and the range I
> fill
> > > in
> > > > is
> > > > > >         > 172.30.5.10 -> 172.30.5.14.
> > > > > >         >
> > > > > >         >
> > > > > >         >     And I used the label "cloudbr2" for storage.
> > > > > >         >
> > > > > >         >
> > > > > >         >     I must be doing this wrong somehow.
> > > > > >         >
> > > > > >         >
> > > > > >         >     Any pointers would be much appreciated.
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     ________________________________
> > > > > >         >     From: Rafael Weingärtner <
> > rafaelweingartner@gmail.com>
> > > > > >         >     Sent: 05 June 2018 16:13
> > > > > >         >     To: users
> > > > > >         >     Subject: Re: advanced networking with public IPs
> > direct
> > > > to
> > > > > > VMs
> > > > > >         >
> > > > > >         >     That is interesting. Let's see the source of all
> > > truth...
> > > > > >         >     This is the code that is generating that odd
> message.
> > > > > >         >
> > > > > >         >     > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > > > >         >     > boolean hasNfs = false;
> > > > > >         >     > for (StoragePoolVO pool : clusterPools) {
> > > > > >         >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > > > >         >     >         hasNfs = true;
> > > > > >         >     >         break;
> > > > > >         >     >     }
> > > > > >         >     > }
> > > > > >         >     > if (!hasNfs) {
> > > > > >         >     >     s_logger.warn("Agent investigation was requested on host " + agent
> > > > > >         >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > > > >         >     >     return Status.Disconnected;
> > > > > >         >     > }
> > > > > >         >     >
> > > > > >         >
> > > > > >         >     There are two possibilities here. You do not have any NFS
> > > > > >         >     storage? Is that the case? Or maybe, for some reason, the call
> > > > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is
> > > > > >         >     not returning any NFS storage pools. Looking at
> > > > > >         >     "listPoolsByCluster" we will see that the following SQL is used:
> > > > > >         >
> > > > > >         >     > Select * from storage_pool where cluster_id = <host'sClusterId> and removed is not null
> > > > > >         >     >
> > > > > >         >
> > > > > >         >     Can you run that SQL to see its return when your hosts are
> > > > > >         >     marked as disconnected?
> > > > > >         >
> > > > > >         >
> > > > > >         > Dag.Sonstebo@shapeblue.com
> > > > > >         > www.shapeblue.com<http://www.shapeblue.com>
> > > > > >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > >         > @shapeblue
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > > > > > jms.123@hotmail.co.uk>
> > > > > >         > wrote:
> > > > > >         >
> > > > > >         >     > I reran the tests with the 3 NIC setup. When I
> > > > configured
> > > > > > the zone
> > > > > >         > through
> > > > > >         >     > the UI I used the labels cloudbr0 for management,
> > > > > cloudbr1
> > > > > > for guest
> > > > > >         >     > traffic and cloudbr2 for NFS as per my original
> > > > response
> > > > > > to you.
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > When I pull the power to the node
> (dcp-cscn2.local)
> > > > after
> > > > > > about 5
> > > > > >         > mins
> > > > > >         >     > the  host status goes to "Alert" but never to
> > "Down"
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > I get this in the logs -
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > 2018-06-05 15:17:14,382 WARN
> > [c.c.h.KVMInvestigator]
> > > > > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93)
> > Agent
> > > > > > investigation
> > > > > >         > was
> > > > > >         >     > requested on host Host[-4-Routing], but host does
> > not
> > > > > > support
> > > > > >         > investigation
> > > > > >         >     > because it has no NFS storage. Skipping
> > > investigation.
> > > > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.
> > > > > > HighAvailabilityManagerImpl]
> > > > > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93)
> > > > > > KVMInvestigator was
> > > > > >         > able to
> > > > > >         >     > determine host 4 is in Disconnected
> > > > > >         >     > 2018-06-05 15:17:14,382 INFO
> > > > [c.c.a.m.AgentManagerImpl]
> > > > > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93)
> The
> > > > agent
> > > > > > from host
> > > > > >         > 4 state
> > > > > >         >     > determined is Disconnected
> > > > > >         >     > 2018-06-05 15:17:14,382 WARN
> > > > [c.c.a.m.AgentManagerImpl]
> > > > > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93)
> > Agent
> > > > is
> > > > > >         > disconnected but
> > > > > >         >     > the host is still up: 4-dcp-cscn2.local
> > > > > >         >     >
> > > > > >         >     > I don't understand why it thinks there is no NFS
> > > > storage
> > > > > > as each
> > > > > >         > compute
> > > > > >         >     > node has a dedicated storage NIC.
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > I also don't understand why it thinks the host is
> > > still
> > > > > up
> > > > > > ie. what
> > > > > >         > test
> > > > > >         >     > is it doing to determine that ?
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > Am I just trying to get something working that is
> > not
> > > > > > supported ?
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > ________________________________
> > > > > >         >     > From: Rafael Weingärtner <
> > > rafaelweingartner@gmail.com>
> > > > > >         >     > Sent: 04 June 2018 15:31
> > > > > >         >     > To: users
> > > > > >         >     > Subject: Re: advanced networking with public IPs
> > > direct
> > > > > to
> > > > > > VMs
> > > > > >         >     >
> > > > > >         >     > What type of failover are you talking about?
> > > > > >         >     > What ACS version are you using?
> > > > > >         >     > What hypervisor are you using?
> > > > > >         >     > How are you configuring your NICs in the
> > hypervisor?
> > > > > >         >     > How are you configuring the traffic labels in
> ACS?
> > > > > >         >     >
> > > > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > > > > > jms.123@hotmail.co.uk
> > > > > >         > >
> > > > > >         >     > wrote:
> > > > > >         >     >
> > > > > >         >     > > Hi all
> > > > > >         >     > >
> > > > > >         >     > >
> > > > > >         >     > > I am close to giving up on basic networking as
> I
> > > just
> > > > > > cannot get
> > > > > >         > failover
> > > > > >         >     > > working with multiple NICs (I am not even sure
> it
> > > is
> > > > > > supported).
> > > > > >         >     > >
> > > > > >         >     > >
> > > > > >         >     > > What I would like is to use 3 NICs for
> > management,
> > > > > > storage and
> > > > > >         > guest
> > > > > >         >     > > traffic. I would like to assign public IPs
> direct
> > > to
> > > > > the
> > > > > > VMs which
> > > > > >         > is
> > > > > >         >     > why I
> > > > > >         >     > > originally chose basic.
> > > > > >         >     > >
> > > > > >         >     > >
> > > > > >         >     > > If I switch to advanced networking do I just
> > > > configure
> > > > > a
> > > > > > guest VM
> > > > > >         > with
> > > > > >         >     > > public IPs on one NIC and not both with the
> > public
> > > > > > traffic -
> > > > > >         >     > >
> > > > > >         >     > >
> > > > > >         >     > > would this work ?
> > > > > >         >     > >
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     >
> > > > > >         >     > --
> > > > > >         >     > Rafael Weingärtner
> > > > > >         >     >
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >         >     --
> > > > > >         >     Rafael Weingärtner
> > > > > >         >
> > > > > >         >
> > > > > >         >
> > > > > >
> > > > > >
> > > > > >         --
> > > > > >         Rafael Weingärtner
> > > > > >
> > > > > >
> > > > > >
> > > > > >     Dag.Sonstebo@shapeblue.com
> > > > > >     www.shapeblue.com<http://www.shapeblue.com>
> > > > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > >     @shapeblue
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Rafael Weingärtner
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
So based on Erik's suggestion (thanks Erik) I rebuilt the management server and set up cluster-wide primary storage, as opposed to the zone-wide storage I have been using so far.


Still using 3 NICs (management/Guest/storage) and basic networking.


And VM HA now works. In addition, it failed over more quickly than it did when I had zone-wide NFS storage on a single NIC.


Still a bit confused about the output below, though: it is still showing the storage_ip_address values as 172.30.3.x IPs, which is the management subnet, but maybe I am reading it incorrectly.



mysql> select * from cloud.host;
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
| id | name            | uuid                                 | status | type               | private_ip_address | private_netmask | private_mac_address | storage_ip_address | storage_netmask | storage_mac_address | storage_ip_address_2 | storage_mac_address_2 | storage_netmask_2 | cluster_id | public_ip_address | public_netmask  | public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets | cpus | speed | url                                 | fs_type | hypervisor_type | hypervisor_version | ram        | resource | version  | parent | total_size | capabilities | guid                                                          | available | setup | dom0_memory | last_ping  | mgmt_server_id | disconnected        | created             | removed | update_count | resource_state | owner | lastUpdated | engine_state |
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
|  1 | dcp-cscn1.local | 372c738c-5370-4b46-9358-14b649c73d6b | Up     | Routing            | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.3        | 255.255.255.128 | 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource |         1 |     0 |           0 | 1492635804 |   146457912294 | 2018-06-08 11:57:31 | 2018-06-08 11:19:36 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
|  2 | s-1-VM          | 8f40f8a9-1e97-4fce-a143-a8c9489a6d5b | Up     | SecondaryStorageVM | 172.30.3.55        | 255.255.255.192 | 1e:00:ab:00:00:24   | 172.30.3.55        | 255.255.255.192 | 1e:00:ab:00:00:24   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.66       | 255.255.255.128 | 1e:00:c4:00:00:67  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | s-1-VM-NfsSecondaryStorageResource                            |         1 |     0 |           0 | 1492635804 |   146457912294 | 2018-06-08 11:57:31 | 2018-06-08 11:21:51 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
|  3 | v-2-VM          | fe5b4f8b-b5b0-4018-8ee2-4c5384da073b | Up     | ConsoleProxy       | 172.30.3.40        | 255.255.255.192 | 1e:00:96:00:00:15   | 172.30.3.40        | 255.255.255.192 | 1e:00:96:00:00:15   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.62       | 255.255.255.128 | 1e:00:01:00:00:63  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | Proxy.2-ConsoleProxyResource                                  |         1 |     0 |           0 | 1492635804 |   146457912294 | 2018-06-08 11:57:31 | 2018-06-08 11:22:03 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
|  4 | dcp-cscn2.local | 935260eb-a80c-4ead-85d7-3df8212e301b | Down   | Routing            | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.4        | 255.255.255.128 | 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource |         1 |     0 |           0 | 1492635804 |           NULL | 2018-06-08 11:57:31 | 2018-06-08 11:35:07 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
|  5 | dcp-cscn3.local | f3cabc9e-9679-4d7e-8297-b6765eea2770 | Up     | Routing            | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.5        | 255.255.255.128 | 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource |         1 |     0 |           0 | 1492635804 |   146457912294 | 2018-06-08 11:57:31 | 2018-06-08 11:36:27 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
5 rows in set (0.00 sec)

mysql>
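
A narrower query makes the comparison easier to read; this is just the same host table restricted to the columns in question (all of the column names are as in the full dump above):

mysql> select id, name, type, private_ip_address, storage_ip_address, public_ip_address from cloud.host where removed is null;

For the Routing hosts that should return 172.30.3.x in both private_ip_address and storage_ip_address, which is exactly what the full output shows.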

So some sort of bug, maybe?
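
For what it's worth, the scope of the new pool can be checked the same way; assuming the rebuild really did register it as cluster scoped, something like

mysql> select id, name, pool_type, scope, cluster_id from cloud.storage_pool where removed is null;

should now show scope CLUSTER with a non-NULL cluster_id, where the old zone-wide pool showed ZONE and cluster_id NULL.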

________________________________
From: Erik Weber <te...@gmail.com>
Sent: 08 June 2018 10:15
To: users@cloudstack.apache.org
Subject: Re: advanced networking with public IPs direct to VMs

While someone ponders the zone-wide storage question, you could try adding a
cluster-wide NFS storage pool and see if the rest works in that setup.
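
If it helps, that can be done from the UI (Infrastructure -> Primary Storage -> Add Primary Storage, with the scope set to Cluster) or via the API. A rough CloudMonkey sketch, untested, with every id, the pool name and the NFS url as placeholders to substitute with your own values:

create storagepool zoneid=<zone-id> podid=<pod-id> clusterid=<cluster-id> scope=cluster name=<pool-name> url=nfs://<nfs-server>/<export-path>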

Erik

On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk> wrote:

> Yes, all basic. I read a Shapeblue doc that recommended splitting traffic
> across multiple NICs even in basic networking mode so that is what I am
> trying to do.
>
>
> With single NIC you do not get the NFS storage message.
>
>
> I have the entire management server logs for both scenarios after I pulled
> the power to one of the compute nodes but from the single NIC setup these
> seem to be the relevant lines -
>
>
> 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4
> with status Down
> 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state
> = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 |
> dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down,
> name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator]
> (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the
> agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host
> with specified id is not in the right state: Down
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5
> returned status:Down for the investigated host:4
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible
> legacy state Down for host 4
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to
> determine host 4 is in Down
> 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state
> determined is Down
> 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down:
> 4-dcp-cscn2.local. Starting HA on the VMs
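>
> (In case it is useful, I am pulling these lines out of the management log
> with something like: grep -E "KVMInvestigator|HighAvailabilityManagerImpl"
> /var/log/cloudstack/management/management-server.log (that is the default
> log location on my install).)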
>
> At the moment I only need to assign public IPs direct to VMs rather than
> using NAT with the virtual router but would be happy to go with advanced
> networking if it would make things easier :)
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 07 June 2018 10:35
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Ah, so it is not an advanced setup, even when you use multiple NICs.
> Can you confirm that the message "Agent investigation was requested on
> host, but host does not support investigation because it has no NFS
> storage. Skipping investigation." does not appear when you use a single
> NIC? Can you check other log entries that might appear when the host is
> marked as "down"?
>
> On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > It is all basic networking at the moment for all the setups.
> >
> >
> > If you want me to I can set up a single NIC solution again and run any
> > commands you need me to do.
> >
> >
> > FYI when I set up a single NIC I use the guided installation option in the
> > UI rather than the manual setup which I do for the multiple NIC scenario.
> >
> >
> > Happy to set it up if it helps.
> >
> >
> >
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 10:23
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > Ok, so that explains the log message. This is looking like a bug to me.
> > It seems that with zone-wide storage the host state (when disconnected)
> > is not being properly identified due to this NFS check, and as a
> > consequence it has a side effect on VM HA.
> >
> > We would need some inputs from guys that have advanced networking
> > deployments and Zone wide storage.
> >
> > I do not see how the all-in-one-NIC deployment scenario is working though.
> > This method, "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)", is dead
> > simple: if there are no NFS storage pools found for a host's cluster, KVM
> > hosts will be detected as "disconnected" and not down, with that warning
> > message you noticed.
> >
> > When you say "all in one NIC", is it an advanced network deployment where
> > you put all traffic in a single network, or is it basic networking that
> > you are doing?
> >
> > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > zone wide.
> > >
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:04
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > What type of storage are you using? Zone wide? Or cluster "wide"
> storage?
> > >
> > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > Rafael
> > > >
> > > >
> > > > Here is the output as requested -
> > > >
> > > >
> > > >
> > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > (key columns from the output)
> > > > +----+------+-------------------+----------------+------------+--------------+-----------------+--------+-------+------------+
> > > > | id | name | pool_type         | data_center_id | cluster_id | host_address | path            | status | scope | hypervisor |
> > > > +----+------+-------------------+----------------+------------+--------------+-----------------+--------+-------+------------+
> > > > |  1 | ds1  | NetworkFilesystem |              1 |       NULL | 172.30.5.2   | /export/primary | Up     | ZONE  | KVM        |
> > > > +----+------+-------------------+----------------+------------+--------------+-----------------+--------+-------+------------+
> > > > 1 row in set (0.00 sec)
> > > >
> > > > mysql>
> > > >
> > > > Do you think this problem is related to my NIC/bridge configuration or the
> > > > way I am configuring the zone?
> > > >
> > > > Jon
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 06:45
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > Can you also post the result of:
> > > > select * from cloud.storage_pool where removed is null
> > > >
> > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> > Dag.Sonstebo@shapeblue.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Jon,
> > > > >
> > > > > Still confused where your primary storage pools are – are you sure
> > your
> > > > > hosts are in cluster 1?
> > > > >
> > > > > Quick question just to make sure - assuming management/storage is
> on
> > > the
> > > > > same NIC when I setup basic networking the physical network has the
> > > > > management and guest icons already there and I just edit the KVM
> > > labels.
> > > > If
> > > > > I am running storage over management do I need to drag the storage
> > icon
> > > > to
> > > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > > management or does CS automatically just use the management NIC
> ie. I
> > > > would
> > > > > only need to drag the storage icon across in basic setup if I
> wanted
> > it
> > > > on
> > > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > > >
> > > > > >> I would do both – set up your 2/3 physical networks, name isn’t
> > that
> > > > > important – but then drag the traffic types to the correct one and
> > make
> > > > > sure the labels are correct.
> > > > > Regards,
> > > > > Dag Sonstebo
> > > > > Cloud Architect
> > > > > ShapeBlue
> > > > >
> > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk>
> wrote:
> > > > >
> > > > >     Dag
> > > > >
> > > > >
> > > > >     Do you mean  check the pools with "Infrastructure -> Primary
> > > Storage"
> > > > > and "Infrastructure -> Secondary Storage" within the UI ?
> > > > >
> > > > >
> > > > >     If so Primary Storage has a state of UP, secondary storage does
> > not
> > > > > show a state as such so not sure where else to check it ?
> > > > >
> > > > >
> > > > >     Rerun of the command -
> > > > >
> > > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > > >     Empty set (0.00 sec)
> > > > >
> > > > >     mysql>
> > > > >
> > > > >     I think it is something to do with my zone creation rather than
> > the
> > > > > NIC, bridge setup although I can post those if needed.
> > > > >
> > > > >     I may try to setup just the 2 NIC solution you mentioned
> although
> > > as
> > > > I
> > > > > say I had the same issue with that ie. host goes to "Alert" state
> > and
> > > > same
> > > > > error messages.  The only time I can get it to go to "Down" state
> is
> > > when
> > > > > it is all on the single NIC.
> > > > >
> > > > >     Quick question just to make sure - assuming management/storage
> is
> > > on
> > > > > the same NIC when I setup basic networking the physical network has
> > the
> > > > > management and guest icons already there and I just edit the KVM
> > > labels.
> > > > If
> > > > > I am running storage over management do I need to drag the storage
> > icon
> > > > to
> > > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > > management or does CS automatically just use the management NIC
> ie. I
> > > > would
> > > > > only need to drag the storage icon across in basic setup if I
> wanted
> > it
> > > > on
> > > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > > >
> > > > >     On the plus side I have been at this for so long now and done
> so
> > > many
> > > > > rebuilds I could do it in my sleep now 😊
> > > > >
> > > > >
> > > > >     ________________________________
> > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > >     Sent: 06 June 2018 12:28
> > > > >     To: users@cloudstack.apache.org
> > > > >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > >     Looks OK to me Jon.
> > > > >
> > > > >     The one thing that throws me is your storage pools – can you
> > rerun
> > > > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > > > >
> > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > >
> > > > >     Regards,
> > > > >     Dag Sonstebo
> > > > >     Cloud Architect
> > > > >     ShapeBlue
> > > > >
> > > > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk>
> > > wrote:
> > > > >
> > > > >         Don't know whether this helps or not but I logged into the
> > SSVM
> > > > > and ran an ifconfig -
> > > > >
> > > > >
> > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast
> > > > > 169.254.255.255
> > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000
> (Ethernet)
> > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.3.34  netmask 255.255.255.192
> broadcast
> > > > > 172.30.3.63
> > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000
> (Ethernet)
> > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.4.86  netmask 255.255.255.128
> broadcast
> > > > > 172.30.4.127
> > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000
> (Ethernet)
> > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.5.14  netmask 255.255.255.240
> broadcast
> > > > > 172.30.5.15
> > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000
> (Ethernet)
> > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >
> > > > >         so it has interfaces in both the management and the storage
> > > > > subnets (as well as guest).
> > > > >
> > > > >
> > > > >
> > > > >         ________________________________
> > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > >         Sent: 06 June 2018 11:08
> > > > >         To: users@cloudstack.apache.org
> > > > >         Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > >
> > > > >         Hi Rafael
> > > > >
> > > > >
> > > > >         Thanks for the help, really appreciate it.
> > > > >
> > > > >
> > > > >         So rerunning that command with all servers up -
> > > > >
> > > > >
> > > > >
> > > > >         mysql> select * from cloud.storage_pool where cluster_id =
> 1
> > > and
> > > > > removed is null;
> > > > >         Empty set (0.00 sec)
> > > > >
> > > > >         mysql>
> > > > >
> > > > >
> > > > >         As for the storage IP, no, I'm not setting it to be the management
> > > > > IP when I set up the zone, but the output of the SQL command suggests that
> > > > > is what has happened.
> > > > >
> > > > >         As I said to Dag I am using a different subnet for storage
> > ie.
> > > > >
> > > > >         172.30.3.0/26  - management subnet
> > > > >         172.30.4.0/25 -  guest VM subnet
> > > > >         172.30.5.0/28 - storage
> > > > >
> > > > >         the NFS server IP is 172.30.5.2
> > > > >
> > > > >         each compute node has 3 NICs with an IP from each subnet (I am
> > > > > assuming the management node only needs an IP in the management network?)
> > > > >
> > > > >         When I add the zone in the UI I have one physical network
> > with
> > > > > management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > > >         When I fill in the storage traffic page I use the range
> > > > > 172.30.5.10 - 14 as free IPs, as I exclude the ones already allocated to
> > > > > the compute nodes and the NFS server.
> > > > >
> > > > >         I think maybe I am doing something wrong in the UI setup
> but
> > it
> > > > is
> > > > > not obvious to me what it is.
> > > > >
> > > > >         What I might try today unless you want me to keep the
> setup I
> > > > have
> > > > > for more outputs is to go back to 2 NICs, one for
> storage/management
> > > and
> > > > > one for guest VMs.
> > > > >
> > > > >         I think with the 2 NICs setup the mistake I made last time
> > when
> > > > > adding the zone was to assume storage would just run over
> management
> > > so I
> > > > > did not drag and drop the storage icon and assign it to cloudbr0 as
> > > with
> > > > > the management which I think is what I should do ?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >         ________________________________
> > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > >         Sent: 06 June 2018 10:54
> > > > >         To: users
> > > > >         Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > >
> > > > >         Jon, do not panic, we are here to help you :)
> > > > >         So, I might have mistyped the SQL query. When you use "select *
> > > > > from cloud.storage_pool where cluster_id = 1 and removed is not null",
> > > > > you are listing the removed storage pools. Therefore, the right query
> > > > > would be "select * from cloud.storage_pool where cluster_id = 1 and
> > > > > removed is null".
> > > > >
> > > > >         There is also something else I do not understand. You are setting
> > > > > the storage IP in the management subnet? I am not sure if you should be
> > > > > doing it like this. Normally, I set all my storages (primary [when working
> > > > > with NFS] and secondary) to IPs in the storage subnet.
> > > > >
> > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > > Dag.Sonstebo@shapeblue.com>
> > > > >         wrote:
> > > > >
> > > > >         > Hi Jon,
> > > > >         >
> > > > >         > I’m late to this thread and have possibly missed some
> > things
> > > –
> > > > > but a
> > > > >         > couple of observations:
> > > > >         >
> > > > >         > “When I add the zone and get to the storage web page I
> > > exclude
> > > > > the IPs
> > > > >         > already used for the compute node NICs and the NFS server
> > > > > itself. …..”
> > > > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in
> > is
> > > > > 172.30.5.10
> > > > >         > -> 172.30.5.14.”
> > > > >         >
> > > > >         > I think you may have some confusion around the use of the
> > > > > storage network.
> > > > >         > The important part here is to understand this is for
> > > *secondary
> > > > > storage*
> > > > >         > use only – it has nothing to do with primary storage.
> This
> > > > means
> > > > > this
> > > > >         > storage network needs to be accessible to the SSVM, to
> the
> > > > > hypervisors, and
> > > > >         > secondary storage NFS pools need to be accessible on
> this
> > > > > network.
> > > > >         >
> > > > >         > The important part – this also means you *can not use the
> > > same
> > > > > IP ranges
> > > > >         > for management and storage networks* - doing so means you
> > > will
> > > > > have issues
> > > > >         > where effectively both hypervisors and SSVM can see the
> > same
> > > > > subnet on two
> > > > >         > NICs – and you end up in a routing black hole.
> > > > >         >
> > > > >         > So – you need to either:
> > > > >         >
> > > > >         > 1) Use different IP subnets on management and storage, or
> > > > >         > 2) preferably just simplify your setup – stop using a
> > > secondary
> > > > > storage
> > > > >         > network altogether and just allow secondary storage to
> use
> > > the
> > > > > management
> > > > >         > network (which is default). Unless you have a very high
> I/O
> > > > > environment in
> > > > >         > production you are just adding complexity by running
> > separate
> > > > > management
> > > > >         > and storage.
> > > > >         >
> > > > >         > Regards,
> > > > >         > Dag Sonstebo
> > > > >         > Cloud Architect
> > > > >         > ShapeBlue
> > > > >         >
> > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> > jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > >         >
> > > > >         >     I will disconnect the host this morning and test but
> > > before
> > > > > I do that
> > > > >         > I ran this command when all hosts are up -
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >      select * from cloud.host;
> > > > >         >     (key columns from the output)
> > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > > >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id | public_ip_address |
> > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > > >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 | 172.30.4.3        |
> > > > >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        |       NULL | 172.30.4.98       |
> > > > >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        |       NULL | 172.30.4.86       |
> > > > >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         |          1 | 172.30.4.4        |
> > > > >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 | 172.30.4.5        |
> > > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > > >         > ----------+-------------------
> > --+---------+--------------+--
> > > > >         > --------------+-------+-------------+--------------+
> > > > >         >     5 rows in set (0.00 sec)
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >     and you can see that it says the storage IP address is the same
> > > > >         >     as the private IP address (the management network).
> > > > >         >
> > > > >         >     I also ran the command you provided using the Cluster ID number
> > > > >         >     from the table above -
> > > > >         >
> > > > >         >     mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
> > > > >         >     Empty set (0.00 sec)
> > > > >         >
> > > > >         >     mysql>
> > > > >         >
> > > > >         >     So assuming I am reading this correctly that seems to be the issue.
> > > > >         >
> > > > >         >     I am at a loss as to why though.
> > > > >         >
> > > > >         >     I have a separate NIC for storage as described. When I add the zone
> > > > >         >     and get to the storage web page I exclude the IPs already used for
> > > > >         >     the compute node NICs and the NFS server itself. I do this because
> > > > >         >     initially I didn't, and the SSVM started using the IP address of the
> > > > >         >     NFS server.
> > > > >         >
> > > > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > >         >     172.30.5.10 -> 172.30.5.14.
> > > > >         >
> > > > >         >     And I used the label "cloudbr2" for storage.
> > > > >         >
> > > > >         >     I must be doing this wrong somehow.
> > > > >         >
> > > > >         >     Any pointers would be much appreciated.
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >     ________________________________
> > > > >         >     From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > > > >         >     Sent: 05 June 2018 16:13
> > > > >         >     To: users
> > > > >         >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > >         >
> > > > >         >     That is interesting. Let's see the source of all truth...
> > > > >         >     This is the code that is generating that odd message.
> > > > >         >
> > > > >         >     >     List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > > >         >     >     boolean hasNfs = false;
> > > > >         >     >     for (StoragePoolVO pool : clusterPools) {
> > > > >         >     >         if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > > >         >     >             hasNfs = true;
> > > > >         >     >             break;
> > > > >         >     >         }
> > > > >         >     >     }
> > > > >         >     >     if (!hasNfs) {
> > > > >         >     >         s_logger.warn("Agent investigation was requested on host " + agent
> > > > >         >     >                 + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > > >         >     >         return Status.Disconnected;
> > > > >         >     >     }
> > > > >         >
> > > > >         >     There are two possibilities here. You do not have any NFS storage?
> > > > >         >     Is that the case? Or maybe, for some reason, the call
> > > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> > > > >         >     returning any NFS storage pools. Looking at "listPoolsByCluster"
> > > > >         >     we will see that the following SQL is used:
> > > > >         >
> > > > >         >     > Select * from storage_pool where cluster_id = <host'sClusterId> and removed is not null
> > > > >         >
> > > > >         >     Can you run that SQL to see its return when your hosts are marked
> > > > >         >     as disconnected?
> > > > >         >
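A side note on that query: as Rafael corrects later in the thread, the DAO's actual filter is "removed is null"; the version above with "removed is not null" would list pools that have already been deleted. A quick sketch of the difference, assuming the cluster id 1 seen in the host table above:

    select count(*) from cloud.storage_pool
    where cluster_id = 1 and removed is null;      -- pools still in service

    select count(*) from cloud.storage_pool
    where cluster_id = 1 and removed is not null;  -- pools already removed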
> > > > >         >
> > > > >         > Dag.Sonstebo@shapeblue.com
> > > > >         > www.shapeblue.com<http://www.shapeblue.com>
> > > > >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > >         > @shapeblue
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jms.123@hotmail.co.uk> wrote:
> > > > >         >
> > > > >         >     > I reran the tests with the 3 NIC setup. When I configured the
> > > > >         >     > zone through the UI I used the labels cloudbr0 for management,
> > > > >         >     > cloudbr1 for guest traffic and cloudbr2 for NFS as per my
> > > > >         >     > original response to you.
> > > > >         >     >
> > > > >         >     > When I pull the power to the node (dcp-cscn2.local) after about
> > > > >         >     > 5 mins the host status goes to "Alert" but never to "Down"
> > > > >         >     >
> > > > >         >     >
> > > > >         >     > I get this in the logs -
> > > > >         >     >
> > > > >         >     >
> > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > > > >         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> > > > >         >     >
> > > > >         >     > I don't understand why it thinks there is no NFS storage as each
> > > > >         >     > compute node has a dedicated storage NIC.
> > > > >         >     >
> > > > >         >     > I also don't understand why it thinks the host is still up, ie.
> > > > >         >     > what test is it doing to determine that ?
> > > > >         >     >
> > > > >         >     > Am I just trying to get something working that is not supported ?
> > > > >         >     >
> > > > >         >     >
> > > > >         >     > ________________________________
> > > > >         >     > From: Rafael Weingärtner <rafaelweingartner@gmail.com>
> > > > >         >     > Sent: 04 June 2018 15:31
> > > > >         >     > To: users
> > > > >         >     > Subject: Re: advanced networking with public IPs direct to VMs
> > > > >         >     >
> > > > >         >     > What type of failover are you talking about?
> > > > >         >     > What ACS version are you using?
> > > > >         >     > What hypervisor are you using?
> > > > >         >     > How are you configuring your NICs in the hypervisor?
> > > > >         >     > How are you configuring the traffic labels in ACS?
> > > > >         >     >
> > > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk> wrote:
> > > > >         >     >
> > > > >         >     > > Hi all
> > > > >         >     > >
> > > > >         >     > >
> > > > >         >     > > I am close to giving up on basic networking as I
> > just
> > > > > cannot get
> > > > >         > failover
> > > > >         >     > > working with multiple NICs (I am not even sure it
> > is
> > > > > supported).
> > > > >         >     > >
> > > > >         >     > >
> > > > >         >     > > What I would like is to use 3 NICs for
> management,
> > > > > storage and
> > > > >         > guest
> > > > >         >     > > traffic. I would like to assign public IPs direct
> > to
> > > > the
> > > > > VMs which
> > > > >         > is
> > > > >         >     > why I
> > > > >         >     > > originally chose basic.
> > > > >         >     > >
> > > > >         >     > >
> > > > >         >     > > If I switch to advanced networking do I just
> > > configure
> > > > a
> > > > > guest VM
> > > > >         > with
> > > > >         >     > > public IPs on one NIC and not both with the
> public
> > > > > traffic -
> > > > >         >     > >
> > > > >         >     > >
> > > > >         >     > > would this work ?
> > > > >         >     > >
> > > > >         >     >
> > > > >         >     >
> > > > >         >     >
> > > > >         >     > --
> > > > >         >     > Rafael Weingärtner
> > > > >         >     >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >     --
> > > > >         >     Rafael Weingärtner
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >
> > > > >
> > > > >         --
> > > > >         Rafael Weingärtner
> > > > >
> > > > >
> > > > >
> > > > >     Dag.Sonstebo@shapeblue.com
> > > > >     www.shapeblue.com<http://www.shapeblue.com>
> > > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > >     @shapeblue
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> Rafael Weingärtner
>

Re: advanced networking with public IPs direct to VMs

Posted by Erik Weber <te...@gmail.com>.
While someone ponders the zone wide storage issue, you could try adding a
cluster wide NFS storage pool and see if the rest works in that setup.

Erik
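A quick sketch of what that changes, assuming the schema and the cluster id 1 quoted elsewhere in this thread: KVMInvestigator's hasNfs check looks for NFS pools scoped to the host's cluster, so once a cluster wide pool exists the per-cluster lookup should no longer come back empty:

    select id, name, pool_type, scope, cluster_id
    from cloud.storage_pool
    where cluster_id = 1 and removed is null;
    -- a NetworkFilesystem row here is what lets KVMInvestigator
    -- mark a dead host as Down instead of just Disconnected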

On Thu, Jun 7, 2018 at 11:49 AM Jon Marshall <jm...@hotmail.co.uk> wrote:

> Yes, all basic. I read a Shapeblue doc that recommended splitting traffic
> across multiple NICs even in basic networking mode so that is what I am
> trying to do.
>
>
> With single NIC you do not get the NFS storage message.
>
>
> I have the entire management server logs for both scenarios after I pulled
> the power to one of the compute nodes but from the single NIC setup these
> seem to be the relevant lines -
>
>
> 2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4
> with status Down
> 2018-06-04 10:17:10,972 DEBUG [c.c.h.Status]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state
> = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
> 2018-06-04 10:17:10,981 WARN  [o.a.c.alerts]
> (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 |
> dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down,
> name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
> 2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator]
> (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the
> agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host
> with specified id is not in the right state: Down
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5
> returned status:Down for the investigated host:4
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible
> legacy state Down for host 4
> 2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to
> determine host 4 is in Down
> 2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state
> determined is Down
> 2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down:
> 4-dcp-cscn2.local. Starting HA on the VMs
>
> At the moment I only need to assign public IPs direct to VMs rather than
> using NAT with the virtual router but would be happy to go with advanced
> networking if it would make things easier :)
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 07 June 2018 10:35
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Ah so, it is not an advanced setup; even when you use multiple NICs.
> Can you confirm that the message "Agent investigation was requested on
> host, but host does not support investigation because it has no NFS
> storage. Skipping investigation." does not appear when you use a single
> NIC? Can you check other log entries that might appear when the host is
> marked as "down"?
>
> On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > It is all basic networking at the moment for all the setups.
> >
> >
> > If you want me to I can setup a single NIC solution again and run any
> > commands you need me to do.
> >
> >
> > FYI when I setup single NIC I use the guided installation option in the UI
> > rather than the manual setup which I do for the multiple NIC scenario.
> >
> >
> > Happy to set it up if it helps.
> >
> >
> >
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 10:23
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > Ok, so that explains the log message. This is looking like a bug to me. It
> > seems that with zone wide storage the host state (when disconnected) is not
> > being properly identified due to this NFS check, and as a consequence it has
> > a side effect on VM HA.
> >
> > We would need some input from people that have advanced networking
> > deployments and zone wide storage.
> >
> > I do not see how the all in one NIC deployment scenario is working though.
> > This method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
> > simple: if there are no NFS storage pools found for a host's cluster, KVM
> > hosts will be detected as "disconnected" and not down, with that warning
> > message you noticed.
> >
> > When you say "all in one NIC", is it an advanced network deployment where
> > you put all traffic in a single network, or is it basic networking that
> > you are doing?
> >
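To make the mismatch concrete, a hedged illustration using the ids quoted in this thread: the zone wide pool carries cluster_id NULL, so the per-cluster query the investigator relies on cannot see it:

    -- the NFS pool exists, but is zone scoped (cluster_id is NULL):
    select id, name, pool_type, scope, cluster_id
    from cloud.storage_pool where removed is null;

    -- the lookup behind isAgentAlive filters by cluster, so it is empty:
    select id, name, pool_type, scope, cluster_id
    from cloud.storage_pool where cluster_id = 1 and removed is null;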
> > On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > zone wide.
> > >
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 10:04
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > What type of storage are you using? Zone wide? Or cluster "wide"
> storage?
> > >
> > > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > Rafael
> > > >
> > > >
> > > > Here is the output as requested -
> > > >
> > > >
> > > >
> > > > mysql> select * from cloud.storage_pool where removed is null;
> > > > | id | name | uuid                                 | pool_type         | port | data_center_id | pod_id | cluster_id | used_bytes | capacity_bytes | host_address | path            | created             | status | storage_provider_name | scope | hypervisor | managed | capacity_iops |
> > > > |  1 | ds1  | a234224f-05fb-3f4c-9b0f-c51ebdf9a601 | NetworkFilesystem | 2049 |              1 |   NULL |       NULL | 6059720704 |    79133933568 | 172.30.5.2   | /export/primary | 2018-06-05 13:45:01 | Up     | DefaultPrimary        | ZONE  | KVM        |       0 |          NULL |
> > > > 1 row in set (0.00 sec)
> > > >
> > > > mysql>
> > > >
> > > > Do you think this problem is related to my NIC/bridge configuration
> or
> > > the
> > > > way I am configuring the zone ?
> > > >
> > > > Jon
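A quick cross-check for the cluster membership question Dag raises elsewhere in the thread, a sketch assuming the same cloud database: the hosts' cluster_id should match the cluster the pool lookup filters on:

    select id, name, type, cluster_id, pod_id, data_center_id
    from cloud.host
    where type = 'Routing' and removed is null;
    -- per the host table earlier in the thread, the three KVM nodes
    -- report cluster_id = 1, while the zone wide pool has cluster_id NULL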
> > > > ________________________________
> > > > From: Rafael Weingärtner <ra...@gmail.com>
> > > > Sent: 07 June 2018 06:45
> > > > To: users
> > > > Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > > Can you also post the result of:
> > > > select * from cloud.storage_pool where removed is null
> > > >
> > > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> > Dag.Sonstebo@shapeblue.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Jon,
> > > > >
> > > > > Still confused where your primary storage pools are – are you sure
> > your
> > > > > hosts are in cluster 1?
> > > > >
> > > > > Quick question just to make sure - assuming management/storage is on the
> > > > > same NIC: when I setup basic networking the physical network has the
> > > > > management and guest icons already there and I just edit the KVM labels.
> > > > > If I am running storage over management do I need to drag the storage
> > > > > icon to the physical network and use the same KVM label (cloudbr0) as the
> > > > > management, or does CS automatically just use the management NIC, ie. I
> > > > > would only need to drag the storage icon across in basic setup if I
> > > > > wanted it on a different NIC/IP subnet ?  (hope that makes sense !)
> > > > >
> > > > > >> I would do both – set up your 2/3 physical networks, name isn’t
> > that
> > > > > important – but then drag the traffic types to the correct one and
> > make
> > > > > sure the labels are correct.
> > > > > Regards,
> > > > > Dag Sonstebo
> > > > > Cloud Architect
> > > > > ShapeBlue
> > > > >
> > > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk>
> wrote:
> > > > >
> > > > >     Dag
> > > > >
> > > > >
> > > > >     Do you mean check the pools with "Infrastructure -> Primary Storage"
> > > > >     and "Infrastructure -> Secondary Storage" within the UI ?
> > > > >
> > > > >     If so Primary Storage has a state of UP; secondary storage does not
> > > > >     show a state as such, so not sure where else to check it ?
> > > > >
> > > > >
> > > > >     Rerun of the command -
> > > > >
> > > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > > >     Empty set (0.00 sec)
> > > > >
> > > > >     mysql>
> > > > >
> > > > >     I think it is something to do with my zone creation rather than the
> > > > >     NIC/bridge setup, although I can post those if needed.
> > > > >
> > > > >     I may try to set up just the 2 NIC solution you mentioned, although as
> > > > >     I say I had the same issue with that, ie. the host goes to "Alert" state
> > > > >     with the same error messages. The only time I can get it to go to "Down"
> > > > >     state is when it is all on the single NIC.
> > > > >
> > > > >     Quick question just to make sure - assuming management/storage is on
> > > > >     the same NIC: when I setup basic networking the physical network has the
> > > > >     management and guest icons already there and I just edit the KVM labels.
> > > > >     If I am running storage over management do I need to drag the storage
> > > > >     icon to the physical network and use the same KVM label (cloudbr0) as
> > > > >     the management, or does CS automatically just use the management NIC,
> > > > >     ie. I would only need to drag the storage icon across in basic setup if
> > > > >     I wanted it on a different NIC/IP subnet ?  (hope that makes sense !)
> > > > >
> > > > >     On the plus side I have been at this for so long now and done so many
> > > > >     rebuilds I could do it in my sleep now 😊
> > > > >
> > > > >
> > > > >     ________________________________
> > > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > > >     Sent: 06 June 2018 12:28
> > > > >     To: users@cloudstack.apache.org
> > > > >     Subject: Re: advanced networking with public IPs direct to VMs
> > > > >
> > > > >     Looks OK to me Jon.
> > > > >
> > > > >     The one thing that throws me is your storage pools – can you
> > rerun
> > > > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > > > >
> > > > >     Do the pools show up as online in the CloudStack GUI?
> > > > >
> > > > >     Regards,
> > > > >     Dag Sonstebo
> > > > >     Cloud Architect
> > > > >     ShapeBlue
> > > > >
> > > > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk>
> > > wrote:
> > > > >
> > > > >         Don't know whether this helps or not but I logged into the
> > SSVM
> > > > > and ran an ifconfig -
> > > > >
> > > > >
> > > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast
> > > > > 169.254.255.255
> > > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000
> (Ethernet)
> > > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.3.34  netmask 255.255.255.192
> broadcast
> > > > > 172.30.3.63
> > > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000
> (Ethernet)
> > > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.4.86  netmask 255.255.255.128
> broadcast
> > > > > 172.30.4.127
> > > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000
> (Ethernet)
> > > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > > >                 inet 172.30.5.14  netmask 255.255.255.240
> broadcast
> > > > > 172.30.5.15
> > > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000
> (Ethernet)
> > > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > > >                 loop  txqueuelen 1  (Local Loopback)
> > > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > > collisions
> > > > 0
> > > > >
> > > > >
> > > > >         so it has interfaces in both the management and the storage
> > > > > subnets (as well as guest).
> > > > >
> > > > >
> > > > >
> > > > >         ________________________________
> > > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > > >         Sent: 06 June 2018 11:08
> > > > >         To: users@cloudstack.apache.org
> > > > >         Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > >
> > > > >         Hi Rafael
> > > > >
> > > > >
> > > > >         Thanks for the help, really appreciate it.
> > > > >
> > > > >
> > > > >         So rerunning that command with all servers up -
> > > > >
> > > > >
> > > > >
> > > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > > >         Empty set (0.00 sec)
> > > > >
> > > > >         mysql>
> > > > >
> > > > >
> > > > >         As for the storage IP - no, I'm not setting it to be the management
> > > > >         IP when I setup the zone, but the output of the SQL command suggests
> > > > >         that is what has happened.
> > > > >
> > > > >         As I said to Dag I am using a different subnet for storage
> > ie.
> > > > >
> > > > >         172.30.3.0/26  - management subnet
> > > > >         172.30.4.0/25 -  guest VM subnet
> > > > >         172.30.5.0/28 - storage
> > > > >
> > > > >         the NFS server IP is 172.30.5.2
> > > > >
> > > > >         each compute node has 3 NICs with an IP from each subnet (I am
> > > > >         assuming the management node only needs an IP in the management
> > > > >         network ?)
> > > > >
> > > > >         When I add the zone in the UI I have one physical network with
> > > > >         management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > > >         When I fill in the storage traffic page I use the range
> > > > >         172.30.5.10 - 14 as free IPs, as I exclude the ones already
> > > > >         allocated to the compute nodes and the NFS server.
> > > > >
> > > > >         I think maybe I am doing something wrong in the UI setup but it is
> > > > >         not obvious to me what it is.
> > > > >
> > > > >         What I might try today, unless you want me to keep the setup I have
> > > > >         for more outputs, is to go back to 2 NICs, one for storage/management
> > > > >         and one for guest VMs.
> > > > >
> > > > >         I think with the 2 NICs setup the mistake I made last time when
> > > > >         adding the zone was to assume storage would just run over management,
> > > > >         so I did not drag and drop the storage icon and assign it to cloudbr0
> > > > >         as with the management, which I think is what I should do ?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >         ________________________________
> > > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > > >         Sent: 06 June 2018 10:54
> > > > >         To: users
> > > > >         Subject: Re: advanced networking with public IPs direct to
> > VMs
> > > > >
> > > > >         Jon, do not panic, we are here to help you :)
> > > > >         So, I might have mistyped the SQL query. If you use "select * from
> > > > >         cloud.storage_pool where cluster_id = 1 and removed is not null", you
> > > > >         are listing the storage pools removed. Therefore, the right query
> > > > >         would be "select * from cloud.storage_pool where cluster_id = 1 and
> > > > >         removed is null".
> > > > >
> > > > >         There is also something else I do not understand. You are setting the
> > > > >         storage IP in the management subnet? I am not sure if you should be
> > > > >         doing it like this. Normally, I set all my storages (primary [when
> > > > >         working with NFS] and secondary) to IPs in the storage subnet.
> > > > >
> > > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > > Dag.Sonstebo@shapeblue.com>
> > > > >         wrote:
> > > > >
> > > > >         > Hi John,
> > > > >         >
> > > > >         > I’m late to this thread and have possibly missed some things –
> > > > >         > but a couple of observations:
> > > > >         >
> > > > >         > “When I add the zone and get to the storage web page I exclude the
> > > > >         > IPs already used for the compute node NICs and the NFS server
> > > > >         > itself. …..”
> > > > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > > >         > 172.30.5.10 -> 172.30.5.14.”
> > > > >         >
> > > > >         > I think you may have some confusion around the use of the storage
> > > > >         > network. The important part here is to understand this is for
> > > > >         > *secondary storage* use only – it has nothing to do with primary
> > > > >         > storage. This means this storage network needs to be accessible to
> > > > >         > the SSVM and to the hypervisors, and secondary storage NFS pools
> > > > >         > need to be accessible on this network.
> > > > >         >
> > > > >         > The important part – this also means you *can not use the same IP
> > > > >         > ranges for management and storage networks* - doing so means you
> > > > >         > will have issues where effectively both hypervisors and SSVM can
> > > > >         > see the same subnet on two NICs – and you end up in a routing
> > > > >         > black hole.
> > > > >         >
> > > > >         > So – you need to either:
> > > > >         >
> > > > >         > 1) Use different IP subnets on management and storage, or
> > > > >         > 2) preferably just simplify your setup – stop using a secondary
> > > > >         > storage network altogether and just allow secondary storage to use
> > > > >         > the management network (which is default). Unless you have a very
> > > > >         > high I/O environment in production you are just adding complexity
> > > > >         > by running separate management and storage.
> > > > >         >
> > > > >         > Regards,
> > > > >         > Dag Sonstebo
> > > > >         > Cloud Architect
> > > > >         > ShapeBlue
> > > > >         >
> > > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> > jms.123@hotmail.co.uk>
> > > > > wrote:
> > > > >         >
> > > > >         >     I will disconnect the host this morning and test but before
> > > > >         >     I do that I ran this command when all hosts are up -
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >      select * from cloud.host;
> > > > >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | public_ip_address | cluster_id | hypervisor_type | version  | resource_state |
> > > > >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         | 172.30.4.3        |          1 | KVM             | 4.11.0.0 | Enabled        |
> > > > >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        | 172.30.4.98       |       NULL | NULL            | 4.11.0.0 | Enabled        |
> > > > >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        | 172.30.4.86       |       NULL | NULL            | 4.11.0.0 | Enabled        |
> > > > >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         | 172.30.4.4        |          1 | KVM             | 4.11.0.0 | Enabled        |
> > > > >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         | 172.30.4.5        |          1 | KVM             | 4.11.0.0 | Enabled        |
> > > > >         >     5 rows in set (0.00 sec)
> > is
> > > > > supported).
> > > > >         >     > >
> > > > >         >     > >
> > > > >         >     > > What I would like is to use 3 NICs for
> management,
> > > > > storage and
> > > > >         > guest
> > > > >         >     > > traffic. I would like to assign public IPs direct
> > to
> > > > the
> > > > > VMs which
> > > > >         > is
> > > > >         >     > why I
> > > > >         >     > > originally chose basic.
> > > > >         >     > >
> > > > >         >     > >
> > > > >         >     > > If I switch to advanced networking do I just
> > > configure
> > > > a
> > > > > guest VM
> > > > >         > with
> > > > >         >     > > public IPs on one NIC and not both with the
> public
> > > > > traffic -
> > > > >         >     > >
> > > > >         >     > >
> > > > >         >     > > would this work ?
> > > > >         >     > >
> > > > >         >     >
> > > > >         >     >
> > > > >         >     >
> > > > >         >     > --
> > > > >         >     > Rafael Weingärtner
> > > > >         >     >
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >         >     --
> > > > >         >     Rafael Weingärtner
> > > > >         >
> > > > >         >
> > > > >         >
> > > > >
> > > > >
> > > > >         --
> > > > >         Rafael Weingärtner
> > > > >
> > > > >
> > > > >
> > > > >     Dag.Sonstebo@shapeblue.com
> > > > >     www.shapeblue.com
> > > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > >     @shapeblue
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Dag.Sonstebo@shapeblue.com
> > > > > www.shapeblue.com
> > > > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > > @shapeblue
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Rafael Weingärtner
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> Rafael Weingärtner
>

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Yes, all basic. I read a ShapeBlue doc that recommended splitting traffic across multiple NICs even in basic networking mode, so that is what I am trying to do.


With single NIC you do not get the NFS storage message.


I have the entire management server logs for both scenarios after I pulled the power to one of the compute nodes, but from the single NIC setup these seem to be the relevant lines -


2018-06-04 10:17:10,972 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Disconnected called on 4 with status Down
2018-06-04 10:17:10,972 DEBUG [c.c.h.Status] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Transition:[Resource state = Enabled, Agent event = HostDown, Host id = 4, name = dcp-cscn2.local]
2018-06-04 10:17:10,981 WARN  [o.a.c.alerts] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) AlertType:: 7 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host is down, name: dcp-cscn2.local (id:4), availability zone: dcpz1, pod: dcp1
2018-06-04 10:17:11,000 DEBUG [c.c.h.CheckOnAgentInvestigator] (HA-Worker-1:ctx-f763f12f work-17) (logid:77c56778) Unable to reach the agent for VM[User|i-2-6-VM]: Resource [Host:4] is unreachable: Host 4: Host with specified id is not in the right state: Down
2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Neighbouring host:5 returned status:Down for the investigated host:4
2018-06-04 10:17:11,006 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) HA: HOST is ineligible legacy state Down for host 4
2018-06-04 10:17:11,006 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) KVMInvestigator was able to determine host 4 is in Down
2018-06-04 10:17:11,006 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) The agent from host 4 state determined is Down
2018-06-04 10:17:11,006 ERROR [c.c.a.m.AgentManagerImpl] (AgentTaskPool-2:ctx-a6f6dbd1) (logid:774553ff) Host is down: 4-dcp-cscn2.local. Starting HA on the VMs
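
For anyone skimming, the branch that separates this "Down" outcome from the
"Disconnected" one in the multi-NIC runs seems to boil down to the sketch
below. It is pieced together from the KVMInvestigator snippet quoted further
down this thread; the enum values and the neighbour probe are simplified
assumptions for illustration, not the actual CloudStack code.

    import java.util.List;

    // Minimal sketch of the investigator's decision: without an NFS pool in
    // the host's cluster it refuses to investigate and can only ever answer
    // "Disconnected"; with NFS it accepts the neighbouring hosts' verdict.
    public class InvestigatorSketch {
        enum Status { Up, Down, Disconnected }

        static Status isAgentAlive(List<String> clusterPoolTypes, Status neighbourVerdict) {
            boolean hasNfs = clusterPoolTypes.contains("NetworkFilesystem");
            if (!hasNfs) {
                // Matches "host does not support investigation because it has
                // no NFS storage. Skipping investigation."
                return Status.Disconnected;
            }
            // Matches "Neighbouring host:5 returned status:Down" followed by
            // "Host is down ... Starting HA on the VMs".
            return neighbourVerdict;
        }

        public static void main(String[] args) {
            System.out.println(isAgentAlive(List.of(), Status.Down));                    // Disconnected
            System.out.println(isAgentAlive(List.of("NetworkFilesystem"), Status.Down)); // Down
        }
    }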

At the moment I only need to assign public IPs direct to VMs rather than using NAT with the virtual router, but I would be happy to go with advanced networking if it would make things easier :)

________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 07 June 2018 10:35
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Ah so, it is not an advanced setup; even when you use multiple NICs.
Can you confirm that the message "Agent investigation was requested on
host, but host does not support investigation because it has no NFS
storage. Skipping investigation." does not appear when you use a single
NIC? Can you check other log entries that might appear when the host is
marked as "down"?

On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> It is all basic networking at the moment for all the setups.
>
>
> If you want me to I can setup a single NIC solution again and run any
> commands you need me to do.
>
>
> FYI when I set up a single NIC I use the guided installation option in the
> UI rather than the manual setup which I do for the multiple NIC scenario.
>
>
> Happy to set it up if it helps.
>
>
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 07 June 2018 10:23
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Ok, so that explains the log message. This is looking like a bug to me. It
> seems that in Zone wide the host state (when disconnected) is not being
> properly identified due to this NFS thing, and as a consequency it has a
> side effect in VM HA.
>
> We would need some inputs from guys that have advanced networking
> deployments and Zone wide storage.
>
> I do not see how the all in one NIC deployment scenario is working though.
> This method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
> simple, if there is no NFS in the cluster (NFS storage pools found for a
> host's cluster), KVM hosts will be detected as "disconnected" and not down
> with that warning message you noticed.
>
> When you say "all in one NIC", is it an advanced network deployment where
> you put all traffic in a single network, or is it a basic networking that
> you are doing?
>
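> To make the suspicion concrete, here is a self-contained sketch (assumed
> names, not CloudStack code) of why a zone-wide pool is invisible to a
> cluster-keyed lookup - its row carries cluster_id = NULL, exactly as in the
> storage_pool output quoted further down:
>
>     import java.util.List;
>
>     // A ZONE-scoped pool has cluster_id = NULL, so a lookup translated to
>     // "WHERE cluster_id = ?" can never match it.
>     public class ZoneWidePoolLookup {
>         record Pool(Integer clusterId, String scope, String type) {}
>
>         // Stand-in for listPoolsByCluster():
>         // SELECT * FROM storage_pool WHERE cluster_id = ? AND removed IS NULL
>         static List<Pool> listPoolsByCluster(List<Pool> all, int clusterId) {
>             return all.stream()
>                     .filter(p -> p.clusterId() != null && p.clusterId() == clusterId)
>                     .toList();
>         }
>
>         public static void main(String[] args) {
>             List<Pool> pools = List.of(new Pool(null, "ZONE", "NetworkFilesystem"));
>             boolean hasNfs = !listPoolsByCluster(pools, 1).isEmpty();
>             // Prints false: the zone-wide NFS pool is never seen, so the
>             // investigator falls back to "Disconnected" instead of "Down".
>             System.out.println("hasNfs for cluster 1: " + hasNfs);
>         }
>     }
>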
> On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > zone wide.
> >
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 10:04
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > What type of storage are you using? Zone wide? Or cluster "wide" storage?
> >
> > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > Rafael
> > >
> > >
> > > Here is the output as requested -
> > >
> > >
> > >
> > > mysql> select * from cloud.storage_pool where removed is null;
> > >
> > > id: 1 | name: ds1 | uuid: a234224f-05fb-3f4c-9b0f-c51ebdf9a601
> > > pool_type: NetworkFilesystem | port: 2049
> > > data_center_id: 1 | pod_id: NULL | cluster_id: NULL
> > > used_bytes: 6059720704 | capacity_bytes: 79133933568
> > > host_address: 172.30.5.2 | user_info: NULL | path: /export/primary
> > > created: 2018-06-05 13:45:01 | removed: NULL | update_time: NULL
> > > status: Up | storage_provider_name: DefaultPrimary | scope: ZONE
> > > hypervisor: KVM | managed: 0 | capacity_iops: NULL
> > >
> > > 1 row in set (0.00 sec)
> > >
> > > mysql>
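> > >
> > > Put differently, a lookup would only find this pool if it also keyed on
> > > data_center_id. A hypothetical zone-aware check (illustration only, not a
> > > CloudStack patch; field names are assumptions):
> > >
> > >     import java.util.List;
> > >
> > >     // Treat ZONE-scoped pools in the host's zone as present even though
> > >     // their cluster_id is NULL, as in the row above.
> > >     public class ZoneAwareLookup {
> > >         record Pool(Integer clusterId, Integer dataCenterId, String scope, String type) {}
> > >
> > >         static boolean clusterHasNfs(List<Pool> pools, int clusterId, int zoneId) {
> > >             return pools.stream().anyMatch(p ->
> > >                     "NetworkFilesystem".equals(p.type())
> > >                     && ((p.clusterId() != null && p.clusterId() == clusterId)
> > >                         || ("ZONE".equals(p.scope())
> > >                             && p.dataCenterId() != null && p.dataCenterId() == zoneId)));
> > >         }
> > >
> > >         public static void main(String[] args) {
> > >             // The pool exactly as listed above: cluster_id NULL,
> > >             // data_center_id 1, scope ZONE, type NetworkFilesystem.
> > >             List<Pool> pools = List.of(new Pool(null, 1, "ZONE", "NetworkFilesystem"));
> > >             System.out.println(clusterHasNfs(pools, 1, 1)); // true - host becomes investigable
> > >         }
> > >     }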
> > >
> > > Do you think this problem is related to my NIC/bridge configuration or
> > the
> > > way I am configuring the zone ?
> > >
> > > Jon
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 06:45
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Can you also post the result of:
> > > select * from cloud.storage_pool where removed is null
> > >
> > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> Dag.Sonstebo@shapeblue.com
> > >
> > > wrote:
> > >
> > > > Hi Jon,
> > > >
> > > > Still confused where your primary storage pools are – are you sure
> your
> > > > hosts are in cluster 1?
> > > >
> > > > Quick question just to make sure - assuming management/storage is on
> > the
> > > > same NIC when I setup basic networking the physical network has the
> > > > management and guest icons already there and I just edit the KVM
> > labels.
> > > If
> > > > I am running storage over management do I need to drag the storage
> icon
> > > to
> > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > management or does CS automatically just use the management NIC ie. I
> > > would
> > > > only need to drag the storage icon across in basic setup if I wanted
> it
> > > on
> > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > >
> > > > >> I would do both – set up your 2/3 physical networks, name isn’t
> that
> > > > important – but then drag the traffic types to the correct one and
> make
> > > > sure the labels are correct.
> > > > Regards,
> > > > Dag Sonstebo
> > > > Cloud Architect
> > > > ShapeBlue
> > > >
> > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > > >
> > > >     Dag
> > > >
> > > >
> > > >     Do you mean  check the pools with "Infrastructure -> Primary
> > Storage"
> > > > and "Infrastructure -> Secondary Storage" within the UI ?
> > > >
> > > >
> > > >     If so Primary Storage has a state of UP, secondary storage does
> not
> > > > show a state as such so not sure where else to check it ?
> > > >
> > > >
> > > >     Rerun of the command -
> > > >
> > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > >     Empty set (0.00 sec)
> > > >
> > > >     mysql>
> > > >
> > > >     I think it is something to do with my zone creation rather than
> the
> > > > NIC/bridge setup, although I can post those if needed.
> > > >
> > > >     I may try to setup just the 2 NIC solution you mentioned although
> > as
> > > I
> > > > say I had the same issue with that ie. host goes to "Alert" state
> and
> > > same
> > > > error messages.  The only time I can get it to go to "Down" state is
> > when
> > > > it is all on the single NIC.
> > > >
> > > >     Quick question just to make sure - assuming management/storage is
> > on
> > > > the same NIC when I setup basic networking the physical network has
> the
> > > > management and guest icons already there and I just edit the KVM
> > labels.
> > > If
> > > > I am running storage over management do I need to drag the storage
> icon
> > > to
> > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > management or does CS automatically just use the management NIC ie. I
> > > would
> > > > only need to drag the storage icon across in basic setup if I wanted
> it
> > > on
> > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > >
> > > >     On the plus side I have been at this for so long now and done so
> > many
> > > > rebuilds I could do it in my sleep now 😊
> > > >
> > > >
> > > >     ________________________________
> > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > >     Sent: 06 June 2018 12:28
> > > >     To: users@cloudstack.apache.org
> > > >     Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > >     Looks OK to me Jon.
> > > >
> > > >     The one thing that throws me is your storage pools – can you
> rerun
> > > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > > >
> > > >     Do the pools show up as online in the CloudStack GUI?
> > > >
> > > >     Regards,
> > > >     Dag Sonstebo
> > > >     Cloud Architect
> > > >     ShapeBlue
> > > >
> > > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk>
> > wrote:
> > > >
> > > >         Don't know whether this helps or not but I logged into the
> SSVM
> > > > and ran an ifconfig -
> > > >
> > > >
> > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast
> > > > 169.254.255.255
> > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > collisions
> > > 0
> > > >
> > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast
> > > > 172.30.3.63
> > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > collisions
> > > 0
> > > >
> > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast
> > > > 172.30.4.127
> > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > collisions
> > > 0
> > > >
> > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast
> > > > 172.30.5.15
> > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > collisions
> > > 0
> > > >
> > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > >                 loop  txqueuelen 1  (Local Loopback)
> > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0
> > collisions
> > > 0
> > > >
> > > >
> > > >         so it has interfaces in both the management and the storage
> > > > subnets (as well as guest).
> > > >
> > > >
> > > >
> > > >         ________________________________
> > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > >         Sent: 06 June 2018 11:08
> > > >         To: users@cloudstack.apache.org
> > > >         Subject: Re: advanced networking with public IPs direct to
> VMs
> > > >
> > > >         Hi Rafael
> > > >
> > > >
> > > >         Thanks for the help, really appreciate it.
> > > >
> > > >
> > > >         So rerunning that command with all servers up -
> > > >
> > > >
> > > >
> > > >         mysql> select * from cloud.storage_pool where cluster_id = 1
> > and
> > > > removed is null;
> > > >         Empty set (0.00 sec)
> > > >
> > > >         mysql>
> > > >
> > > >
> > > >         As for the storage IP - no, I'm not setting it to be the
> > > >         management IP when I set up the zone, but the output of the SQL
> > > >         command suggests that is what has happened.
> > > >
> > > >         As I said to Dag I am using a different subnet for storage
> ie.
> > > >
> > > >         172.30.3.0/26  - management subnet
> > > >         172.30.4.0/25 -  guest VM subnet
> > > >         172.30.5.0/28 - storage
> > > >
> > > >         the NFS server IP is 172.30.5.2
> > > >
> > > >         each compute node has 3 NICs with an IP from each subnet (i
> am
> > > > assuming the management node only needs an IP in the management
> network
> > > ?)
> > > >
> > > >         When I add the zone in the UI I have one physical network
> with
> > > > management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > >         When I fill in the storage traffic page I use the range
> > > > 172.30.5.10 - 14 as free IPs as I exclude the ones already allocated
> to
> > > the
> > > > compute nodes and the NFS server.
> > > >
> > > >         I think maybe I am doing something wrong in the UI setup but
> it
> > > is
> > > > not obvious to me what it is.
> > > >
> > > >         What I might try today unless you want me to keep the setup I
> > > have
> > > > for more outputs is to go back to 2 NICs, one for storage/management
> > and
> > > > one for guest VMs.
> > > >
> > > >         I think with the 2 NIC setup the mistake I made last time when
> > > >         adding the zone was to assume storage would just run over
> > > >         management, so I did not drag and drop the storage icon and
> > > >         assign it to cloudbr0 as with the management - which I think is
> > > >         what I should do ?
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >         ________________________________
> > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > >         Sent: 06 June 2018 10:54
> > > >         To: users
> > > >         Subject: Re: advanced networking with public IPs direct to
> VMs
> > > >
> > > >         Jon, do not panic, we are here to help you :)
> > > >         So, I might have mistyped the SQL query. If you use "select * from
> > > >         cloud.storage_pool where cluster_id = 1 and removed is not null",
> > > >         you are listing the removed storage pools. Therefore, the right
> > > >         query would be "select * from cloud.storage_pool where cluster_id
> > > >         = 1 and removed is null".
> > > >
> > > >         There is also something else I do not understand. You are setting
> > > >         the storage IP in the management subnet? I am not sure you should
> > > >         be doing it like this. Normally, I set all my storages (primary
> > > >         [when working with NFS] and secondary) to IPs in the storage
> > > >         subnet.
> > > >
> > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > Dag.Sonstebo@shapeblue.com>
> > > >         wrote:
> > > >
> > > >         > Hi John,
> > > >         >
> > > >         > I’m late to this thread and have possibly missed some
> things
> > –
> > > > but a
> > > >         > couple of observations:
> > > >         >
> > > >         > “When I add the zone and get to the storage web page I
> > exclude
> > > > the IPs
> > > >         > already used for the compute node NICs and the NFS server
> > > > itself. …..”
> > > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in
> is
> > > > 172.30.5.10
> > > >         > -> 172.30.5.14.”
> > > >         >
> > > >         > I think you may have some confusion around the use of the
> > > > storage network.
> > > >         > The important part here is to understand this is for
> > *secondary
> > > > storage*
> > > >         > use only – it has nothing to do with primary storage. This
> > > means
> > > > this
> > > >         > storage network needs to be accessible to the SSVM, to the
> > > > hypervisors, and
> > > >         > secondary storage NFS pools needs to be accessible on this
> > > > network.
> > > >         >
> > > >         > The important part – this also means you *can not use the
> > same
> > > > IP ranges
> > > >         > for management and storage networks* - doing so means you
> > will
> > > > have issues
> > > >         > where effectively both hypervisors and SSVM can see the
> same
> > > > subnet on two
> > > >         > NICs – and you end up in a routing black hole.
> > > >         >
> > > >         > So – you need to either:
> > > >         >
> > > >         > 1) Use different IP subnets on management and storage, or
> > > >         > 2) preferably just simplify your setup – stop using a
> > secondary
> > > > storage
> > > >         > network altogether and just allow secondary storage to use
> > the
> > > > management
> > > >         > network (which is default). Unless you have a very high I/O
> > > > environment in
> > > >         > production you are just adding complexity by running
> separate
> > > > management
> > > >         > and storage.
> > > >         >
> > > >         > Regards,
> > > >         > Dag Sonstebo
> > > >         > Cloud Architect
> > > >         > ShapeBlue
> > > >         >
> > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> jms.123@hotmail.co.uk>
> > > > wrote:
> > > >         >
> > > >         >     I will disconnect the host this morning and test but
> > before
> > > > I do that
> > > >         > I ran this command when all hosts are up -
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >     select * from cloud.host;
> > > >         >
> > > >         >     id: 1 | name: dcp-cscn1.local | uuid: d97b930c-ab5f-4b7d-9243-eabd60012284
> > > >         >     status: Up | type: Routing | cluster_id: 1 | data_center_id: 1 | pod_id: 1
> > > >         >     private ip/netmask/mac: 172.30.3.3 / 255.255.255.192 / 00:22:19:92:4e:34
> > > >         >     storage ip/netmask/mac: 172.30.3.3 / 255.255.255.192 / 00:22:19:92:4e:34 (storage_2 columns: NULL)
> > > >         >     public  ip/netmask/mac: 172.30.4.3 / 255.255.255.128 / 00:22:19:92:4e:35
> > > >         >     cpu_sockets: 1 | cpus: 2 | speed: 2999 | ram: 7510159360 | url: iqn.1994-05.com.redhat:fa437fb0c023
> > > >         >     hypervisor_type: KVM | version: 4.11.0.0 | capabilities: hvm,snapshot
> > > >         >     guid: 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
> > > >         >     available: 1 | setup: 0 | dom0_memory: 0 | last_ping: 1492390408 | mgmt_server_id: 146457912294
> > > >         >     disconnected: 2018-06-05 14:09:22 | created: 2018-06-05 13:44:33 | removed: NULL
> > > >         >     update_count: 4 | resource_state: Enabled | engine_state: Disabled
> > > >         >     (proxy_port, fs_type, hypervisor_version, resource, parent, total_size, owner, lastUpdated: NULL)
> > > >         >
> > > >         >     id: 2 | name: v-2-VM | uuid: ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0
> > > >         >     status: Up | type: ConsoleProxy | cluster_id: NULL | data_center_id: 1 | pod_id: 1
> > > >         >     private ip/netmask/mac: 172.30.3.49 / 255.255.255.192 / 1e:00:80:00:00:14
> > > >         >     storage ip/netmask/mac: 172.30.3.49 / 255.255.255.192 / 1e:00:80:00:00:14 (storage_2 columns: NULL)
> > > >         >     public  ip/netmask/mac: 172.30.4.98 / 255.255.255.128 / 1e:00:c9:00:00:5f
> > > >         >     cpu_sockets: NULL | cpus: NULL | speed: NULL | ram: 0 | url: NoIqn
> > > >         >     hypervisor_type: NULL | version: 4.11.0.0 | capabilities: NULL
> > > >         >     guid: Proxy.2-ConsoleProxyResource
> > > >         >     available: 1 | setup: 0 | dom0_memory: 0 | last_ping: 1492390409 | mgmt_server_id: 146457912294
> > > >         >     disconnected: 2018-06-05 14:09:22 | created: 2018-06-05 13:46:22 | removed: NULL
> > > >         >     update_count: 7 | resource_state: Enabled | engine_state: Disabled
> > > >         >     (proxy_port, fs_type, hypervisor_version, resource, parent, total_size, owner, lastUpdated: NULL)
> > > >         >
> > > >         >     id: 3 | name: s-1-VM | uuid: 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c
> > > >         >     status: Up | type: SecondaryStorageVM | cluster_id: NULL | data_center_id: 1 | pod_id: 1
> > > >         >     private ip/netmask/mac: 172.30.3.34 / 255.255.255.192 / 1e:00:3b:00:00:05
> > > >         >     storage ip/netmask/mac: 172.30.3.34 / 255.255.255.192 / 1e:00:3b:00:00:05 (storage_2 columns: NULL)
> > > >         >     public  ip/netmask/mac: 172.30.4.86 / 255.255.255.128 / 1e:00:d9:00:00:53
> > > >         >     cpu_sockets: NULL | cpus: NULL | speed: NULL | ram: 0 | url: NoIqn
> > > >         >     hypervisor_type: NULL | version: 4.11.0.0 | capabilities: NULL
> > > >         >     guid: s-1-VM-NfsSecondaryStorageResource
> > > >         >     available: 1 | setup: 0 | dom0_memory: 0 | last_ping: 1492390407 | mgmt_server_id: 146457912294
> > > >         >     disconnected: 2018-06-05 14:09:22 | created: 2018-06-05 13:46:27 | removed: NULL
> > > >         >     update_count: 7 | resource_state: Enabled | engine_state: Disabled
> > > >         >     (proxy_port, fs_type, hypervisor_version, resource, parent, total_size, owner, lastUpdated: NULL)
> > > >         >
> > > >         >     id: 4 | name: dcp-cscn2.local | uuid: f0c076cb-112f-4f4b-a5a4-1a96ffac9794
> > > >         >     status: Up | type: Routing | cluster_id: 1 | data_center_id: 1 | pod_id: 1
> > > >         >     private ip/netmask/mac: 172.30.3.4 / 255.255.255.192 / 00:26:b9:4a:97:7d
> > > >         >     storage ip/netmask/mac: 172.30.3.4 / 255.255.255.192 / 00:26:b9:4a:97:7d (storage_2 columns: NULL)
> > > >         >     public  ip/netmask/mac: 172.30.4.4 / 255.255.255.128 / 00:26:b9:4a:97:7e
> > > >         >     cpu_sockets: 1 | cpus: 2 | speed: 2999 | ram: 7510159360 | url: iqn.1994-05.com.redhat:e9b4aa7e7881
> > > >         >     hypervisor_type: KVM | version: 4.11.0.0 | capabilities: hvm,snapshot
> > > >         >     guid: 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
> > > >         >     available: 1 | setup: 0 | dom0_memory: 0 | last_ping: 1492450882 | mgmt_server_id: 146457912294
> > > >         >     disconnected: 2018-06-05 14:09:22 | created: 2018-06-05 13:46:33 | removed: NULL
> > > >         >     update_count: 8 | resource_state: Enabled | engine_state: Disabled
> > > >         >     (proxy_port, fs_type, hypervisor_version, resource, parent, total_size, owner, lastUpdated: NULL)
> > > >         >
> > > >         >     id: 5 | name: dcp-cscn3.local | uuid: 0368ae16-550f-43a9-bb40-ee29d2b5c274
> > > >         >     status: Up | type: Routing | cluster_id: 1 | data_center_id: 1 | pod_id: 1
> > > >         >     private ip/netmask/mac: 172.30.3.5 / 255.255.255.192 / 00:24:e8:73:6a:b2
> > > >         >     storage ip/netmask/mac: 172.30.3.5 / 255.255.255.192 / 00:24:e8:73:6a:b2 (storage_2 columns: NULL)
> > > >         >     public  ip/netmask/mac: 172.30.4.5 / 255.255.255.128 / 00:24:e8:73:6a:b3
> > > >         >     cpu_sockets: 1 | cpus: 2 | speed: 3000 | ram: 7510159360 | url: iqn.1994-05.com.redhat:ccdce43aff1c
> > > >         >     hypervisor_type: KVM | version: 4.11.0.0 | capabilities: hvm,snapshot
> > > >         >     guid: 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
> > > >         >     available: 1 | setup: 0 | dom0_memory: 0 | last_ping: 1492390408 | mgmt_server_id: 146457912294
> > > >         >     disconnected: 2018-06-05 14:09:22 | created: 2018-06-05 13:47:04 | removed: NULL
> > > >         >     update_count: 6 | resource_state: Enabled | engine_state: Disabled
> > > >         >     (proxy_port, fs_type, hypervisor_version, resource, parent, total_size, owner, lastUpdated: NULL)
> > > >         >
> > > >         >     5 rows in set (0.00 sec)
> > > >         >
> > > >         >
> > > >         >
> > > >         >     and you can see that it says the storage IP address is the
> > > >         >     same as the private IP address (the management network).
> > > >         >
> > > >         >
> > > >         >     I also ran the command you provided using the Cluster ID
> > > >         >     number from the table above -
> > > >         >
> > > >         >
> > > >         >
> > > >         >     mysql> select * from cloud.storage_pool where cluster_id = 1
> > > >         >     and removed is not null;
> > > >         >     Empty set (0.00 sec)
> > > >         >
> > > >         >     mysql>
> > > >         >
> > > >         >     So assuming I am reading this correctly that seems to be
> > > >         >     the issue.
> > > >         >
> > > >         >
> > > >         >     I am at a loss as to why though.
> > > >         >
> > > >         >
> > > >         >     I have a separate NIC for storage as described. When I add
> > > >         >     the zone and get to the storage web page I exclude the IPs
> > > >         >     already used for the compute node NICs and the NFS server
> > > >         >     itself. I do this because initially I didn't and the SSVM
> > > >         >     started using the IP address of the NFS server.
> > > >         >
> > > >         >
> > > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > >         >     172.30.5.10 -> 172.30.5.14.
> > > >         >
> > > >         >
> > > >         >     And I used the label "cloudbr2" for storage.
> > > >         >
> > > >         >
> > > >         >     I must be doing this wrong somehow.
> > > >         >
> > > >         >
> > > >         >     Any pointers would be much appreciated.
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >     ________________________________
> > > >         >     From: Rafael Weingärtner <ra...@gmail.com>
> > > >         >     Sent: 05 June 2018 16:13
> > > >         >     To: users
> > > >         >     Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > >         >
> > > >         >     That is interesting. Let's see the source of all truth...
> > > >         >     This is the code that is generating that odd message.
> > > >         >
> > > >         >     >     List<StoragePoolVO> clusterPools =
> > > >         >     >             _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > >         >     >     boolean hasNfs = false;
> > > >         >     >     for (StoragePoolVO pool : clusterPools) {
> > > >         >     >         if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > >         >     >             hasNfs = true;
> > > >         >     >             break;
> > > >         >     >         }
> > > >         >     >     }
> > > >         >     >     if (!hasNfs) {
> > > >         >     >         s_logger.warn("Agent investigation was requested on host " + agent
> > > >         >     >                 + ", but host does not support investigation because it has"
> > > >         >     >                 + " no NFS storage. Skipping investigation.");
> > > >         >     >         return Status.Disconnected;
> > > >         >     >     }
> > > >         >
> > > >         >     There are two possibilities here. You do not have any NFS
> > > >         >     storage? Is that the case? Or maybe, for some reason, the call
> > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is
> > > >         >     not returning any NFS storage pools. Looking at
> > > >         >     "listPoolsByCluster" we will see that the following SQL is used:
> > > >         >
> > > >         >     >     Select * from storage_pool where cluster_id = <host'sClusterId>
> > > >         >     >     and removed is not null
> > > >         >
> > > >         >     Can you run that SQL to see its return when your hosts are
> > > >         >     marked as disconnected?
> > > >         >
> > > >         >
> > > >         > Dag.Sonstebo@shapeblue.com
> > > >         > www.shapeblue.com
> > > >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > >         > @shapeblue
> > > >         >
> > > >         >
> > > >         >
> > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > > > jms.123@hotmail.co.uk>
> > > >         > wrote:
> > > >         >
> > > >         >     > I reran the tests with the 3 NIC setup. When I
> > configured
> > > > the zone
> > > >         > through
> > > >         >     > the UI I used the labels cloudbr0 for management,
> > > cloudbr1
> > > > for guest
> > > >         >     > traffic and cloudbr2 for NFS as per my original
> > response
> > > > to you.
> > > >         >     >
> > > >         >     >
> > > >         >     > When I pull the power to the node (dcp-cscn2.local)
> > after
> > > > about 5
> > > >         > mins
> > > >         >     > the  host status goes to "Alert" but never to "Down"
> > > >         >     >
> > > >         >     >
> > > >         >     > I get this in the logs -
> > > >         >     >
> > > >         >     >
> > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> > > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent
> > > > investigation
> > > >         > was
> > > >         >     > requested on host Host[-4-Routing], but host does not
> > > > support
> > > >         > investigation
> > > >         >     > because it has no NFS storage. Skipping
> investigation.
> > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.
> > > > HighAvailabilityManagerImpl]
> > > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93)
> > > > KVMInvestigator was
> > > >         > able to
> > > >         >     > determine host 4 is in Disconnected
> > > >         >     > 2018-06-05 15:17:14,382 INFO
> > [c.c.a.m.AgentManagerImpl]
> > > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The
> > agent
> > > > from host
> > > >         > 4 state
> > > >         >     > determined is Disconnected
> > > >         >     > 2018-06-05 15:17:14,382 WARN
> > [c.c.a.m.AgentManagerImpl]
> > > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent
> > is
> > > >         > disconnected but
> > > >         >     > the host is still up: 4-dcp-cscn2.local
> > > >         >     >
> > > >         >     > I don't understand why it thinks there is no NFS
> > storage
> > > > as each
> > > >         > compute
> > > >         >     > node has a dedicated storage NIC.
> > > >         >     >
> > > >         >     >
> > > >         >     > I also don't understand why it thinks the host is
> still
> > > up
> > > > ie. what
> > > >         > test
> > > >         >     > is it doing to determine that ?
> > > >         >     >
> > > >         >     >
> > > >         >     > Am I just trying to get something working that is not
> > > > supported ?
> > > >         >     >
> > > >         >     >
> > > >         >     > ________________________________
> > > >         >     > From: Rafael Weingärtner <
> rafaelweingartner@gmail.com>
> > > >         >     > Sent: 04 June 2018 15:31
> > > >         >     > To: users
> > > >         >     > Subject: Re: advanced networking with public IPs
> direct
> > > to
> > > > VMs
> > > >         >     >
> > > >         >     > What type of failover are you talking about?
> > > >         >     > What ACS version are you using?
> > > >         >     > What hypervisor are you using?
> > > >         >     > How are you configuring your NICs in the hypervisor?
> > > >         >     > How are you configuring the traffic labels in ACS?
> > > >         >     >
> > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > > > jms.123@hotmail.co.uk
> > > >         > >
> > > >         >     > wrote:
> > > >         >     >
> > > >         >     > > Hi all
> > > >         >     > >
> > > >         >     > >
> > > >         >     > > I am close to giving up on basic networking as I
> just
> > > > cannot get
> > > >         > failover
> > > >         >     > > working with multiple NICs (I am not even sure it
> is
> > > > supported).
> > > >         >     > >
> > > >         >     > >
> > > >         >     > > What I would like is to use 3 NICs for management,
> > > > storage and
> > > >         > guest
> > > >         >     > > traffic. I would like to assign public IPs direct
> to
> > > the
> > > > VMs which
> > > >         > is
> > > >         >     > why I
> > > >         >     > > originally chose basic.
> > > >         >     > >
> > > >         >     > >
> > > >         >     > > If I switch to advanced networking do I just
> > configure
> > > a
> > > > guest VM
> > > >         > with
> > > >         >     > > public IPs on one NIC and not both with the public
> > > > traffic -
> > > >         >     > >
> > > >         >     > >
> > > >         >     > > would this work ?
> > > >         >     > >
> > > >         >     >
> > > >         >     >
> > > >         >     >
> > > >         >     > --
> > > >         >     > Rafael Weingärtner
> > > >         >     >
> > > >         >
> > > >         >
> > > >         >
> > > >         >     --
> > > >         >     Rafael Weingärtner
> > > >         >
> > > >         >
> > > >         >
> > > >
> > > >
> > > >         --
> > > >         Rafael Weingärtner
> > > >
> > > >
> > > >
> > > >     Dag.Sonstebo@shapeblue.com
> > > >     www.shapeblue.com
> > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > >     @shapeblue
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Dag.Sonstebo@shapeblue.com
> > > > www.shapeblue.com
> > > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > @shapeblue
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> Rafael Weingärtner
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
Ah so, it is not an advanced setup; even when you use multiple NICs.
Can you confirm that the message "Agent investigation was requested on
host, but host does not support investigation because it has no NFS
storage. Skipping investigation." does not appear when you use a single
NIC? Can you check other log entries that might appear when the host is
marked as "down"?

On Thu, Jun 7, 2018 at 6:30 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> It is all basic networking at the moment for all the setups.
>
>
> If you want me to I can setup a single NIC solution again and run any
> commands you need me to do.
>
>
> FYI when I set up a single NIC I use the guided installation option in the
> UI rather than the manual setup which I do for the multiple NIC scenario.
>
>
> Happy to set it up if it helps.
>
>
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 07 June 2018 10:23
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Ok, so that explains the log message. This is looking like a bug to me. It
> seems that in Zone wide the host state (when disconnected) is not being
> properly identified due to this NFS thing, and as a consequence it has a
> side effect in VM HA.
>
> We would need some inputs from guys that have advanced networking
> deployments and Zone wide storage.
>
> I do not see how the all in one NIC deployment scenario is working though.
> This method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
> simple, if there is no NFS in the cluster (NFS storage pools found for a
> host's cluster), KVM hosts will be detected as "disconnected" and not down
> with that warning message you noticed.
>
> When you say "all in one NIC", is it an advanced network deployment where
> you put all traffic in a single network, or is it a basic networking that
> you are doing?
>
> On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > zone wide.
> >
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 10:04
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > What type of storage are you using? Zone wide? Or cluster "wide" storage?
> >
> > On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > Rafael
> > >
> > >
> > > Here is the output as requested -
> > >
> > >
> > >
> > > mysql> select * from cloud.storage_pool where removed is null;
> > >
> > > id: 1 | name: ds1 | uuid: a234224f-05fb-3f4c-9b0f-c51ebdf9a601
> > > pool_type: NetworkFilesystem | port: 2049
> > > data_center_id: 1 | pod_id: NULL | cluster_id: NULL
> > > used_bytes: 6059720704 | capacity_bytes: 79133933568
> > > host_address: 172.30.5.2 | user_info: NULL | path: /export/primary
> > > created: 2018-06-05 13:45:01 | removed: NULL | update_time: NULL
> > > status: Up | storage_provider_name: DefaultPrimary | scope: ZONE
> > > hypervisor: KVM | managed: 0 | capacity_iops: NULL
> > >
> > > 1 row in set (0.00 sec)
> > >
> > > mysql>
> > >
> > > Do you think this problem is related to my NIC/bridge configuration or
> > the
> > > way I am configuring the zone ?
> > >
> > > Jon
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 07 June 2018 06:45
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > Can you also post the result of:
> > > select * from cloud.storage_pool where removed is null
> > >
> > > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <
> Dag.Sonstebo@shapeblue.com
> > >
> > > wrote:
> > >
> > > > Hi Jon,
> > > >
> > > > Still confused where your primary storage pools are – are you sure
> your
> > > > hosts are in cluster 1?
> > > >
> > > > Quick question just to make sure - assuming management/storage is on
> > the
> > > > same NIC when I setup basic networking the physical network has the
> > > > management and guest icons already there and I just edit the KVM
> > labels.
> > > If
> > > > I am running storage over management do I need to drag the storage
> icon
> > > to
> > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > management or does CS automatically just use the management NIC ie. I
> > > would
> > > > only need to drag the storage icon across in basic setup if I wanted
> it
> > > on
> > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > >
> > > > >> I would do both – set up your 2/3 physical networks, name isn’t
> that
> > > > important – but then drag the traffic types to the correct one and
> make
> > > > sure the labels are correct.
> > > > Regards,
> > > > Dag Sonstebo
> > > > Cloud Architect
> > > > ShapeBlue
> > > >
> > > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > > >
> > > >     Dag
> > > >
> > > >
> > > >     Do you mean  check the pools with "Infrastructure -> Primary
> > Storage"
> > > > and "Infrastructure -> Secondary Storage" within the UI ?
> > > >
> > > >
> > > >     If so Primary Storage has a state of UP, secondary storage does
> not
> > > > show a state as such so not sure where else to check it ?
> > > >
> > > >
> > > >     Rerun of the command -
> > > >
> > > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > > >     Empty set (0.00 sec)
> > > >
> > > >     mysql>
> > > >
> > > >     I think it is something to do with my zone creation rather than
> the
> > > > NIC/bridge setup, although I can post those if needed.
> > > >
> > > >     I may try to setup just the 2 NIC solution you mentioned although
> > as
> > > I
> > > > say I had the same issue with that ie. host goes to "Alert" state
> and
> > > same
> > > > error messages.  The only time I can get it to go to "Down" state is
> > when
> > > > it is all on the single NIC.
> > > >
> > > >     Quick question just to make sure - assuming management/storage is
> > on
> > > > the same NIC when I setup basic networking the physical network has
> the
> > > > management and guest icons already there and I just edit the KVM
> > labels.
> > > If
> > > > I am running storage over management do I need to drag the storage
> icon
> > > to
> > > > the physical network and use the same KVM label (cloudbr0) as the
> > > > management or does CS automatically just use the management NIC ie. I
> > > would
> > > > only need to drag the storage icon across in basic setup if I wanted
> it
> > > on
> > > > a different NIC/IP subnet ?  (hope that makes sense !)
> > > >
> > > >     On the plus side I have been at this for so long now and done so
> > many
> > > > rebuilds I could do it in my sleep now 😊
> > > >
> > > >
> > > >     ________________________________
> > > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > > >     Sent: 06 June 2018 12:28
> > > >     To: users@cloudstack.apache.org
> > > >     Subject: Re: advanced networking with public IPs direct to VMs
> > > >
> > > >     Looks OK to me Jon.
> > > >
> > > >     The one thing that throws me is your storage pools – can you
> rerun
> > > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > > >
> > > >     Do the pools show up as online in the CloudStack GUI?
> > > >
> > > >     Regards,
> > > >     Dag Sonstebo
> > > >     Cloud Architect
> > > >     ShapeBlue
> > > >
> > > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk>
> > wrote:
> > > >
> > > >         Don't know whether this helps or not but I logged into the
> SSVM
> > > > and ran an ifconfig -
> > > >
> > > >
> > > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> > > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > >
> > > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> > > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > >
> > > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> > > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > >
> > > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> > > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > >
> > > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > > >                 inet 127.0.0.1  netmask 255.0.0.0
> > > >                 loop  txqueuelen 1  (Local Loopback)
> > > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > > >
> > > >
> > > >         so it has interfaces in both the management and the storage
> > > > subnets (as well as guest).
> > > >
> > > >
> > > >
> > > >         ________________________________
> > > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > > >         Sent: 06 June 2018 11:08
> > > >         To: users@cloudstack.apache.org
> > > >         Subject: Re: advanced networking with public IPs direct to
> VMs
> > > >
> > > >         Hi Rafael
> > > >
> > > >
> > > >         Thanks for the help, really appreciate it.
> > > >
> > > >
> > > >         So rerunning that command with all servers up -
> > > >
> > > >
> > > >
> > > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > > >         Empty set (0.00 sec)
> > > >
> > > >         mysql>
> > > >
> > > >
> > > >         As for the storage IP no I'm not setting it to be the management
> > > >         IP when I setup the zone but the output of the SQL command suggests
> > > >         that is what has happened.
> > > >
> > > >         As I said to Dag I am using a different subnet for storage ie.
> > > >
> > > >         172.30.3.0/26  - management subnet
> > > >         172.30.4.0/25 -  guest VM subnet
> > > >         172.30.5.0/28 - storage
> > > >
> > > >         the NFS server IP is 172.30.5.2
> > > >
> > > >         each compute node has 3 NICs with an IP from each subnet (I am
> > > >         assuming the management node only needs an IP in the management
> > > >         network ?)
> > > >
> > > >         When I add the zone in the UI I have one physical network with
> > > >         management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > > >         When I fill in the storage traffic page I use the range
> > > >         172.30.5.10 - 14 as free IPs as I exclude the ones already
> > > >         allocated to the compute nodes and the NFS server.
> > > >
> > > >         I think maybe I am doing something wrong in the UI setup but it is
> > > >         not obvious to me what it is.
> > > >
> > > >         What I might try today unless you want me to keep the setup I have
> > > >         for more outputs is to go back to 2 NICs, one for storage/management
> > > >         and one for guest VMs.
> > > >
> > > >         I think with the 2 NICs setup the mistake I made last time when
> > > >         adding the zone was to assume storage would just run over management
> > > >         so I did not drag and drop the storage icon and assign it to
> > > >         cloudbr0 as with the management which I think is what I should do ?
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >         ________________________________
> > > >         From: Rafael Weingärtner <ra...@gmail.com>
> > > >         Sent: 06 June 2018 10:54
> > > >         To: users
> > > >         Subject: Re: advanced networking with public IPs direct to
> VMs
> > > >
> > > >         Jon, do not panic we are here to help you :)
> > > >         So, I might have mistyped the SQL query. When you use "select *
> > > >         from cloud.storage_pool where cluster_id = 1 and removed is not
> > > >         null", you are listing the storage pools that have been removed.
> > > >         Therefore, the right query would be "select * from
> > > >         cloud.storage_pool where cluster_id = 1 and removed is null".
> > > >
> > > >         There is also something else I do not understand. You are setting
> > > >         the storage IP in the management subnet? I am not sure if you
> > > >         should be doing it like this. Normally, I set all my storages
> > > >         (primary [when working with NFS] and secondary) to IPs in the
> > > >         storage subnet.
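> > > >
> > > >         If you want to cross-check what the zone wizard actually recorded
> > > >         for your storage range, I believe (going from memory here, so treat
> > > >         this as a sketch) it lands in the dc_storage_network_ip_range table:
> > > >
> > > >         select * from cloud.dc_storage_network_ip_range;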
> > > >
> > > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > > Dag.Sonstebo@shapeblue.com>
> > > >         wrote:
> > > >
> > > >         > Hi John,
> > > >         >
> > > >         > I’m late to this thread and have possibly missed some
> things
> > –
> > > > but a
> > > >         > couple of observations:
> > > >         >
> > > >         > “When I add the zone and get to the storage web page I exclude
> > > >         > the IPs already used for the compute node NICs and the NFS server
> > > >         > itself. …..”
> > > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > >         > 172.30.5.10 -> 172.30.5.14.”
> > > >         >
> > > >         > I think you may have some confusion around the use of the storage
> > > >         > network. The important part here is to understand this is for
> > > >         > *secondary storage* use only – it has nothing to do with primary
> > > >         > storage. This means this storage network needs to be accessible to
> > > >         > the SSVM, to the hypervisors, and secondary storage NFS pools need
> > > >         > to be accessible on this network.
> > > >         >
> > > >         > The important part – this also means you *cannot use the same IP
> > > >         > ranges for management and storage networks* - doing so means you
> > > >         > will have issues where effectively both hypervisors and SSVM can
> > > >         > see the same subnet on two NICs – and you end up in a routing
> > > >         > black hole.
> > > >         >
> > > >         > So – you need to either:
> > > >         >
> > > >         > 1) Use different IP subnets on management and storage, or
> > > >         > 2) preferably just simplify your setup – stop using a secondary
> > > >         > storage network altogether and just allow secondary storage to use
> > > >         > the management network (which is default). Unless you have a very
> > > >         > high I/O environment in production you are just adding complexity
> > > >         > by running separate management and storage.
> > > >         >
> > > >         > Regards,
> > > >         > Dag Sonstebo
> > > >         > Cloud Architect
> > > >         > ShapeBlue
> > > >         >
> > > >         > On 06/06/2018, 10:18, "Jon Marshall" <
> jms.123@hotmail.co.uk>
> > > > wrote:
> > > >         >
> > > >         >     I will disconnect the host this morning and test but before
> > > >         >     I do that I ran this command when all hosts are up -
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >     mysql> select id, name, status, type, private_ip_address, storage_ip_address, cluster_id, public_ip_address from cloud.host;
> > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id | public_ip_address |
> > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 | 172.30.4.3        |
> > > >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        |       NULL | 172.30.4.98       |
> > > >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        |       NULL | 172.30.4.86       |
> > > >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         |          1 | 172.30.4.4        |
> > > >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 | 172.30.4.5        |
> > > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > > >         >     5 rows in set (0.00 sec)
> > > >         >
> > > >         >
> > > >         >
> > > >         >     and you can see that it says the storage IP address is the
> > > >         >     same as the private IP address (the management network).
> > > >         >
> > > >         >
> > > >         >     I also ran the command you provided using the Cluster ID
> > > >         >     number from the table above -
> > > >         >
> > > >         >
> > > >         >
> > > >         >     mysql> select * from cloud.storage_pool where cluster_id = 1
> > > >         >     and removed is not null;
> > > >         >     Empty set (0.00 sec)
> > > >         >
> > > >         >     mysql>
> > > >         >
> > > >         >     So assuming I am reading this correctly that seems to be the
> > > >         >     issue.
> > > >         >
> > > >         >
> > > >         >     I am at a loss as to why though.
> > > >         >
> > > >         >
> > > >         >     I have a separate NIC for storage as described. When I add
> > > >         >     the zone and get to the storage web page I exclude the IPs
> > > >         >     already used for the compute node NICs and the NFS server
> > > >         >     itself. I do this because initially I didn't and the SSVM
> > > >         >     started using the IP address of the NFS server.
> > > >         >
> > > >         >
> > > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > >         >     172.30.5.10 -> 172.30.5.14.
> > > >         >
> > > >         >
> > > >         >     And I used the label "cloudbr2" for storage.
> > > >         >
> > > >         >
> > > >         >     I must be doing this wrong somehow.
> > > >         >
> > > >         >
> > > >         >     Any pointers would be much appreciated.
> > > >         >
> > > >         >
> > > >         >
> > > >         >
> > > >         >     ________________________________
> > > >         >     From: Rafael Weingärtner <ra...@gmail.com>
> > > >         >     Sent: 05 June 2018 16:13
> > > >         >     To: users
> > > >         >     Subject: Re: advanced networking with public IPs direct
> > to
> > > > VMs
> > > >         >
> > > >         >     That is interesting. Let's see the source of all
> truth...
> > > >         >     This is the code that is generating that odd message.
> > > >         >
> > > >         >     > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > > >         >     > boolean hasNfs = false;
> > > >         >     > for (StoragePoolVO pool : clusterPools) {
> > > >         >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > > >         >     >         hasNfs = true;
> > > >         >     >         break;
> > > >         >     >     }
> > > >         >     > }
> > > >         >     > if (!hasNfs) {
> > > >         >     >     s_logger.warn("Agent investigation was requested on host " + agent
> > > >         >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> > > >         >     >     return Status.Disconnected;
> > > >         >     > }
> > > >         >     >
> > > >         >
> > > >         >     There are two possibilities here. You do not have any NFS
> > > >         >     storage? Is that the case? Or maybe, for some reason, the call
> > > >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is
> > > >         >     not returning any NFS storage pools. Looking at
> > > >         >     "listPoolsByCluster" we will see that the following SQL is used:
> > > >         >
> > > >         >     > Select * from storage_pool where cluster_id = <host'sClusterId>
> > > >         >     > and removed is not null
> > > >         >
> > > >         >     Can you run that SQL to see its return when your hosts are
> > > >         >     marked as disconnected?
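> > > >         >
> > > >         >     For instance, if your hosts sit in cluster 1 that would be:
> > > >         >
> > > >         >     > Select * from storage_pool where cluster_id = 1 and removed is not null;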
> > > >         >
> > > >         >
> > > >         > Dag.Sonstebo@shapeblue.com
> > > >         > www.shapeblue.com<http://www.shapeblue.com>
> > > >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > >         > @shapeblue
> > > >         >
> > > >         >
> > > >         >
> > > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > > > jms.123@hotmail.co.uk>
> > > >         > wrote:
> > > >         >
> > > >         >     > I reran the tests with the 3 NIC setup. When I configured
> > > >         >     > the zone through the UI I used the labels cloudbr0 for
> > > >         >     > management, cloudbr1 for guest traffic and cloudbr2 for NFS
> > > >         >     > as per my original response to you.
> > > >         >     >
> > > >         >     >
> > > >         >     > When I pull the power to the node (dcp-cscn2.local) after
> > > >         >     > about 5 mins the host status goes to "Alert" but never to "Down"
> > > >         >     >
> > > >         >     >
> > > >         >     > I get this in the logs -
> > > >         >     >
> > > >         >     >
> > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> > > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> > > >         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> > > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> > > >         >     >
> > > >         >     > I don't understand why it thinks there is no NFS storage
> > > >         >     > as each compute node has a dedicated storage NIC.
> > > >         >     >
> > > >         >     >
> > > >         >     > I also don't understand why it thinks the host is still up
> > > >         >     > ie. what test is it doing to determine that ?
> > > >         >     >
> > > >         >     >
> > > >         >     > Am I just trying to get something working that is not
> > > >         >     > supported ?
> > > >         >     >
> > > >         >     >
> > > >         >     > ________________________________
> > > >         >     > From: Rafael Weingärtner <
> rafaelweingartner@gmail.com>
> > > >         >     > Sent: 04 June 2018 15:31
> > > >         >     > To: users
> > > >         >     > Subject: Re: advanced networking with public IPs
> direct
> > > to
> > > > VMs
> > > >         >     >
> > > >         >     > What type of failover are you talking about?
> > > >         >     > What ACS version are you using?
> > > >         >     > What hypervisor are you using?
> > > >         >     > How are you configuring your NICs in the hypervisor?
> > > >         >     > How are you configuring the traffic labels in ACS?
> > > >         >     >
> > > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > > > jms.123@hotmail.co.uk
> > > >         > >
> > > >         >     > wrote:
> > > >         >     >
> > > >         >     > > Hi all
> > > >         >     > >
> > > >         >     > >
> > > >         >     > > I am close to giving up on basic networking as I just
> > > >         >     > > cannot get failover working with multiple NICs (I am not
> > > >         >     > > even sure it is supported).
> > > >         >     > >
> > > >         >     > >
> > > >         >     > > What I would like is to use 3 NICs for management,
> > > >         >     > > storage and guest traffic. I would like to assign public
> > > >         >     > > IPs direct to the VMs which is why I originally chose basic.
> > > >         >     > >
> > > >         >     > >
> > > >         >     > > If I switch to advanced networking do I just configure a
> > > >         >     > > guest VM with public IPs on one NIC and not bother with
> > > >         >     > > the public traffic -
> > > >         >     > >
> > > >         >     > >
> > > >         >     > > would this work ?
> > > >         >     > >
> > > >         >     >
> > > >         >     >
> > > >         >     >
> > > >         >     > --
> > > >         >     > Rafael Weingärtner
> > > >         >     >
> > > >         >
> > > >         >
> > > >         >
> > > >         >     --
> > > >         >     Rafael Weingärtner
> > > >         >
> > > >         >
> > > >         >
> > > >
> > > >
> > > >         --
> > > >         Rafael Weingärtner
> > > >
> > > >
> > > >
> > > >     Dag.Sonstebo@shapeblue.com
> > > >     www.shapeblue.com<http://www.shapeblue.com>
> > > >     53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > >     @shapeblue
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Dag.Sonstebo@shapeblue.com
> > > > www.shapeblue.com<http://www.shapeblue.com>
> > > > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> > > > @shapeblue
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
It is all basic networking at the moment for all the setups.


If you want me to I can setup a single NIC solution again and run any commands you need me to do.


FYI when I set up a single NIC I use the guided installation option in the UI rather than the manual setup which I do for the multiple NIC scenario.


Happy to set it up if it helps.




________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 07 June 2018 10:23
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Ok, so that explains the log message. This is looking like a bug to me. It
seems that with zone-wide storage the host state (when disconnected) is not
being properly identified due to this NFS check, and as a consequence it
has a side effect on VM HA.

We would need some inputs from guys that have advanced networking
deployments and Zone wide storage.

I do not see how the all in one NIC deployment scenario is working though.
This method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
simple: if no NFS storage pool is found for the host's cluster, KVM hosts
will be detected as "disconnected" and not down, with that warning message
you noticed.
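
A rough way to see exactly what the investigator sees is to query the
storage pools yourself. This is just a sketch against the default cloud
schema (adapt it to your own cluster ids):

    select id, name, pool_type, scope, cluster_id
    from cloud.storage_pool
    where removed is null
      and pool_type = 'NetworkFilesystem';

A zone-wide pool comes back with scope = 'ZONE' and cluster_id = NULL, so a
per-cluster lookup like the one in the investigator finds no NFS pool at all.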

When you say "all in one NIC", is it an advanced network deployment where
you put all traffic in a single network, or is it basic networking that
you are doing?

On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> zone wide.
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 07 June 2018 10:04
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of storage are you using? Zone wide? Or cluster "wide" storage?
>
> On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Rafael
> >
> >
> > Here is the output as requested -
> >
> >
> >
> > mysql> select * from cloud.storage_pool where removed is null\G
> > *************************** 1. row ***************************
> >                    id: 1
> >                  name: ds1
> >                  uuid: a234224f-05fb-3f4c-9b0f-c51ebdf9a601
> >             pool_type: NetworkFilesystem
> >                  port: 2049
> >        data_center_id: 1
> >                pod_id: NULL
> >            cluster_id: NULL
> >            used_bytes: 6059720704
> >        capacity_bytes: 79133933568
> >          host_address: 172.30.5.2
> >             user_info: NULL
> >                  path: /export/primary
> >               created: 2018-06-05 13:45:01
> >               removed: NULL
> >           update_time: NULL
> >                status: Up
> > storage_provider_name: DefaultPrimary
> >                 scope: ZONE
> >            hypervisor: KVM
> >               managed: 0
> >         capacity_iops: NULL
> > 1 row in set (0.00 sec)
> >
> > mysql>
> >
> > Do you think this problem is related to my NIC/bridge configuration or the
> > way I am configuring the zone ?
> >
> > Jon
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 06:45
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > Can you also post the result of:
> > select * from cloud.storage_pool where removed is null
> >
> > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Dag.Sonstebo@shapeblue.com
> >
> > wrote:
> >
> > > Hi Jon,
> > >
> > > Still confused where your primary storage pools are – are you sure your
> > > hosts are in cluster 1?
> > >
> > > Quick question just to make sure - assuming management/storage is on the
> > > same NIC when I setup basic networking the physical network has the
> > > management and guest icons already there and I just edit the KVM labels.
> > > If I am running storage over management do I need to drag the storage
> > > icon to the physical network and use the same KVM label (cloudbr0) as the
> > > management or does CS automatically just use the management NIC ie. I
> > > would only need to drag the storage icon across in basic setup if I
> > > wanted it on a different NIC/IP subnet ?  (hope that makes sense !)
> > >
> > > >> I would do both – set up your 2/3 physical networks, name isn’t that
> > > important – but then drag the traffic types to the correct one and make
> > > sure the labels are correct.
> > > Regards,
> > > Dag Sonstebo
> > > Cloud Architect
> > > ShapeBlue
> > >
> > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > >
> > >     Dag
> > >
> > >
> > >     Do you mean check the pools with "Infrastructure -> Primary Storage"
> > >     and "Infrastructure -> Secondary Storage" within the UI ?
> > >
> > >
> > >     If so Primary Storage has a state of UP, secondary storage does not
> > >     show a state as such so not sure where else to check it ?
> > >
> > >
> > >     Rerun of the command -
> > >
> > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > >     Empty set (0.00 sec)
> > >
> > >     mysql>
> > >
> > >     I think it is something to do with my zone creation rather than the
> > > NIC, bridge setup although I can post those if needed.
> > >
> > >     I may try to setup just the 2 NIC solution you mentioned although as
> > >     I say I had the same issue with that ie. host goes to "Alert" state
> > >     and same error messages.  The only time I can get it to go to "Down"
> > >     state is when it is all on the single NIC.
> > >
> > >     Quick question just to make sure - assuming management/storage is on
> > >     the same NIC when I setup basic networking the physical network has
> > >     the management and guest icons already there and I just edit the KVM
> > >     labels. If I am running storage over management do I need to drag the
> > >     storage icon to the physical network and use the same KVM label
> > >     (cloudbr0) as the management or does CS automatically just use the
> > >     management NIC ie. I would only need to drag the storage icon across
> > >     in basic setup if I wanted it on a different NIC/IP subnet ?  (hope
> > >     that makes sense !)
> > >
> > >     On the plus side I have been at this for so long now and done so many
> > >     rebuilds I could do it in my sleep now 😊
> > >
> > >
> > >     ________________________________
> > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > >     Sent: 06 June 2018 12:28
> > >     To: users@cloudstack.apache.org
> > >     Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > >     Looks OK to me Jon.
> > >
> > >     The one thing that throws me is your storage pools – can you rerun
> > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > >
> > >     Do the pools show up as online in the CloudStack GUI?
> > >
> > >     Regards,
> > >     Dag Sonstebo
> > >     Cloud Architect
> > >     ShapeBlue
> > >
> > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk>
> wrote:
> > >
> > >         Don't know whether this helps or not but I logged into the SSVM
> > > and ran an ifconfig -
> > >
> > >
> > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > >                 inet 127.0.0.1  netmask 255.0.0.0
> > >                 loop  txqueuelen 1  (Local Loopback)
> > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >
> > >         so it has interfaces in both the management and the storage
> > > subnets (as well as guest).
> > >
> > >
> > >
> > >         ________________________________
> > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > >         Sent: 06 June 2018 11:08
> > >         To: users@cloudstack.apache.org
> > >         Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > >         Hi Rafael
> > >
> > >
> > >         Thanks for the help, really appreciate it.
> > >
> > >
> > >         So rerunning that command with all servers up -
> > >
> > >
> > >
> > >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> > >         Empty set (0.00 sec)
> > >
> > >         mysql>
> > >
> > >
> > >         As for the storage IP no I'm not setting it to be the management
> > >         IP when I setup the zone but the output of the SQL command suggests
> > >         that is what has happened.
> > >
> > >         As I said to Dag I am using a different subnet for storage ie.
> > >
> > >         172.30.3.0/26  - management subnet
> > >         172.30.4.0/25 -  guest VM subnet
> > >         172.30.5.0/28 - storage
> > >
> > >         the NFS server IP is 172.30.5.2
> > >
> > >         each compute node has 3 NICs with an IP from each subnet (i am
> > > assuming the management node only needs an IP in the management network
> > ?)
> > >
> > >         When I add the zone in the UI I have one physical network with
> > > management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > >         When I fill in the storage traffic page I use the range
> > >         172.30.5.10 - 14 as free IPs as I exclude the ones already
> > >         allocated to the compute nodes and the NFS server.
> > >
> > >         I think maybe I am doing something wrong in the UI setup but it is
> > >         not obvious to me what it is.
> > >
> > >         What I might try today unless you want me to keep the setup I have
> > >         for more outputs is to go back to 2 NICs, one for storage/management
> > >         and one for guest VMs.
> > >
> > >         I think with the 2 NICs setup the mistake I made last time when
> > >         adding the zone was to assume storage would just run over management
> > >         so I did not drag and drop the storage icon and assign it to
> > >         cloudbr0 as with the management which I think is what I should do ?
> > >
> > >
> > >
> > >
> > >
> > >         ________________________________
> > >         From: Rafael Weingärtner <ra...@gmail.com>
> > >         Sent: 06 June 2018 10:54
> > >         To: users
> > >         Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > >         Jon, do not panic we are here to help you :)
> > >         So, I might have mistyped the SQL query. When you use "select *
> > >         from cloud.storage_pool where cluster_id = 1 and removed is not
> > >         null", you are listing the storage pools that have been removed.
> > >         Therefore, the right query would be "select * from
> > >         cloud.storage_pool where cluster_id = 1 and removed is null".
> > >
> > >         There is also something else I do not understand. You are setting
> > >         the storage IP in the management subnet? I am not sure if you
> > >         should be doing it like this. Normally, I set all my storages
> > >         (primary [when working with NFS] and secondary) to IPs in the
> > >         storage subnet.
> > >
> > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > Dag.Sonstebo@shapeblue.com>
> > >         wrote:
> > >
> > >         > Hi John,
> > >         >
> > >         > I’m late to this thread and have possibly missed some things
> –
> > > but a
> > >         > couple of observations:
> > >         >
> > >         > “When I add the zone and get to the storage web page I exclude
> > >         > the IPs already used for the compute node NICs and the NFS server
> > >         > itself. …..”
> > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> > >         > 172.30.5.10 -> 172.30.5.14.”
> > >         >
> > >         > I think you may have some confusion around the use of the storage
> > >         > network. The important part here is to understand this is for
> > >         > *secondary storage* use only – it has nothing to do with primary
> > >         > storage. This means this storage network needs to be accessible to
> > >         > the SSVM, to the hypervisors, and secondary storage NFS pools need
> > >         > to be accessible on this network.
> > >         >
> > >         > The important part – this also means you *cannot use the same IP
> > >         > ranges for management and storage networks* - doing so means you
> > >         > will have issues where effectively both hypervisors and SSVM can
> > >         > see the same subnet on two NICs – and you end up in a routing
> > >         > black hole.
> > >         >
> > >         > So – you need to either:
> > >         >
> > >         > 1) Use different IP subnets on management and storage, or
> > >         > 2) preferably just simplify your setup – stop using a secondary
> > >         > storage network altogether and just allow secondary storage to use
> > >         > the management network (which is default). Unless you have a very
> > >         > high I/O environment in production you are just adding complexity
> > >         > by running separate management and storage.
> > >         >
> > >         > Regards,
> > >         > Dag Sonstebo
> > >         > Cloud Architect
> > >         > ShapeBlue
> > >         >
> > >         > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk>
> > > wrote:
> > >         >
> > >         >     I will disconnect the host this morning and test but before
> > >         >     I do that I ran this command when all hosts are up -
> > >         >
> > >         >
> > >         >
> > >         >
> > >         >
> > >         >     mysql> select id, name, status, type, private_ip_address, storage_ip_address, cluster_id, public_ip_address from cloud.host;
> > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id | public_ip_address |
> > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 | 172.30.4.3        |
> > >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        |       NULL | 172.30.4.98       |
> > >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        |       NULL | 172.30.4.86       |
> > >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         |          1 | 172.30.4.4        |
> > >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 | 172.30.4.5        |
> > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > >         >     5 rows in set (0.00 sec)
> > >         >
> > >         >
> > >         >
> > >         >     and you can see that it says the storage IP address is the
> > >         >     same as the private IP address (the management network).
> > >         >
> > >         >
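> > >         >     (A narrower query shows the same mirroring without the
> > >         >     noise - a sketch, using only columns that appear in the
> > >         >     dump above:)
> > >         >
> > >         >     SELECT id, name, type, private_ip_address, storage_ip_address,
> > >         >            cluster_id
> > >         >     FROM cloud.host
> > >         >     WHERE removed IS NULL;
> > >         >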
> > >         >     I also ran the command you provided using the Cluster ID
> > > number from
> > >         > the table above -
> > >         >
> > >         >
> > >         >
> > >         >     mysql> select * from cloud.storage_pool where cluster_id
> =
> > 1
> > > and
> > >         > removed is not null;
> > >         >     Empty set (0.00 sec)
> > >         >
> > >         >     mysql>
> > >         >
> > >         >     So assuming I am reading this correctly that seems to be
> > the
> > > issue.
> > >         >
> > >         >
> > >         >     I am at a loss as to why though.
> > >         >
> > >         >
> > >         >     I have a separate NIC for storage as described. When I add
> > >         >     the zone and get to the storage web page I exclude the IPs
> > >         >     already used for the compute node NICs and the NFS server
> > >         >     itself. I do this because initially I didn't and the SSVM
> > >         >     started using the IP address of the NFS server.
> > >         >
> > >         >
> > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in
> is
> > >         > 172.30.5.10 -> 172.30.5.14.
> > >         >
> > >         >
> > >         >     And I used the label "cloudbr2" for storage.
> > >         >
> > >         >
> > >         >     I must be doing this wrong somehow.
> > >         >
> > >         >
> > >         >     Any pointers would be much appreciated.
> > >         >
> > >         >
> > >         >
> > >         >
> > >         >     ________________________________
> > >         >     From: Rafael Weingärtner <ra...@gmail.com>
> > >         >     Sent: 05 June 2018 16:13
> > >         >     To: users
> > >         >     Subject: Re: advanced networking with public IPs direct
> to
> > > VMs
> > >         >
> > >         >     That is interesting. Let's see the source of all truth...
> > >         >     This is the code that is generating that odd message.
> > >         >
> > >         >     >     List<StoragePoolVO> clusterPools =
> > >         >     >             _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > >         >     >     boolean hasNfs = false;
> > >         >     >     for (StoragePoolVO pool : clusterPools) {
> > >         >     >         if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > >         >     >             hasNfs = true;
> > >         >     >             break;
> > >         >     >         }
> > >         >     >     }
> > >         >     >     if (!hasNfs) {
> > >         >     >         s_logger.warn("Agent investigation was requested on host " + agent
> > >         >     >                 + ", but host does not support investigation because it has"
> > >         >     >                 + " no NFS storage. Skipping investigation.");
> > >         >     >         return Status.Disconnected;
> > >         >     >     }
> > >         >     >
> > >         >
> > >         >     There are two possibilities here. You do not have any NFS
> > > storage? Is
> > >         > that
> > >         >     the case? Or maybe, for some reason, the call
> > >         >     "_storagePoolDao.listPoolsByCluster(agent.
> getClusterId())"
> > > is not
> > >         > returning
> > >         >     any NFS storage pools. Looking at the
> "listPoolsByCluster "
> > > we will see
> > >         >     that the following SQL is used:
> > >         >
> > >         >     Select * from storage_pool where cluster_id =
> > > <host'sClusterId> and
> > >         > removed
> > >         >     > is not null
> > >         >     >
> > >         >
> > >         >     Can you run that SQL to see what it returns when your hosts
> > >         >     are marked as disconnected?
> > >         >
> > >         >
> > >         >
> > >         >
> > >         >
> > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > > jms.123@hotmail.co.uk>
> > >         > wrote:
> > >         >
> > >         >     > I reran the tests with the 3 NIC setup. When I
> configured
> > > the zone
> > >         > through
> > >         >     > the UI I used the labels cloudbr0 for management,
> > cloudbr1
> > > for guest
> > >         >     > traffic and cloudbr2 for NFS as per my original
> response
> > > to you.
> > >         >     >
> > >         >     >
> > >         >     > When I pull the power to the node (dcp-cscn2.local)
> after
> > > about 5
> > >         > mins
> > >         >     > the  host status goes to "Alert" but never to "Down"
> > >         >     >
> > >         >     >
> > >         >     > I get this in the logs -
> > >         >     >
> > >         >     >
> > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent
> > > investigation
> > >         > was
> > >         >     > requested on host Host[-4-Routing], but host does not
> > > support
> > >         > investigation
> > >         >     > because it has no NFS storage. Skipping investigation.
> > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.
> > > HighAvailabilityManagerImpl]
> > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93)
> > > KVMInvestigator was
> > >         > able to
> > >         >     > determine host 4 is in Disconnected
> > >         >     > 2018-06-05 15:17:14,382 INFO
> [c.c.a.m.AgentManagerImpl]
> > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The
> agent
> > > from host
> > >         > 4 state
> > >         >     > determined is Disconnected
> > >         >     > 2018-06-05 15:17:14,382 WARN
> [c.c.a.m.AgentManagerImpl]
> > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent
> is
> > >         > disconnected but
> > >         >     > the host is still up: 4-dcp-cscn2.local
> > >         >     >
> > >         >     > I don't understand why it thinks there is no NFS
> storage
> > > as each
> > >         > compute
> > >         >     > node has a dedicated storage NIC.
> > >         >     >
> > >         >     >
> > >         >     > I also don't understand why it thinks the host is still
> > up
> > > ie. what
> > >         > test
> > >         >     > is it doing to determine that ?
> > >         >     >
> > >         >     >
> > >         >     > Am I just trying to get something working that is not
> > > supported ?
> > >         >     >
> > >         >     >
> > >         >     > ________________________________
> > >         >     > From: Rafael Weingärtner <ra...@gmail.com>
> > >         >     > Sent: 04 June 2018 15:31
> > >         >     > To: users
> > >         >     > Subject: Re: advanced networking with public IPs direct
> > to
> > > VMs
> > >         >     >
> > >         >     > What type of failover are you talking about?
> > >         >     > What ACS version are you using?
> > >         >     > What hypervisor are you using?
> > >         >     > How are you configuring your NICs in the hypervisor?
> > >         >     > How are you configuring the traffic labels in ACS?
> > >         >     >
> > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > > jms.123@hotmail.co.uk
> > >         > >
> > >         >     > wrote:
> > >         >     >
> > >         >     > > Hi all
> > >         >     > >
> > >         >     > >
> > >         >     > > I am close to giving up on basic networking as I just
> > > cannot get
> > >         > failover
> > >         >     > > working with multiple NICs (I am not even sure it is
> > > supported).
> > >         >     > >
> > >         >     > >
> > >         >     > > What I would like is to use 3 NICs for management,
> > > storage and
> > >         > guest
> > >         >     > > traffic. I would like to assign public IPs direct to
> > the
> > > VMs which
> > >         > is
> > >         >     > why I
> > >         >     > > originally chose basic.
> > >         >     > >
> > >         >     > >
> > >         >     > > If I switch to advanced networking do I just
> configure
> > a
> > > guest VM
> > >         > with
> > >         >     > > public IPs on one NIC and not both with the public
> > > traffic -
> > >         >     > >
> > >         >     > >
> > >         >     > > would this work ?
> > >         >     > >
> > >         >     >
> > >         >     >
> > >         >     >
> > >         >     > --
> > >         >     > Rafael Weingärtner
> > >         >     >
> > >         >
> > >         >
> > >         >
> > >         >     --
> > >         >     Rafael Weingärtner
> > >         >
> > >         >
> > >         >
> > >
> > >
> > >         --
> > >         Rafael Weingärtner
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> Rafael Weingärtner
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
Ok, so that explains the log message. This is looking like a bug to me. It
seems that with zone-wide storage the host state (when disconnected) is not
being properly identified due to this NFS check, and as a consequence it has
a side effect on VM HA.

We would need some inputs from guys that have advanced networking
deployments and Zone wide storage.

I do not see how the all-in-one-NIC deployment scenario is working, though.
The method "com.cloud.ha.KVMInvestigator.isAgentAlive(Host)" is dead
simple: if no NFS storage pools are found in the host's cluster, KVM hosts
will be detected as "disconnected" rather than "down", with that warning
message you noticed.

When you say "all in one NIC", is it an advanced networking deployment where
you put all traffic in a single network, or is it basic networking that
you are doing?
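
To make the failure mode concrete, here is a sketch of the two lookups
against the cloud.storage_pool rows you posted (the second query is only an
illustration of what a zone-aware check would have to match - it is not the
actual DAO code):

-- what the cluster-scoped lookup effectively sees:
SELECT * FROM cloud.storage_pool
WHERE cluster_id = 1            -- the host's cluster
  AND removed IS NULL
  AND pool_type = 'NetworkFilesystem';
-- empty in your case, because the zone-wide pool has cluster_id = NULL

-- a zone-aware check would also have to match scope = 'ZONE':
SELECT * FROM cloud.storage_pool
WHERE removed IS NULL
  AND pool_type = 'NetworkFilesystem'
  AND (cluster_id = 1 OR (scope = 'ZONE' AND data_center_id = 1));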

On Thu, Jun 7, 2018 at 6:06 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> zone wide.
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 07 June 2018 10:04
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of storage are you using? Zone wide? Or cluster "wide" storage?
>
> On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Rafael
> >
> >
> > Here is the output as requested -
> >
> >
> >
> > mysql> select * from cloud.storage_pool where removed is null;
> > +----+------+-------------------+----------------+--------+------------+--------------+-----------------+--------+-------+
> > | id | name | pool_type         | data_center_id | pod_id | cluster_id | host_address | path            | status | scope |
> > +----+------+-------------------+----------------+--------+------------+--------------+-----------------+--------+-------+
> > |  1 | ds1  | NetworkFilesystem |              1 |   NULL |       NULL | 172.30.5.2   | /export/primary | Up     | ZONE  |
> > +----+------+-------------------+----------------+--------+------------+--------------+-----------------+--------+-------+
> > 1 row in set (0.00 sec)   (other columns trimmed for readability)
> >
> > mysql>
> >
> > Do you think this problem is related to my NIC/bridge configuration or
> the
> > way I am configuring the zone ?
> >
> > Jon
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 07 June 2018 06:45
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > Can you also post the result of:
> > select * from cloud.storage_pool where removed is null
> >
> > On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Dag.Sonstebo@shapeblue.com
> >
> > wrote:
> >
> > > Hi Jon,
> > >
> > > Still confused where your primary storage pools are – are you sure your
> > > hosts are in cluster 1?
> > >
> > > Quick question just to make sure - assuming management/storage is on
> the
> > > same NIC when I setup basic networking the physical network has the
> > > management and guest icons already there and I just edit the KVM
> labels.
> > If
> > > I am running storage over management do I need to drag the storage icon
> > to
> > > the physical network and use the same KVM label (cloudbr0) as the
> > > management or does CS automatically just use the management NIC ie. I
> > would
> > > only need to drag the storage icon across in basic setup if I wanted it
> > on
> > > a different NIC/IP subnet ?  (hope that makes sense !)
> > >
> > > >> I would do both – set up your 2/3 physical networks, name isn’t that
> > > important – but then drag the traffic types to the correct one and make
> > > sure the labels are correct.
> > > Regards,
> > > Dag Sonstebo
> > > Cloud Architect
> > > ShapeBlue
> > >
> > > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> > >
> > >     Dag
> > >
> > >
> > >     Do you mean  check the pools with "Infrastructure -> Primary
> Storage"
> > > and "Infrastructure -> Secondary Storage" within the UI ?
> > >
> > >
> > >     If so Primary Storage has a state of UP, secondary storage does not
> > > show a state as such so not sure where else to check it ?
> > >
> > >
> > >     Rerun of the command -
> > >
> > >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> > >     Empty set (0.00 sec)
> > >
> > >     mysql>
> > >
> > >     I think it is something to do with my zone creation rather than the
> > > NIC, bridge setup although I can post those if needed.
> > >
> > >     I may try to setup just the 2 NIC solution you mentioned although as
> > >     I say I had the same issue with that ie. host goes to "Alert" state
> > >     and the same error messages.  The only time I can get it to go to
> > >     "Down" state is when it is all on the single NIC.
> > >
> > >     Quick question just to make sure - assuming management/storage is
> on
> > > the same NIC when I setup basic networking the physical network has the
> > > management and guest icons already there and I just edit the KVM
> labels.
> > If
> > > I am running storage over management do I need to drag the storage icon
> > to
> > > the physical network and use the same KVM label (cloudbr0) as the
> > > management or does CS automatically just use the management NIC ie. I
> > would
> > > only need to drag the storage icon across in basic setup if I wanted it
> > on
> > > a different NIC/IP subnet ?  (hope that makes sense !)
> > >
> > >     On the plus side I have been at this for so long now and done so
> many
> > > rebuilds I could do it in my sleep now 😊
> > >
> > >
> > >     ________________________________
> > >     From: Dag Sonstebo <Da...@shapeblue.com>
> > >     Sent: 06 June 2018 12:28
> > >     To: users@cloudstack.apache.org
> > >     Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > >     Looks OK to me Jon.
> > >
> > >     The one thing that throws me is your storage pools – can you rerun
> > > your query: select * from cloud.storage_pool where cluster_id = 1;
> > >
> > >     Do the pools show up as online in the CloudStack GUI?
> > >
> > >     Regards,
> > >     Dag Sonstebo
> > >     Cloud Architect
> > >     ShapeBlue
> > >
> > >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk>
> wrote:
> > >
> > >         Don't know whether this helps or not but I logged into the SSVM
> > > and ran an ifconfig -
> > >
> > >
> > >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> > >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> > >                 RX packets 141  bytes 20249 (19.7 KiB)
> > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > >                 TX packets 108  bytes 16287 (15.9 KiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> > >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> > >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> > >                 RX errors 0  dropped 44573  overruns 0  frame 0
> > >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> > >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> > >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> > >                 RX errors 0  dropped 39456  overruns 0  frame 0
> > >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> > >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> > >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> > >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> > >                 RX errors 0  dropped 39446  overruns 0  frame 0
> > >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> > >                 inet 127.0.0.1  netmask 255.0.0.0
> > >                 loop  txqueuelen 1  (Local Loopback)
> > >                 RX packets 18  bytes 1440 (1.4 KiB)
> > >                 RX errors 0  dropped 0  overruns 0  frame 0
> > >                 TX packets 18  bytes 1440 (1.4 KiB)
> > >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> > >
> > >
> > >         so it has interfaces in both the management and the storage
> > > subnets (as well as guest).
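> > >
> > >         (If it helps, the same NIC info can be cross-checked from the
> > >         database - a sketch, with table and column names from the 4.11
> > >         schema as I remember them, so verify before relying on it:)
> > >
> > >         SELECT n.ip4_address, n.netmask, n.mac_address, nw.traffic_type
> > >         FROM cloud.nics n
> > >         JOIN cloud.networks nw ON nw.id = n.network_id
> > >         WHERE n.instance_id = (SELECT id FROM cloud.vm_instance
> > >                                WHERE name = 's-1-VM' AND removed IS NULL);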
> > >
> > >
> > >
> > >         ________________________________
> > >         From: Jon Marshall <jm...@hotmail.co.uk>
> > >         Sent: 06 June 2018 11:08
> > >         To: users@cloudstack.apache.org
> > >         Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > >         Hi Rafael
> > >
> > >
> > >         Thanks for the help, really appreciate it.
> > >
> > >
> > >         So rerunning that command with all servers up -
> > >
> > >
> > >
> > >         mysql> select * from cloud.storage_pool where cluster_id = 1
> and
> > > removed is null;
> > >         Empty set (0.00 sec)
> > >
> > >         mysql>
> > >
> > >
> > >         As for the storage IP no I'm not setting it to be the
> management
> > > IP when I setup the zone but the output of the SQL command suggests
> that
> > is
> > > what has happened.
> > >
> > >         As I said to Dag I am using a different subnet for storage ie.
> > >
> > >         172.30.3.0/26  - management subnet
> > >         172.30.4.0/25 -  guest VM subnet
> > >         172.30.5.0/28 - storage
> > >
> > >         the NFS server IP is 172.30.5.2
> > >
> > >         each compute node has 3 NICs with an IP from each subnet (i am
> > > assuming the management node only needs an IP in the management network
> > ?)
> > >
> > >         When I add the zone in the UI I have one physical network with
> > > management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> > >         When I fill in the storage traffic page I use the range
> > >         172.30.5.10 - 14 as free IPs, as I exclude the ones already
> > >         allocated to the compute nodes and the NFS server.
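> > >
> > >         (To double-check what actually got recorded for the traffic
> > >         labels, something like this should show them - again a sketch,
> > >         table names from memory of the schema, so treat it as an
> > >         assumption:)
> > >
> > >         SELECT pn.name, t.traffic_type, t.kvm_network_label
> > >         FROM cloud.physical_network_traffic_types t
> > >         JOIN cloud.physical_network pn ON pn.id = t.physical_network_id;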
> > >
> > >         I think maybe I am doing something wrong in the UI setup but it
> > is
> > > not obvious to me what it is.
> > >
> > >         What I might try today unless you want me to keep the setup I
> > have
> > > for more outputs is to go back to 2 NICs, one for storage/management
> and
> > > one for guest VMs.
> > >
> > >         I think with the 2 NICs setup the mistake I made last time when
> > > adding the zone was to assume storage would just run over management
> so I
> > > did not drag and drop the storage icon and assign it to cloudbr0 as
> with
> > > the management which I think is what I should do ?
> > >
> > >
> > >
> > >
> > >
> > >         ________________________________
> > >         From: Rafael Weingärtner <ra...@gmail.com>
> > >         Sent: 06 June 2018 10:54
> > >         To: users
> > >         Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > >         Jon, do not panic, we are here to help you :)
> > >         So, I might have mistyped the SQL query. If you use "select * from
> > >         cloud.storage_pool where cluster_id = 1 and removed is not null",
> > >         you are listing the storage pools that were removed. Therefore,
> > >         the right query would be "select * from cloud.storage_pool where
> > >         cluster_id = 1 and removed is null".
> > >
> > >         There is also something else I do not understand. You are
> setting
> > > the
> > >         storage IP in the management subnet? I am not sure if you
> should
> > > be doing
> > >         like this. Normally, I set all my storages (primary[when
> working
> > > with NFS]
> > >         and secondary) to IPs in the storage subnet.
> > >
> > >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > > Dag.Sonstebo@shapeblue.com>
> > >         wrote:
> > >
> > >         > Hi John,
> > >         >
> > >         > I’m late to this thread and have possibly missed some things
> –
> > > but a
> > >         > couple of observations:
> > >         >
> > >         > “When I add the zone and get to the storage web page I
> exclude
> > > the IPs
> > >         > already used for the compute node NICs and the NFS server
> > > itself. …..”
> > >         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> > > 172.30.5.10
> > >         > -> 172.30.5.14.”
> > >         >
> > >         > I think you may have some confusion around the use of the
> > > storage network.
> > >         > The important part here is to understand this is for
> *secondary
> > > storage*
> > >         > use only – it has nothing to do with primary storage. This
> > means
> > > this
> > >         > storage network needs to be accessible to the SSVM, to the
> > > hypervisors, and
> > >         > secondary storage NFS pools needs to be accessible on this
> > > network.
> > >         >
> > >         > The important part – this also means you *can not use the
> same
> > > IP ranges
> > >         > for management and storage networks* - doing so means you
> will
> > > have issues
> > >         > where effectively both hypervisors and SSVM can see the same
> > > subnet on two
> > >         > NICs – and you end up in a routing black hole.
> > >         >
> > >         > So – you need to either:
> > >         >
> > >         > 1) Use different IP subnets on management and storage, or
> > >         > 2) preferably just simplify your setup – stop using a
> secondary
> > > storage
> > >         > network altogether and just allow secondary storage to use
> the
> > > management
> > >         > network (which is default). Unless you have a very high I/O
> > > environment in
> > >         > production you are just adding complexity by running separate
> > > management
> > >         > and storage.
> > >         >
> > >         > Regards,
> > >         > Dag Sonstebo
> > >         > Cloud Architect
> > >         > ShapeBlue
> > >         >
> > >         > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk>
> > > wrote:
> > >         >
> > >         >     I will disconnect the host this morning and test but
> before
> > > I do that
> > >         > I ran this command when all hosts are up -
> > >         >
> > >         >
> > >         >
> > >         >
> > >         >
> > >         >      select * from cloud.host;
> > >         >     ... [wide output trimmed for readability; the relevant
> > >         >     columns are] ...
> > >         >
> > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id | public_ip_address |
> > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 | 172.30.4.3        |
> > >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        |       NULL | 172.30.4.98       |
> > >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        |       NULL | 172.30.4.86       |
> > >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         |          1 | 172.30.4.4        |
> > >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 | 172.30.4.5        |
> > >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> > >         >     5 rows in set (0.00 sec)
> > >         >
> > >         >
> > >         >
> > >         >     and you can see that it says the storage IP address is the
> > >         >     same as the private IP address (the management network).
> > >         >
> > >         >
> > >         >     I also ran the command you provided using the Cluster ID
> > > number from
> > >         > the table above -
> > >         >
> > >         >
> > >         >
> > >         >     mysql> select * from cloud.storage_pool where cluster_id
> =
> > 1
> > > and
> > >         > removed is not null;
> > >         >     Empty set (0.00 sec)
> > >         >
> > >         >     mysql>
> > >         >
> > >         >     So assuming I am reading this correctly that seems to be
> > the
> > > issue.
> > >         >
> > >         >
> > >         >     I am at a loss as to why though.
> > >         >
> > >         >
> > >         >     I have a separate NIC for storage as described. When I add
> > >         >     the zone and get to the storage web page I exclude the IPs
> > >         >     already used for the compute node NICs and the NFS server
> > >         >     itself. I do this because initially I didn't and the SSVM
> > >         >     started using the IP address of the NFS server.
> > >         >
> > >         >
> > >         >     So the range is 172.30.5.1 -> 15 and the range I fill in
> is
> > >         > 172.30.5.10 -> 172.30.5.14.
> > >         >
> > >         >
> > >         >     And I used the label "cloudbr2" for storage.
> > >         >
> > >         >
> > >         >     I must be doing this wrong somehow.
> > >         >
> > >         >
> > >         >     Any pointers would be much appreciated.
> > >         >
> > >         >
> > >         >
> > >         >
> > >         >     ________________________________
> > >         >     From: Rafael Weingärtner <ra...@gmail.com>
> > >         >     Sent: 05 June 2018 16:13
> > >         >     To: users
> > >         >     Subject: Re: advanced networking with public IPs direct
> to
> > > VMs
> > >         >
> > >         >     That is interesting. Let's see the source of all truth...
> > >         >     This is the code that is generating that odd message.
> > >         >
> > >         >     >     List<StoragePoolVO> clusterPools =
> > >         >     >             _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > >         >     >     boolean hasNfs = false;
> > >         >     >     for (StoragePoolVO pool : clusterPools) {
> > >         >     >         if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > >         >     >             hasNfs = true;
> > >         >     >             break;
> > >         >     >         }
> > >         >     >     }
> > >         >     >     if (!hasNfs) {
> > >         >     >         s_logger.warn("Agent investigation was requested on host " + agent
> > >         >     >                 + ", but host does not support investigation because it has"
> > >         >     >                 + " no NFS storage. Skipping investigation.");
> > >         >     >         return Status.Disconnected;
> > >         >     >     }
> > >         >     >
> > >         >
> > >         >     There are two possibilities here. You do not have any NFS
> > > storage? Is
> > >         > that
> > >         >     the case? Or maybe, for some reason, the call
> > >         >     "_storagePoolDao.listPoolsByCluster(agent.
> getClusterId())"
> > > is not
> > >         > returning
> > >         >     any NFS storage pools. Looking at the
> "listPoolsByCluster "
> > > we will see
> > >         >     that the following SQL is used:
> > >         >
> > >         >     Select * from storage_pool where cluster_id =
> > > <host'sClusterId> and
> > >         > removed
> > >         >     > is not null
> > >         >     >
> > >         >
> > >         >     Can you run that SQL to see what it returns when your hosts
> > >         >     are marked as disconnected?
> > >         >
> > >         >
> > >         >
> > >         >
> > >         >
> > >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > > jms.123@hotmail.co.uk>
> > >         > wrote:
> > >         >
> > >         >     > I reran the tests with the 3 NIC setup. When I
> configured
> > > the zone
> > >         > through
> > >         >     > the UI I used the labels cloudbr0 for management,
> > cloudbr1
> > > for guest
> > >         >     > traffic and cloudbr2 for NFS as per my original
> response
> > > to you.
> > >         >     >
> > >         >     >
> > >         >     > When I pull the power to the node (dcp-cscn2.local)
> after
> > > about 5
> > >         > mins
> > >         >     > the  host status goes to "Alert" but never to "Down"
> > >         >     >
> > >         >     >
> > >         >     > I get this in the logs -
> > >         >     >
> > >         >     >
> > >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent
> > > investigation
> > >         > was
> > >         >     > requested on host Host[-4-Routing], but host does not
> > > support
> > >         > investigation
> > >         >     > because it has no NFS storage. Skipping investigation.
> > >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.
> > > HighAvailabilityManagerImpl]
> > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93)
> > > KVMInvestigator was
> > >         > able to
> > >         >     > determine host 4 is in Disconnected
> > >         >     > 2018-06-05 15:17:14,382 INFO
> [c.c.a.m.AgentManagerImpl]
> > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The
> agent
> > > from host
> > >         > 4 state
> > >         >     > determined is Disconnected
> > >         >     > 2018-06-05 15:17:14,382 WARN
> [c.c.a.m.AgentManagerImpl]
> > >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent
> is
> > >         > disconnected but
> > >         >     > the host is still up: 4-dcp-cscn2.local
> > >         >     >
> > >         >     > I don't understand why it thinks there is no NFS
> storage
> > > as each
> > >         > compute
> > >         >     > node has a dedicated storage NIC.
> > >         >     >
> > >         >     >
> > >         >     > I also don't understand why it thinks the host is still
> > up
> > > ie. what
> > >         > test
> > >         >     > is it doing to determine that ?
> > >         >     >
> > >         >     >
> > >         >     > Am I just trying to get something working that is not
> > > supported ?
> > >         >     >
> > >         >     >
> > >         >     > ________________________________
> > >         >     > From: Rafael Weingärtner <ra...@gmail.com>
> > >         >     > Sent: 04 June 2018 15:31
> > >         >     > To: users
> > >         >     > Subject: Re: advanced networking with public IPs direct
> > to
> > > VMs
> > >         >     >
> > >         >     > What type of failover are you talking about?
> > >         >     > What ACS version are you using?
> > >         >     > What hypervisor are you using?
> > >         >     > How are you configuring your NICs in the hypervisor?
> > >         >     > How are you configuring the traffic labels in ACS?
> > >         >     >
> > >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > > jms.123@hotmail.co.uk
> > >         > >
> > >         >     > wrote:
> > >         >     >
> > >         >     > > Hi all
> > >         >     > >
> > >         >     > >
> > >         >     > > I am close to giving up on basic networking as I just
> > > cannot get
> > >         > failover
> > >         >     > > working with multiple NICs (I am not even sure it is
> > > supported).
> > >         >     > >
> > >         >     > >
> > >         >     > > What I would like is to use 3 NICs for management,
> > > storage and
> > >         > guest
> > >         >     > > traffic. I would like to assign public IPs direct to
> > the
> > > VMs which
> > >         > is
> > >         >     > why I
> > >         >     > > originally chose basic.
> > >         >     > >
> > >         >     > >
> > >         >     > > If I switch to advanced networking do I just
> configure
> > a
> > > guest VM
> > >         > with
> > >         >     > > public IPs on one NIC and not both with the public
> > > traffic -
> > >         >     > >
> > >         >     > >
> > >         >     > > would this work ?
> > >         >     > >
> > >         >     >
> > >         >     >
> > >         >     >
> > >         >     > --
> > >         >     > Rafael Weingärtner
> > >         >     >
> > >         >
> > >         >
> > >         >
> > >         >     --
> > >         >     Rafael Weingärtner
> > >         >
> > >         >
> > >         >
> > >
> > >
> > >         --
> > >         Rafael Weingärtner
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
zone wide.


________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 07 June 2018 10:04
To: users
Subject: Re: advanced networking with public IPs direct to VMs

What type of storage are you using? Zone wide? Or cluster "wide" storage?

On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Rafael
>
>
> Here is the output as requested -
>
>
>
> mysql> select * from cloud.storage_pool where removed is null;
> +----+------+-------------------+----------------+--------+------------+--------------+-----------------+--------+-------+
> | id | name | pool_type         | data_center_id | pod_id | cluster_id | host_address | path            | status | scope |
> +----+------+-------------------+----------------+--------+------------+--------------+-----------------+--------+-------+
> |  1 | ds1  | NetworkFilesystem |              1 |   NULL |       NULL | 172.30.5.2   | /export/primary | Up     | ZONE  |
> +----+------+-------------------+----------------+--------+------------+--------------+-----------------+--------+-------+
> 1 row in set (0.00 sec)   (other columns trimmed for readability)
>
> mysql>
>
> Do you think this problem is related to my NIC/bridge configuration or the
> way I am configuring the zone ?
>
> Jon
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 07 June 2018 06:45
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Can you also post the result of:
> select * from cloud.storage_pool where removed is null
>
> On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Da...@shapeblue.com>
> wrote:
>
> > Hi Jon,
> >
> > Still confused where your primary storage pools are – are you sure your
> > hosts are in cluster 1?
> >
> > Quick question just to make sure - assuming management/storage is on the
> > same NIC when I setup basic networking the physical network has the
> > management and guest icons already there and I just edit the KVM labels.
> If
> > I am running storage over management do I need to drag the storage icon
> to
> > the physical network and use the same KVM label (cloudbr0) as the
> > management or does CS automatically just use the management NIC ie. I
> would
> > only need to drag the storage icon across in basic setup if I wanted it
> on
> > a different NIC/IP subnet ?  (hope that makes sense !)
> >
> > >> I would do both – set up your 2/3 physical networks, name isn’t that
> > important – but then drag the traffic types to the correct one and make
> > sure the labels are correct.
> > Regards,
> > Dag Sonstebo
> > Cloud Architect
> > ShapeBlue
> >
> > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> >
> >     Dag
> >
> >
> >     Do you mean  check the pools with "Infrastructure -> Primary Storage"
> > and "Infrastructure -> Secondary Storage" within the UI ?
> >
> >
> >     If so Primary Storage has a state of UP, secondary storage does not
> > show a state as such so not sure where else to check it ?
> >
> >
> >     Rerun of the command -
> >
> >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> >     Empty set (0.00 sec)
> >
> >     mysql>
> >
> >     I think it is something to do with my zone creation rather than the
> > NIC, bridge setup although I can post those if needed.
> >
> >     I may try to setup just the 2 NIC solution you mentioned although as
> >     I say I had the same issue with that ie. host goes to "Alert" state
> >     and the same error messages.  The only time I can get it to go to
> >     "Down" state is when it is all on the single NIC.
> >
> >     Quick question just to make sure - assuming management/storage is on
> > the same NIC when I setup basic networking the physical network has the
> > management and guest icons already there and I just edit the KVM labels.
> If
> > I am running storage over management do I need to drag the storage icon
> to
> > the physical network and use the same KVM label (cloudbr0) as the
> > management or does CS automatically just use the management NIC ie. I
> would
> > only need to drag the storage icon across in basic setup if I wanted it
> on
> > a different NIC/IP subnet ?  (hope that makes sense !)
> >
> >     On the plus side I have been at this for so long now and done so many
> > rebuilds I could do it in my sleep now 😊
> >
> >
> >     ________________________________
> >     From: Dag Sonstebo <Da...@shapeblue.com>
> >     Sent: 06 June 2018 12:28
> >     To: users@cloudstack.apache.org
> >     Subject: Re: advanced networking with public IPs direct to VMs
> >
> >     Looks OK to me Jon.
> >
> >     The one thing that throws me is your storage pools – can you rerun
> > your query: select * from cloud.storage_pool where cluster_id = 1;
> >
> >     Do the pools show up as online in the CloudStack GUI?
> >
> >     Regards,
> >     Dag Sonstebo
> >     Cloud Architect
> >     ShapeBlue
> >
> >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> >
> >         Don't know whether this helps or not but I logged into the SSVM
> > and ran an ifconfig -
> >
> >
> >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
> >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> >                 RX packets 141  bytes 20249 (19.7 KiB)
> >                 RX errors 0  dropped 0  overruns 0  frame 0
> >                 TX packets 108  bytes 16287 (15.9 KiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> >
> >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
> >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> >                 RX errors 0  dropped 44573  overruns 0  frame 0
> >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> >
> >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
> >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> >                 RX errors 0  dropped 39456  overruns 0  frame 0
> >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> >
> >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
> >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> >                 RX errors 0  dropped 39446  overruns 0  frame 0
> >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> >
> >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> >                 inet 127.0.0.1  netmask 255.0.0.0
> >                 loop  txqueuelen 1  (Local Loopback)
> >                 RX packets 18  bytes 1440 (1.4 KiB)
> >                 RX errors 0  dropped 0  overruns 0  frame 0
> >                 TX packets 18  bytes 1440 (1.4 KiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> >
> >
> >         so it has interfaces in both the management and the storage
> > subnets (as well as guest).
> >
> >
> >
> >         ________________________________
> >         From: Jon Marshall <jm...@hotmail.co.uk>
> >         Sent: 06 June 2018 11:08
> >         To: users@cloudstack.apache.org
> >         Subject: Re: advanced networking with public IPs direct to VMs
> >
> >         Hi Rafael
> >
> >
> >         Thanks for the help, really appreciate it.
> >
> >
> >         So rerunning that command with all servers up -
> >
> >
> >
> >         mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
> >         Empty set (0.00 sec)
> >
> >         mysql>
> >
> >
> >         As for the storage IP - no, I'm not setting it to be the management
> >         IP when I set up the zone, but the output of the SQL command suggests
> >         that is what has happened.
> >
> >         As I said to Dag I am using a different subnet for storage ie.
> >
> >         172.30.3.0/26  - management subnet
> >         172.30.4.0/25 -  guest VM subnet
> >         172.30.5.0/28 - storage
> >
> >         the NFS server IP is 172.30.5.2
> >
> >         each compute node has 3 NICs with an IP from each subnet (I am
> >         assuming the management node only needs an IP in the management
> >         network?)
> >
> >         When I add the zone in the UI I have one physical network with
> >         management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> >         When I fill in the storage traffic page I use the range
> >         172.30.5.10 - 14 as free IPs, as I exclude the ones already allocated
> >         to the compute nodes and the NFS server.
> >
> >         I think maybe I am doing something wrong in the UI setup but it is
> >         not obvious to me what it is.
> >
> >         What I might try today, unless you want me to keep the setup I have
> >         for more outputs, is to go back to 2 NICs, one for storage/management
> >         and one for guest VMs.
> >
> >         I think the mistake I made last time with the 2 NIC setup was to
> >         assume storage would just run over management when adding the zone,
> >         so I did not drag and drop the storage icon and assign it to cloudbr0
> >         as I did with management - which I think is what I should have done?
> >
> >
> >
> >
> >
> >         ________________________________
> >         From: Rafael Weingärtner <ra...@gmail.com>
> >         Sent: 06 June 2018 10:54
> >         To: users
> >         Subject: Re: advanced networking with public IPs direct to VMs
> >
> >         Jon, do not panic, we are here to help you :)
> >         So, I might have mistyped the SQL query. If you use "select * from
> >         cloud.storage_pool where cluster_id = 1 and removed is not null", you
> >         are listing the storage pools that have been removed. Therefore, the
> >         right query would be "select * from cloud.storage_pool where
> >         cluster_id = 1 and removed is null".
> >
> >         There is also something else I do not understand. You are setting the
> >         storage IP in the management subnet? I am not sure if you should be
> >         doing it like this. Normally, I set all my storages (primary [when
> >         working with NFS] and secondary) to IPs in the storage subnet.
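> >
> >         For example, to list only the active pools for cluster 1 (a sketch,
> >         reusing the cluster id from earlier in this thread):
> >
> >         -- removed IS NOT NULL -> pools that have been deleted
> >         -- removed IS NULL     -> pools that are still active
> >         SELECT id, name, pool_type, removed
> >         FROM cloud.storage_pool
> >         WHERE cluster_id = 1 AND removed IS NULL;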
> >
> >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > Dag.Sonstebo@shapeblue.com>
> >         wrote:
> >
> >         > Hi Jon,
> >         >
> >         > I’m late to this thread and have possibly missed some things – but a
> >         > couple of observations:
> >         >
> >         > “When I add the zone and get to the storage web page I exclude the IPs
> >         > already used for the compute node NICs and the NFS server itself. …..”
> >         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> >         > 172.30.5.10 -> 172.30.5.14.”
> >         >
> >         > I think you may have some confusion around the use of the storage
> >         > network. The important part here is to understand this is for
> >         > *secondary storage* use only – it has nothing to do with primary
> >         > storage. This means this storage network needs to be accessible to
> >         > the SSVM and to the hypervisors, and secondary storage NFS pools need
> >         > to be accessible on this network.
> >         >
> >         > The important part – this also means you *can not use the same IP
> >         > ranges for management and storage networks* - doing so means you will
> >         > have issues where effectively both hypervisors and SSVM can see the
> >         > same subnet on two NICs – and you end up in a routing black hole.
> >         >
> >         > So – you need to either:
> >         >
> >         > 1) Use different IP subnets on management and storage, or
> >         > 2) preferably just simplify your setup – stop using a secondary
> >         > storage network altogether and just allow secondary storage to use
> >         > the management network (which is the default). Unless you have a very
> >         > high I/O environment in production you are just adding complexity by
> >         > running separate management and storage.
> >         >
> >         > Regards,
> >         > Dag Sonstebo
> >         > Cloud Architect
> >         > ShapeBlue
> >         >
> >         > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk>
> > wrote:
> >         >
> >         >     I will disconnect the host this morning and test but before
> > I do that
> >         > I ran this command when all hosts are up -
> >         >
> >         >
> >         >
> >         >
> >         >
> >         >     select id, name, status, type, private_ip_address, storage_ip_address, cluster_id, public_ip_address from cloud.host;
> >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> >         >     | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id | public_ip_address |
> >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> >         >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 | 172.30.4.3        |
> >         >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        |       NULL | 172.30.4.98       |
> >         >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        |       NULL | 172.30.4.86       |
> >         >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         |          1 | 172.30.4.4        |
> >         >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 | 172.30.4.5        |
> >         >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
> >         >     5 rows in set (0.00 sec)
> >         >
> >         >
> >         >
> >         >     and you can see that it says the storage IP address is the
> >         >     same as the private IP address (the management network).
> >         >
> >         >
> >         >     I also ran the command you provided using the Cluster ID
> >         >     number from the table above -
> >         >
> >         >
> >         >
> >         >     mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
> >         >     Empty set (0.00 sec)
> >         >
> >         >     mysql>
> >         >
> >         >     So assuming I am reading this correctly that seems to be the
> >         >     issue.
> >         >
> >         >
> >         >     I am at a loss as to why though.
> >         >
> >         >
> >         >     I have a separate NIC for storage as described. When I add the
> >         >     zone and get to the storage web page I exclude the IPs already
> >         >     used for the compute node NICs and the NFS server itself. I do
> >         >     this because initially I didn't and the SSVM started using the
> >         >     IP address of the NFS server.
> >         >
> >         >
> >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> >         >     172.30.5.10 -> 172.30.5.14.
> >         >
> >         >
> >         >     And I used the label "cloudbr2" for storage.
> >         >
> >         >
> >         >     I must be doing this wrong somehow.
> >         >
> >         >
> >         >     Any pointers would be much appreciated.
> >         >
> >         >
> >         >
> >         >
> >         >     ________________________________
> >         >     From: Rafael Weingärtner <ra...@gmail.com>
> >         >     Sent: 05 June 2018 16:13
> >         >     To: users
> >         >     Subject: Re: advanced networking with public IPs direct to
> > VMs
> >         >
> >         >     That is interesting. Let's see the source of all truth...
> >         >     This is the code that is generating that odd message.
> >         >
> >         >     >     List<StoragePoolVO> clusterPools =
> >         >     >             _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> >         >     >     boolean hasNfs = false;
> >         >     >     for (StoragePoolVO pool : clusterPools) {
> >         >     >         if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> >         >     >             hasNfs = true;
> >         >     >             break;
> >         >     >         }
> >         >     >     }
> >         >     >     if (!hasNfs) {
> >         >     >         s_logger.warn("Agent investigation was requested on host " + agent
> >         >     >                 + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
> >         >     >         return Status.Disconnected;
> >         >     >     }
> >         >
> >         >     There are two possibilities here. You do not have any NFS
> >         >     storage? Is that the case? Or maybe, for some reason, the call
> >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> >         >     returning any NFS storage pools. Looking at "listPoolsByCluster"
> >         >     we will see that the following SQL is used:
> >         >
> >         >     > Select * from storage_pool where cluster_id = <host'sClusterId>
> >         >     > and removed is not null
> >         >
> >         >     Can you run that SQL to see its return when your hosts are marked
> >         >     as disconnected?
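> >         >
> >         >     As a sketch of what the investigator effectively needs to find
> >         >     (assuming the cloud.storage_pool schema shown in this thread,
> >         >     with cluster id 1 standing in for the host's cluster), the check
> >         >     boils down to whether this returns any rows:
> >         >
> >         >     SELECT id, name, scope, cluster_id
> >         >     FROM cloud.storage_pool
> >         >     WHERE cluster_id = 1
> >         >       AND pool_type = 'NetworkFilesystem'
> >         >       AND removed IS NULL;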
> >         >
> >         >
> >         >
> >         >
> >         >
> >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > jms.123@hotmail.co.uk>
> >         > wrote:
> >         >
> >         >     > I reran the tests with the 3 NIC setup. When I configured the
> >         >     > zone through the UI I used the labels cloudbr0 for management,
> >         >     > cloudbr1 for guest traffic and cloudbr2 for NFS as per my
> >         >     > original response to you.
> >         >     >
> >         >     >
> >         >     > When I pull the power to the node (dcp-cscn2.local) after
> >         >     > about 5 mins the host status goes to "Alert" but never to "Down"
> >         >     >
> >         >     >
> >         >     > I get this in the logs -
> >         >     >
> >         >     >
> >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
> >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
> >         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
> >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local
> >         >     >
> >         >     > I don't understand why it thinks there is no NFS storage as
> >         >     > each compute node has a dedicated storage NIC.
> >         >     >
> >         >     >
> >         >     > I also don't understand why it thinks the host is still up
> >         >     > ie. what test is it doing to determine that ?
> >         >     >
> >         >     >
> >         >     > Am I just trying to get something working that is not supported ?
> >         >     >
> >         >     >
> >         >     > ________________________________
> >         >     > From: Rafael Weingärtner <ra...@gmail.com>
> >         >     > Sent: 04 June 2018 15:31
> >         >     > To: users
> >         >     > Subject: Re: advanced networking with public IPs direct
> to
> > VMs
> >         >     >
> >         >     > What type of failover are you talking about?
> >         >     > What ACS version are you using?
> >         >     > What hypervisor are you using?
> >         >     > How are you configuring your NICs in the hypervisor?
> >         >     > How are you configuring the traffic labels in ACS?
> >         >     >
> >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > jms.123@hotmail.co.uk
> >         > >
> >         >     > wrote:
> >         >     >
> >         >     > > Hi all
> >         >     > >
> >         >     > > I am close to giving up on basic networking as I just cannot
> >         >     > > get failover working with multiple NICs (I am not even sure
> >         >     > > it is supported).
> >         >     > >
> >         >     > > What I would like is to use 3 NICs for management, storage
> >         >     > > and guest traffic. I would like to assign public IPs direct
> >         >     > > to the VMs which is why I originally chose basic.
> >         >     > >
> >         >     > > If I switch to advanced networking do I just configure a
> >         >     > > guest VM with public IPs on one NIC and not bother with the
> >         >     > > public traffic -
> >         >     > >
> >         >     > > would this work ?
> >         >     > >
> >         >     >
> >         >     >
> >         >     >
> >         >     > --
> >         >     > Rafael Weingärtner
> >         >     >
> >         >
> >         >
> >         >
> >         >     --
> >         >     Rafael Weingärtner
> >         >
> >         >
> >         >
> >
> >
> >         --
> >         Rafael Weingärtner
> >
> >
> >
> >
>
>
> --
> Rafael Weingärtner
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
What type of storage are you using? Zone wide? Or cluster "wide" storage?
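
A quick way to tell from the database (a sketch against the cloud schema
already shown in this thread):

SELECT id, name, scope, cluster_id, pool_type
FROM cloud.storage_pool
WHERE removed IS NULL;

A zone-wide pool shows scope = 'ZONE' with cluster_id = NULL, which would
also explain why the per-cluster lookup (cluster_id = 1) came back empty.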

On Thu, Jun 7, 2018 at 4:25 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Rafael
>
>
> Here is the output as requested -
>
>
>
> mysql> select * from cloud.storage_pool where removed is null;
> +----+------+--------------------------------------+-------------------+------+----------------+--------+------------+------------+----------------+--------------+-----------+-----------------+---------------------+---------+-------------+--------+-----------------------+-------+------------+---------+---------------+
> | id | name | uuid                                 | pool_type         | port | data_center_id | pod_id | cluster_id | used_bytes | capacity_bytes | host_address | user_info | path            | created             | removed | update_time | status | storage_provider_name | scope | hypervisor | managed | capacity_iops |
> +----+------+--------------------------------------+-------------------+------+----------------+--------+------------+------------+----------------+--------------+-----------+-----------------+---------------------+---------+-------------+--------+-----------------------+-------+------------+---------+---------------+
> |  1 | ds1  | a234224f-05fb-3f4c-9b0f-c51ebdf9a601 | NetworkFilesystem | 2049 |              1 |   NULL |       NULL | 6059720704 |    79133933568 | 172.30.5.2   | NULL      | /export/primary | 2018-06-05 13:45:01 | NULL    | NULL        | Up     | DefaultPrimary        | ZONE  | KVM        |       0 |          NULL |
> +----+------+--------------------------------------+-------------------+------+----------------+--------+------------+------------+----------------+--------------+-----------+-----------------+---------------------+---------+-------------+--------+-----------------------+-------+------------+---------+---------------+
> 1 row in set (0.00 sec)
>
> mysql>
>
> Do you think this problem is related to my NIC/bridge configuration or the
> way I am configuring the zone ?
>
> Jon
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 07 June 2018 06:45
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> Can you also post the result of:
> select * from cloud.storage_pool where removed is null
>
> On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Da...@shapeblue.com>
> wrote:
>
> > Hi Jon,
> >
> > Still confused where your primary storage pools are – are you sure your
> > hosts are in cluster 1?
> >
> > Quick question just to make sure - assuming management/storage is on the
> > same NIC when I setup basic networking the physical network has the
> > management and guest icons already there and I just edit the KVM labels.
> > If I am running storage over management do I need to drag the storage
> > icon to the physical network and use the same KVM label (cloudbr0) as the
> > management or does CS automatically just use the management NIC ie. I
> > would only need to drag the storage icon across in basic setup if I
> > wanted it on a different NIC/IP subnet ?  (hope that makes sense !)
> >
> > >> I would do both – set up your 2/3 physical networks, name isn’t that
> > important – but then drag the traffic types to the correct one and make
> > sure the labels are correct.
> > Regards,
> > Dag Sonstebo
> > Cloud Architect
> > ShapeBlue
> >
> > On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> >
> >     Dag
> >
> >
> >     Do you mean check the pools with "Infrastructure -> Primary Storage"
> >     and "Infrastructure -> Secondary Storage" within the UI ?
> >
> >
> >     If so, Primary Storage has a state of Up; secondary storage does not
> >     show a state as such, so I am not sure where else to check it ?
> >
> >
> >     Rerun of the command -
> >
> >     mysql> select * from cloud.storage_pool where cluster_id = 1;
> >     Empty set (0.00 sec)
> >
> >     mysql>
> >
> >     I think it is something to do with my zone creation rather than the
> >     NIC/bridge setup, although I can post those if needed.
> >
> >     I may try to set up just the 2 NIC solution you mentioned although, as
> >     I say, I had the same issue with that ie. host goes to "Alert" state
> >     with the same error messages.  The only time I can get it to go to
> >     "Down" state is when it is all on the single NIC.
> >
> >     Quick question just to make sure - assuming management/storage is on
> >     the same NIC when I setup basic networking the physical network has
> >     the management and guest icons already there and I just edit the KVM
> >     labels. If I am running storage over management do I need to drag the
> >     storage icon to the physical network and use the same KVM label
> >     (cloudbr0) as the management or does CS automatically just use the
> >     management NIC ie. I would only need to drag the storage icon across
> >     in basic setup if I wanted it on a different NIC/IP subnet ?  (hope
> >     that makes sense !)
> >
> >     On the plus side I have been at this for so long now and done so many
> > rebuilds I could do it in my sleep now 😊
> >
> >
> >     ________________________________
> >     From: Dag Sonstebo <Da...@shapeblue.com>
> >     Sent: 06 June 2018 12:28
> >     To: users@cloudstack.apache.org
> >     Subject: Re: advanced networking with public IPs direct to VMs
> >
> >     Looks OK to me Jon.
> >
> >     The one thing that throws me is your storage pools – can you rerun
> > your query: select * from cloud.storage_pool where cluster_id = 1;
> >
> >     Do the pools show up as online in the CloudStack GUI?
> >
> >     Regards,
> >     Dag Sonstebo
> >     Cloud Architect
> >     ShapeBlue
> >
> >     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
> >
> >         Don't know whether this helps or not but I logged into the SSVM
> > and ran an ifconfig -
> >
> >
> >         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >                 inet 169.254.3.35  netmask 255.255.0.0  broadcast
> > 169.254.255.255
> >                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
> >                 RX packets 141  bytes 20249 (19.7 KiB)
> >                 RX errors 0  dropped 0  overruns 0  frame 0
> >                 TX packets 108  bytes 16287 (15.9 KiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions
> 0
> >
> >         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >                 inet 172.30.3.34  netmask 255.255.255.192  broadcast
> > 172.30.3.63
> >                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
> >                 RX packets 56722  bytes 4953133 (4.7 MiB)
> >                 RX errors 0  dropped 44573  overruns 0  frame 0
> >                 TX packets 11224  bytes 1234932 (1.1 MiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions
> 0
> >
> >         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >                 inet 172.30.4.86  netmask 255.255.255.128  broadcast
> > 172.30.4.127
> >                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
> >                 RX packets 366191  bytes 435300557 (415.1 MiB)
> >                 RX errors 0  dropped 39456  overruns 0  frame 0
> >                 TX packets 145065  bytes 7978602 (7.6 MiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions
> 0
> >
> >         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
> >                 inet 172.30.5.14  netmask 255.255.255.240  broadcast
> > 172.30.5.15
> >                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
> >                 RX packets 132440  bytes 426362982 (406.6 MiB)
> >                 RX errors 0  dropped 39446  overruns 0  frame 0
> >                 TX packets 67443  bytes 423670834 (404.0 MiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions
> 0
> >
> >         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
> >                 inet 127.0.0.1  netmask 255.0.0.0
> >                 loop  txqueuelen 1  (Local Loopback)
> >                 RX packets 18  bytes 1440 (1.4 KiB)
> >                 RX errors 0  dropped 0  overruns 0  frame 0
> >                 TX packets 18  bytes 1440 (1.4 KiB)
> >                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions
> 0
> >
> >
> >         so it has interfaces in both the management and the storage
> > subnets (as well as guest).
> >
> >
> >
> >         ________________________________
> >         From: Jon Marshall <jm...@hotmail.co.uk>
> >         Sent: 06 June 2018 11:08
> >         To: users@cloudstack.apache.org
> >         Subject: Re: advanced networking with public IPs direct to VMs
> >
> >         Hi Rafael
> >
> >
> >         Thanks for the help, really appreciate it.
> >
> >
> >         So rerunning that command with all servers up -
> >
> >
> >
> >         mysql> select * from cloud.storage_pool where cluster_id = 1 and
> > removed is null;
> >         Empty set (0.00 sec)
> >
> >         mysql>
> >
> >
> >         As for the storage IP no I'm not setting it to be the management
> > IP when I setup the zone but the output of the SQL command suggests that
> is
> > what has happened.
> >
> >         As I said to Dag I am using a different subnet for storage ie.
> >
> >         172.30.3.0/26  - management subnet
> >         172.30.4.0/25 -  guest VM subnet
> >         172.30.5.0/28 - storage
> >
> >         the NFS server IP is 172.30.5.2
> >
> >         each compute node has 3 NICs with an IP from each subnet (i am
> > assuming the management node only needs an IP in the management network
> ?)
> >
> >         When I add the zone in the UI I have one physical network with
> > management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
> >         When I fill in the storage traffic page I use the range
> > 172.16.5.10 - 14 as free IPs as I exclude the ones already allocated to
> the
> > compute nodes and the NFS server.
> >
> >         I think maybe I am doing something wrong in the UI setup but it
> is
> > not obvious to me what it is.
> >
> >         What I might try today unless you want me to keep the setup I
> have
> > for more outputs is to go back to 2 NICs, one for storage/management and
> > one for guest VMs.
> >
> >         I think with the 2 NICs setup the mistake I made last time when
> > adding the zone was to assume storage would just run over management so I
> > did not drag and drop the storage icon and assign it to cloudbr0 as with
> > the management which I think is what I should do ?
> >
> >
> >
> >
> >
> >         ________________________________
> >         From: Rafael Weingärtner <ra...@gmail.com>
> >         Sent: 06 June 2018 10:54
> >         To: users
> >         Subject: Re: advanced networking with public IPs direct to VMs
> >
> >         Jon, do not panic we are here to help you :)
> >         So, I might have mistyped the SQL query. You you use select *
> from
> >         cloud.storage_pool where cluster_id = 1 and removed is not null
> ",
> > you are
> >         listing the storage pools removed. Therefore, the right query
> > would be "
> >         select * from cloud.storage_pool where cluster_id = 1 and removed
> > is null "
> >
> >         There is also something else I do not understand. You are setting
> > the
> >         storage IP in the management subnet? I am not sure if you should
> > be doing
> >         like this. Normally, I set all my storages (primary[when working
> > with NFS]
> >         and secondary) to IPs in the storage subnet.
> >
> >         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> > Dag.Sonstebo@shapeblue.com>
> >         wrote:
> >
> >         > Hi John,
> >         >
> >         > I’m late to this thread and have possibly missed some things –
> > but a
> >         > couple of observations:
> >         >
> >         > “When I add the zone and get to the storage web page I exclude
> > the IPs
> >         > already used for the compute node NICs and the NFS server
> > itself. …..”
> >         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> > 172.30.5.10
> >         > -> 172.30.5.14.”
> >         >
> >         > I think you may have some confusion around the use of the
> > storage network.
> >         > The important part here is to understand this is for *secondary
> > storage*
> >         > use only – it has nothing to do with primary storage. This
> means
> > this
> >         > storage network needs to be accessible to the SSVM, to the
> > hypervisors, and
> >         > secondary storage NFS pools needs to be accessible on this
> > network.
> >         >
> >         > The important part – this also means you *can not use the same
> > IP ranges
> >         > for management and storage networks* - doing so means you will
> > have issues
> >         > where effectively both hypervisors and SSVM can see the same
> > subnet on two
> >         > NICs – and you end up in a routing black hole.
> >         >
> >         > So – you need to either:
> >         >
> >         > 1) Use different IP subnets on management and storage, or
> >         > 2) preferably just simplify your setup – stop using a secondary
> > storage
> >         > network altogether and just allow secondary storage to use the
> > management
> >         > network (which is default). Unless you have a very high I/O
> > environment in
> >         > production you are just adding complexity by running separate
> > management
> >         > and storage.
> >         >
> >         > Regards,
> >         > Dag Sonstebo
> >         > Cloud Architect
> >         > ShapeBlue
> >         >
> >         > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk>
> > wrote:
> >         >
> >         >     I will disconnect the host this morning and test but before
> > I do that
> >         > I ran this command when all hosts are up -
> >         >
> >         >
> >         >
> >         >
> >         >
> >         >      select * from cloud.host;
> >         >     +----+-----------------+------
> ------------------------------
> >         > --+--------+--------------------+--------------------+------
> >         > -----------+---------------------+--------------------+-----
> >         > ------------+---------------------+----------------------+--
> >         > ---------------------+-------------------+------------+-----
> >         > --------------+-----------------+--------------------+------
> >         > ------+----------------+--------+-------------+------+------
> >         > -+-------------------------------------+---------+----------
> >         > -------+--------------------+------------+----------+-------
> >         > ---+--------+------------+--------------+-------------------
> >         > --------------------------------------------+-----------+---
> >         > ----+-------------+------------+----------------+-----------
> >         > ----------+---------------------+---------+--------------+--
> >         > --------------+-------+-------------+--------------+
> >         >     | id | name            | uuid
> >  | status
> >         > | type               | private_ip_address | private_netmask |
> >         > private_mac_address | storage_ip_address | storage_netmask |
> >         > storage_mac_address | storage_ip_address_2 |
> > storage_mac_address_2 |
> >         > storage_netmask_2 | cluster_id | public_ip_address |
> > public_netmask  |
> >         > public_mac_address | proxy_port | data_center_id | pod_id |
> > cpu_sockets |
> >         > cpus | speed | url                                 | fs_type |
> >         > hypervisor_type | hypervisor_version | ram        | resource |
> > version  |
> >         > parent | total_size | capabilities | guid
> >         >                         | available | setup | dom0_memory |
> > last_ping  |
> >         > mgmt_server_id | disconnected        | created             |
> > removed |
> >         > update_count | resource_state | owner | lastUpdated |
> > engine_state |
> >         >     +----+-----------------+------
> ------------------------------
> >         > --+--------+--------------------+--------------------+------
> >         > -----------+---------------------+--------------------+-----
> >         > ------------+---------------------+----------------------+--
> >         > ---------------------+-------------------+------------+-----
> >         > --------------+-----------------+--------------------+------
> >         > ------+----------------+--------+-------------+------+------
> >         > -+-------------------------------------+---------+----------
> >         > -------+--------------------+------------+----------+-------
> >         > ---+--------+------------+--------------+-------------------
> >         > --------------------------------------------+-----------+---
> >         > ----+-------------+------------+----------------+-----------
> >         > ----------+---------------------+---------+--------------+--
> >         > --------------+-------+-------------+--------------+
> >         >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-
> eabd60012284
> > | Up
> >         >  | Routing            | 172.30.3.3         | 255.255.255.192 |
> >         > 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
> >         > 00:22:19:92:4e:34   | NULL                 | NULL
> >   | NULL
> >         >             |          1 | 172.30.4.3        | 255.255.255.128
> |
> >         > 00:22:19:92:4e:35  |       NULL |              1 |      1 |
> >      1 |
> >         >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    |
> > KVM
> >         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> > NULL   |
> >         >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-
> > LibvirtComputingResource
> >         > |         1 |     0 |           0 | 1492390408 |
>  146457912294 |
> >         > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |
> > 4 |
> >         > Enabled        | NULL  | NULL        | Disabled     |
> >         >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-
> 3f5e2c2215b0
> > | Up
> >         >  | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 |
> >         > 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 |
> >         > 1e:00:80:00:00:14   | NULL                 | NULL
> >   | NULL
> >         >             |       NULL | 172.30.4.98       | 255.255.255.128
> |
> >         > 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |
> >   NULL |
> >         > NULL |  NULL | NoIqn                               | NULL    |
> > NULL
> >         >     | NULL               |          0 | NULL     | 4.11.0.0 |
> > NULL   |
> >         >  NULL | NULL         | Proxy.2-ConsoleProxyResource
> >         >           |         1 |     0 |           0 | 1492390409 |
> >  146457912294 |
> >         > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |
> > 7 |
> >         > Enabled        | NULL  | NULL        | Disabled     |
> >         >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-
> ff3845bb556c
> > | Up
> >         >  | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 |
> >         > 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 |
> >         > 1e:00:3b:00:00:05   | NULL                 | NULL
> >   | NULL
> >         >             |       NULL | 172.30.4.86       | 255.255.255.128
> |
> >         > 1e:00:d9:00:00:53  |       NULL |              1 |      1 |
> >   NULL |
> >         > NULL |  NULL | NoIqn                               | NULL    |
> > NULL
> >         >     | NULL               |          0 | NULL     | 4.11.0.0 |
> > NULL   |
> >         >  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
> >         >             |         1 |     0 |           0 | 1492390407 |
> >  146457912294
> >         > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |
> >   7 |
> >         > Enabled        | NULL  | NULL        | Disabled     |
> >         >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-
> 1a96ffac9794
> > | Up
> >         >  | Routing            | 172.30.3.4         | 255.255.255.192 |
> >         > 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
> >         > 00:26:b9:4a:97:7d   | NULL                 | NULL
> >   | NULL
> >         >             |          1 | 172.30.4.4        | 255.255.255.128
> |
> >         > 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |
> >      1 |
> >         >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    |
> > KVM
> >         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> > NULL   |
> >         >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-
> > LibvirtComputingResource
> >         > |         1 |     0 |           0 | 1492450882 |
>  146457912294 |
> >         > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |
> > 8 |
> >         > Enabled        | NULL  | NULL        | Disabled     |
> >         >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-
> ee29d2b5c274
> > | Up
> >         >  | Routing            | 172.30.3.5         | 255.255.255.192 |
> >         > 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
> >         > 00:24:e8:73:6a:b2   | NULL                 | NULL
> >   | NULL
> >         >             |          1 | 172.30.4.5        | 255.255.255.128
> |
> >         > 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |
> >      1 |
> >         >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    |
> > KVM
> >         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> > NULL   |
> >         >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-
> > LibvirtComputingResource
> >         > |         1 |     0 |           0 | 1492390408 |
>  146457912294 |
> >         > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |
> > 6 |
> >         > Enabled        | NULL  | NULL        | Disabled     |
> >         >     +----+-----------------+------
> ------------------------------
> >         > --+--------+--------------------+--------------------+------
> >         > -----------+---------------------+--------------------+-----
> >         > ------------+---------------------+----------------------+--
> >         > ---------------------+-------------------+------------+-----
> >         > --------------+-----------------+--------------------+------
> >         > ------+----------------+--------+-------------+------+------
> >         > -+-------------------------------------+---------+----------
> >         > -------+--------------------+------------+----------+-------
> >         > ---+--------+------------+--------------+-------------------
> >         > --------------------------------------------+-----------+---
> >         > ----+-------------+------------+----------------+-----------
> >         > ----------+---------------------+---------+--------------+--
> >         > --------------+-------+-------------+--------------+
> >         >     5 rows in set (0.00 sec)
> >         >
> >         >
> >         >
> >         >     and you can see that it says the storage IP address is the
> > same as the
> >         > private IP address (the management network).
> >         >
> >         >
> >         >     I also ran the command you provided using the Cluster ID
> > number from
> >         > the table above -
> >         >
> >         >
> >         >
> >         >     mysql> select * from cloud.storage_pool where cluster_id =
> 1
> > and
> >         > removed is not null;
> >         >     Empty set (0.00 sec)
> >         >
> >         >     mysql>
> >         >
> >         >     So assuming I am reading this correctly that seems to be
> the
> > issue.
> >         >
> >         >
> >         >     I am at a loss as to why though.
> >         >
> >         >
> >         >     I have a separate NIC for storage as described. When I add
> > the zone
> >         > and get to the storage web page I exclude the IPs already used
> > for the
> >         > compute node NICs and the NFS server itself. I do this because
> > initially I
> >         > didn't and the SSVM started using the IP address of the NFS
> > server.
> >         >
> >         >
> >         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
> >         > 172.30.5.10 -> 172.30.5.14.
> >         >
> >         >
> >         >     And I used the label "cloudbr2" for storage.
> >         >
> >         >
> >         >     I must be doing this wrong somehow.
> >         >
> >         >
> >         >     Any pointers would be much appreciated.
> >         >
> >         >
> >         >
> >         >
> >         >     ________________________________
> >         >     From: Rafael Weingärtner <ra...@gmail.com>
> >         >     Sent: 05 June 2018 16:13
> >         >     To: users
> >         >     Subject: Re: advanced networking with public IPs direct to
> > VMs
> >         >
> >         >     That is interesting. Let's see the source of all truth...
> >         >     This is the code that is generating that odd message.
> >         >
> >         >     >     List<StoragePoolVO> clusterPools =
> >         >     > _storagePoolDao.listPoolsByCluster(agent.
> getClusterId());
> >         >     >         boolean hasNfs = false;
> >         >     >         for (StoragePoolVO pool : clusterPools) {
> >         >     >             if (pool.getPoolType() == StoragePoolType.
> > NetworkFilesystem)
> >         > {
> >         >     >                 hasNfs = true;
> >         >     >                 break;
> >         >     >             }
> >         >     >         }
> >         >     >         if (!hasNfs) {
> >         >     >             s_logger.warn(
> >         >     >                     "Agent investigation was requested on
> > host " +
> >         > agent +
> >         >     > ", but host does not support investigation because it has
> > no NFS
> >         > storage.
> >         >     > Skipping investigation.");
> >         >     >             return Status.Disconnected;
> >         >     >         }
> >         >     >
> >         >
> >         >     There are two possibilities here. You do not have any NFS
> > storage? Is
> >         > that
> >         >     the case? Or maybe, for some reason, the call
> >         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())"
> > is not
> >         > returning
> >         >     any NFS storage pools. Looking at the "listPoolsByCluster "
> > we will see
> >         >     that the following SQL is used:
> >         >
> >         >     Select * from storage_pool where cluster_id =
> > <host'sClusterId> and
> >         > removed
> >         >     > is not null
> >         >     >
> >         >
> >         >     Can you run that SQL to see the its return when your hosts
> > are marked
> >         > as
> >         >     disconnected?
> >         >
> >         >
> >         > Dag.Sonstebo@shapeblue.com
> >         > www.shapeblue.com<http://www.shapeblue.com>
> Shapeblue - The CloudStack Company<http://www.shapeblue.com/>
> www.shapeblue.com
> ShapeBlue are the largest independent integrator of CloudStack
> technologies globally and are specialists in the design and implementation
> of IaaS cloud infrastructures for both private and public cloud
> implementations.
>
>
>
> >         Shapeblue - The CloudStack Company<http://www.shapeblue.com/>
> >         www.shapeblue.com<http://www.shapeblue.com>
> >         ShapeBlue are the largest independent integrator of CloudStack
> > technologies globally and are specialists in the design and
> implementation
> > of IaaS cloud infrastructures for both private and public cloud
> > implementations.
> >
> >
> >
> >         > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> >         > @shapeblue
> >         >
> >         >
> >         >
> >         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> > jms.123@hotmail.co.uk>
> >         > wrote:
> >         >
> >         >     > I reran the tests with the 3 NIC setup. When I configured
> > the zone
> >         > through
> >         >     > the UI I used the labels cloudbr0 for management,
> cloudbr1
> > for guest
> >         >     > traffic and cloudbr2 for NFS as per my original response
> > to you.
> >         >     >
> >         >     >
> >         >     > When I pull the power to the node (dcp-cscn2.local) after
> > about 5
> >         > mins
> >         >     > the  host status goes to "Alert" but never to "Down"
> >         >     >
> >         >     >
> >         >     > I get this in the logs -
> >         >     >
> >         >     >
> >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent
> > investigation
> >         > was
> >         >     > requested on host Host[-4-Routing], but host does not
> > support
> >         > investigation
> >         >     > because it has no NFS storage. Skipping investigation.
> >         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.
> > HighAvailabilityManagerImpl]
> >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93)
> > KVMInvestigator was
> >         > able to
> >         >     > determine host 4 is in Disconnected
> >         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
> >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent
> > from host
> >         > 4 state
> >         >     > determined is Disconnected
> >         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
> >         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is
> >         > disconnected but
> >         >     > the host is still up: 4-dcp-cscn2.local
> >         >     >
> >         >     > I don't understand why it thinks there is no NFS storage
> > as each
> >         > compute
> >         >     > node has a dedicated storage NIC.
> >         >     >
> >         >     >
> >         >     > I also don't understand why it thinks the host is still
> up
> > ie. what
> >         > test
> >         >     > is it doing to determine that ?
> >         >     >
> >         >     >
> >         >     > Am I just trying to get something working that is not
> > supported ?
> >         >     >
> >         >     >
> >         >     > ________________________________
> >         >     > From: Rafael Weingärtner <ra...@gmail.com>
> >         >     > Sent: 04 June 2018 15:31
> >         >     > To: users
> >         >     > Subject: Re: advanced networking with public IPs direct
> to
> > VMs
> >         >     >
> >         >     > What type of failover are you talking about?
> >         >     > What ACS version are you using?
> >         >     > What hypervisor are you using?
> >         >     > How are you configuring your NICs in the hypervisor?
> >         >     > How are you configuring the traffic labels in ACS?
> >         >     >
> >         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> > jms.123@hotmail.co.uk
> >         > >
> >         >     > wrote:
> >         >     >
> >         >     > > Hi all
> >         >     > >
> >         >     > >
> >         >     > > I am close to giving up on basic networking as I just
> > cannot get
> >         > failover
> >         >     > > working with multiple NICs (I am not even sure it is
> > supported).
> >         >     > >
> >         >     > >
> >         >     > > What I would like is to use 3 NICs for management,
> > storage and
> >         > guest
> >         >     > > traffic. I would like to assign public IPs direct to
> the
> > VMs which
> >         > is
> >         >     > why I
> >         >     > > originally chose basic.
> >         >     > >
> >         >     > >
> >         >     > > If I switch to advanced networking do I just configure
> a
> > guest VM
> >         > with
> >         >     > > public IPs on one NIC and not both with the public
> > traffic -
> >         >     > >
> >         >     > >
> >         >     > > would this work ?
> >         >     > >
> >         >     >
> >         >     >
> >         >     >
> >         >     > --
> >         >     > Rafael Weingärtner
> >         >     >
> >         >
> >         >
> >         >
> >         >     --
> >         >     Rafael Weingärtner
> >         >
> >         >
> >         >
> >
> >
> >         --
> >         Rafael Weingärtner
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Rafael


Here is the output as requested -



mysql> select * from cloud.storage_pool where removed is null;
+----+------+--------------------------------------+-------------------+------+----------------+--------+------------+------------+----------------+--------------+-----------+-----------------+---------------------+---------+-------------+--------+-----------------------+-------+------------+---------+---------------+
| id | name | uuid                                 | pool_type         | port | data_center_id | pod_id | cluster_id | used_bytes | capacity_bytes | host_address | user_info | path            | created             | removed | update_time | status | storage_provider_name | scope | hypervisor | managed | capacity_iops |
+----+------+--------------------------------------+-------------------+------+----------------+--------+------------+------------+----------------+--------------+-----------+-----------------+---------------------+---------+-------------+--------+-----------------------+-------+------------+---------+---------------+
|  1 | ds1  | a234224f-05fb-3f4c-9b0f-c51ebdf9a601 | NetworkFilesystem | 2049 |              1 |   NULL |       NULL | 6059720704 |    79133933568 | 172.30.5.2   | NULL      | /export/primary | 2018-06-05 13:45:01 | NULL    | NULL        | Up     | DefaultPrimary        | ZONE  | KVM        |       0 |          NULL |
+----+------+--------------------------------------+-------------------+------+----------------+--------+------------+------------+----------------+--------------+-----------+-----------------+---------------------+---------+-------------+--------+-----------------------+-------+------------+---------+---------------+
1 row in set (0.00 sec)

mysql>

Do you think this problem is related to my NIC/bridge configuration or the way I am configuring the zone ?

Jon
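
The row above is the telling detail: the pool was created zone-wide (scope
ZONE, cluster_id NULL), while the KVMInvestigator code quoted later in this
thread looks pools up by cluster_id. A cluster-scoped lookup therefore comes
back empty and the investigator gives up with "no NFS storage" even though the
pool itself is Up. A minimal way to confirm this from the database (a sketch,
assuming the stock 'cloud' schema and the mysql client on the management
server):

    # Zone-wide pools carry cluster_id = NULL, which is why the
    # cluster-scoped queries earlier in this thread returned empty sets.
    mysql -u cloud -p -e "SELECT id, name, scope, cluster_id, pool_type, status
                          FROM cloud.storage_pool
                          WHERE removed IS NULL;"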
________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 07 June 2018 06:45
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Can you also post the result of:
select * from cloud.storage_pool where removed is null

On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Da...@shapeblue.com>
wrote:

> Hi Jon,
>
> Still confused where your primary storage pools are – are you sure your
> hosts are in cluster 1?
>
> Quick question just to make sure - assuming management/storage is on the
> same NIC when I setup basic networking the physical network has the
> management and guest icons already there and I just edit the KVM labels. If
> I am running storage over management do I need to drag the storage icon to
> the physical network and use the same KVM label (cloudbr0) as the
> management or does CS automatically just use the management NIC ie. I would
> only need to drag the storage icon across in basic setup if I wanted it on
> a different NIC/IP subnet ?  (hope that makes sense !)
>
> >> I would do both – set up your 2/3 physical networks, name isn’t that
> important – but then drag the traffic types to the correct one and make
> sure the labels are correct.
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
>
>     Dag
>
>
>     Do you mean  check the pools with "Infrastructure -> Primary Storage"
> and "Infrastructure -> Secondary Storage" within the UI ?
>
>
>     If so Primary Storage has a state of UP, secondary storage does not
> show a state as such so not sure where else to check it ?
>
>
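
The UI indeed exposes no up/down state for secondary storage; the API view is
the closest equivalent. A sketch, assuming cloudmonkey is installed and
pointed at this management server:

    # Primary storage pools, including their scope and state
    cloudmonkey list storagepools
    # Secondary storage (image stores); no state column, but confirms URL and provider
    cloudmonkey list imagestores
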
>     Rerun of the command -
>
>     mysql> select * from cloud.storage_pool where cluster_id = 1;
>     Empty set (0.00 sec)
>
>     mysql>
>
>     I think it is something to do with my zone creation rather than the
> NIC, bridge setup although I can post those if needed.
>
>     I may try to set up just the 2 NIC solution you mentioned although as I
> say I had the same issue with that ie. host goes to "Alert" state and same
> error messages.  The only time I can get it to go to "Down" state is when
> it is all on the single NIC.
>
>     Quick question just to make sure - assuming management/storage is on
> the same NIC when I setup basic networking the physical network has the
> management and guest icons already there and I just edit the KVM labels. If
> I am running storage over management do I need to drag the storage icon to
> the physical network and use the same KVM label (cloudbr0) as the
> management or does CS automatically just use the management NIC ie. I would
> only need to drag the storage icon across in basic setup if I wanted it on
> a different NIC/IP subnet ?  (hope that makes sense !)
>
>     On the plus side I have been at this for so long now and done so many
> rebuilds I could do it in my sleep now 😊
>
>
>     ________________________________
>     From: Dag Sonstebo <Da...@shapeblue.com>
>     Sent: 06 June 2018 12:28
>     To: users@cloudstack.apache.org
>     Subject: Re: advanced networking with public IPs direct to VMs
>
>     Looks OK to me Jon.
>
>     The one thing that throws me is your storage pools – can you rerun
> your query: select * from cloud.storage_pool where cluster_id = 1;
>
>     Do the pools show up as online in the CloudStack GUI?
>
>     Regards,
>     Dag Sonstebo
>     Cloud Architect
>     ShapeBlue
>
>     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
>
>         Don't know whether this helps or not but I logged into the SSVM
> and ran an ifconfig -
>
>
>         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>                 inet 169.254.3.35  netmask 255.255.0.0  broadcast
> 169.254.255.255
>                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
>                 RX packets 141  bytes 20249 (19.7 KiB)
>                 RX errors 0  dropped 0  overruns 0  frame 0
>                 TX packets 108  bytes 16287 (15.9 KiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>                 inet 172.30.3.34  netmask 255.255.255.192  broadcast
> 172.30.3.63
>                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
>                 RX packets 56722  bytes 4953133 (4.7 MiB)
>                 RX errors 0  dropped 44573  overruns 0  frame 0
>                 TX packets 11224  bytes 1234932 (1.1 MiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>                 inet 172.30.4.86  netmask 255.255.255.128  broadcast
> 172.30.4.127
>                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
>                 RX packets 366191  bytes 435300557 (415.1 MiB)
>                 RX errors 0  dropped 39456  overruns 0  frame 0
>                 TX packets 145065  bytes 7978602 (7.6 MiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>                 inet 172.30.5.14  netmask 255.255.255.240  broadcast
> 172.30.5.15
>                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
>                 RX packets 132440  bytes 426362982 (406.6 MiB)
>                 RX errors 0  dropped 39446  overruns 0  frame 0
>                 TX packets 67443  bytes 423670834 (404.0 MiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
>                 inet 127.0.0.1  netmask 255.0.0.0
>                 loop  txqueuelen 1  (Local Loopback)
>                 RX packets 18  bytes 1440 (1.4 KiB)
>                 RX errors 0  dropped 0  overruns 0  frame 0
>                 TX packets 18  bytes 1440 (1.4 KiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>
>         so it has interfaces in both the management and the storage
> subnets (as well as guest).
>
>
>
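
Since the SSVM has a leg in the storage subnet (eth3 above), reachability of
the NFS server can be sanity-checked from inside it. A sketch using the
addresses from this thread, assuming the NFS client utilities are present in
the system VM:

    # From the SSVM console: can the storage NIC reach the NFS server?
    ping -c 3 -I eth3 172.30.5.2
    # List the exports the NFS server is offering
    showmount -e 172.30.5.2
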
>         ________________________________
>         From: Jon Marshall <jm...@hotmail.co.uk>
>         Sent: 06 June 2018 11:08
>         To: users@cloudstack.apache.org
>         Subject: Re: advanced networking with public IPs direct to VMs
>
>         Hi Rafael
>
>
>         Thanks for the help, really appreciate it.
>
>
>         So rerunning that command with all servers up -
>
>
>
>         mysql> select * from cloud.storage_pool where cluster_id = 1 and
> removed is null;
>         Empty set (0.00 sec)
>
>         mysql>
>
>
>         As for the storage IP no I'm not setting it to be the management
> IP when I setup the zone but the output of the SQL command suggests that is
> what has happened.
>
>         As I said to Dag I am using a different subnet for storage ie.
>
>         172.30.3.0/26  - management subnet
>         172.30.4.0/25 -  guest VM subnet
>         172.30.5.0/28 - storage
>
>         the NFS server IP is 172.30.5.2
>
>         each compute node has 3 NICs with an IP from each subnet (i am
> assuming the management node only needs an IP in the management network ?)
>
>         When I add the zone in the UI I have one physical network with
> management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
>         When I fill in the storage traffic page I use the range
> 172.30.5.10 - 14 as free IPs as I exclude the ones already allocated to the
> compute nodes and the NFS server.
>
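
As an aside, those traffic labels only resolve if every KVM host actually has
bridges with exactly those names. A non-persistent sketch of the storage
bridge, in which the interface name eth2 and the address 172.30.5.3 are
placeholders for the host's real values (the equivalent change belongs in the
distro's persistent network configuration):

    ip link add cloudbr2 type bridge
    ip link set eth2 master cloudbr2          # eth2 = this host's storage NIC
    ip addr add 172.30.5.3/28 dev cloudbr2    # this host's storage IP
    ip link set eth2 up
    ip link set cloudbr2 up
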
>         I think maybe I am doing something wrong in the UI setup but it is
> not obvious to me what it is.
>
>         What I might try today unless you want me to keep the setup I have
> for more outputs is to go back to 2 NICs, one for storage/management and
> one for guest VMs.
>
>         I think with the 2 NICs setup the mistake I made last time when
> adding the zone was to assume storage would just run over management so I
> did not drag and drop the storage icon and assign it to cloudbr0 as with
> the management which I think is what I should do ?
>
>
>
>
>
>         ________________________________
>         From: Rafael Weingärtner <ra...@gmail.com>
>         Sent: 06 June 2018 10:54
>         To: users
>         Subject: Re: advanced networking with public IPs direct to VMs
>
>         Jon, do not panic we are here to help you :)
>         So, I might have mistyped the SQL query. If you use "select * from
>         cloud.storage_pool where cluster_id = 1 and removed is not null",
>         you are listing the storage pools that have been removed. Therefore,
>         the right query would be "select * from cloud.storage_pool where
>         cluster_id = 1 and removed is null".
>
>         There is also something else I do not understand. You are setting the
>         storage IP in the management subnet? I am not sure if you should be
>         doing this. Normally, I set all my storages (primary [when working
>         with NFS] and secondary) to IPs in the storage subnet.
>
>         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> Dag.Sonstebo@shapeblue.com>
>         wrote:
>
>         > Hi John,
>         >
>         > I’m late to this thread and have possibly missed some things –
> but a
>         > couple of observations:
>         >
>         > “When I add the zone and get to the storage web page I exclude
> the IPs
>         > already used for the compute node NICs and the NFS server
> itself. …..”
>         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> 172.30.5.10
>         > -> 172.30.5.14.”
>         >
>         > I think you may have some confusion around the use of the
> storage network.
>         > The important part here is to understand this is for *secondary
> storage*
>         > use only – it has nothing to do with primary storage. This means
> this
>         > storage network needs to be accessible to the SSVM, to the
> hypervisors, and
>         > secondary storage NFS pools needs to be accessible on this
> network.
>         >
>         > The important part – this also means you *can not use the same
> IP ranges
>         > for management and storage networks* - doing so means you will
> have issues
>         > where effectively both hypervisors and SSVM can see the same
> subnet on two
>         > NICs – and you end up in a routing black hole.
>         >
>         > So – you need to either:
>         >
>         > 1) Use different IP subnets on management and storage, or
>         > 2) preferably just simplify your setup – stop using a secondary
> storage
>         > network altogether and just allow secondary storage to use the
> management
>         > network (which is default). Unless you have a very high I/O
> environment in
>         > production you are just adding complexity by running separate
> management
>         > and storage.
>         >
>         > Regards,
>         > Dag Sonstebo
>         > Cloud Architect
>         > ShapeBlue
>         >
>         > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk>
> wrote:
>         >
>         >     I will disconnect the host this morning and test but before
> I do that
>         > I ran this command when all hosts are up -
>         >
>         >
>         >
>         >
>         >
>         >      select * from cloud.host;
>         >     +----+-----------------+------------------------------------
>         > --+--------+--------------------+--------------------+------
>         > -----------+---------------------+--------------------+-----
>         > ------------+---------------------+----------------------+--
>         > ---------------------+-------------------+------------+-----
>         > --------------+-----------------+--------------------+------
>         > ------+----------------+--------+-------------+------+------
>         > -+-------------------------------------+---------+----------
>         > -------+--------------------+------------+----------+-------
>         > ---+--------+------------+--------------+-------------------
>         > --------------------------------------------+-----------+---
>         > ----+-------------+------------+----------------+-----------
>         > ----------+---------------------+---------+--------------+--
>         > --------------+-------+-------------+--------------+
>         >     | id | name            | uuid
>  | status
>         > | type               | private_ip_address | private_netmask |
>         > private_mac_address | storage_ip_address | storage_netmask |
>         > storage_mac_address | storage_ip_address_2 |
> storage_mac_address_2 |
>         > storage_netmask_2 | cluster_id | public_ip_address |
> public_netmask  |
>         > public_mac_address | proxy_port | data_center_id | pod_id |
> cpu_sockets |
>         > cpus | speed | url                                 | fs_type |
>         > hypervisor_type | hypervisor_version | ram        | resource |
> version  |
>         > parent | total_size | capabilities | guid
>         >                         | available | setup | dom0_memory |
> last_ping  |
>         > mgmt_server_id | disconnected        | created             |
> removed |
>         > update_count | resource_state | owner | lastUpdated |
> engine_state |
>         >     +----+-----------------+------------------------------------
>         > --+--------+--------------------+--------------------+------
>         > -----------+---------------------+--------------------+-----
>         > ------------+---------------------+----------------------+--
>         > ---------------------+-------------------+------------+-----
>         > --------------+-----------------+--------------------+------
>         > ------+----------------+--------+-------------+------+------
>         > -+-------------------------------------+---------+----------
>         > -------+--------------------+------------+----------+-------
>         > ---+--------+------------+--------------+-------------------
>         > --------------------------------------------+-----------+---
>         > ----+-------------+------------+----------------+-----------
>         > ----------+---------------------+---------+--------------+--
>         > --------------+-------+-------------+--------------+
>         >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284
> | Up
>         >  | Routing            | 172.30.3.3         | 255.255.255.192 |
>         > 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
>         > 00:22:19:92:4e:34   | NULL                 | NULL
>   | NULL
>         >             |          1 | 172.30.4.3        | 255.255.255.128 |
>         > 00:22:19:92:4e:35  |       NULL |              1 |      1 |
>      1 |
>         >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    |
> KVM
>         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> NULL   |
>         >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-
> LibvirtComputingResource
>         > |         1 |     0 |           0 | 1492390408 |   146457912294 |
>         > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |
> 4 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0
> | Up
>         >  | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 |
>         > 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 |
>         > 1e:00:80:00:00:14   | NULL                 | NULL
>   | NULL
>         >             |       NULL | 172.30.4.98       | 255.255.255.128 |
>         > 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |
>   NULL |
>         > NULL |  NULL | NoIqn                               | NULL    |
> NULL
>         >     | NULL               |          0 | NULL     | 4.11.0.0 |
> NULL   |
>         >  NULL | NULL         | Proxy.2-ConsoleProxyResource
>         >           |         1 |     0 |           0 | 1492390409 |
>  146457912294 |
>         > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |
> 7 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c
> | Up
>         >  | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 |
>         > 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 |
>         > 1e:00:3b:00:00:05   | NULL                 | NULL
>   | NULL
>         >             |       NULL | 172.30.4.86       | 255.255.255.128 |
>         > 1e:00:d9:00:00:53  |       NULL |              1 |      1 |
>   NULL |
>         > NULL |  NULL | NoIqn                               | NULL    |
> NULL
>         >     | NULL               |          0 | NULL     | 4.11.0.0 |
> NULL   |
>         >  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
>         >             |         1 |     0 |           0 | 1492390407 |
>  146457912294
>         > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |
>   7 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794
> | Up
>         >  | Routing            | 172.30.3.4         | 255.255.255.192 |
>         > 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
>         > 00:26:b9:4a:97:7d   | NULL                 | NULL
>   | NULL
>         >             |          1 | 172.30.4.4        | 255.255.255.128 |
>         > 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |
>      1 |
>         >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    |
> KVM
>         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> NULL   |
>         >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-
> LibvirtComputingResource
>         > |         1 |     0 |           0 | 1492450882 |   146457912294 |
>         > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |
> 8 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274
> | Up
>         >  | Routing            | 172.30.3.5         | 255.255.255.192 |
>         > 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
>         > 00:24:e8:73:6a:b2   | NULL                 | NULL
>   | NULL
>         >             |          1 | 172.30.4.5        | 255.255.255.128 |
>         > 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |
>      1 |
>         >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    |
> KVM
>         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> NULL   |
>         >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-
> LibvirtComputingResource
>         > |         1 |     0 |           0 | 1492390408 |   146457912294 |
>         > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |
> 6 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     +----+-----------------+------------------------------------
>         > --+--------+--------------------+--------------------+------
>         > -----------+---------------------+--------------------+-----
>         > ------------+---------------------+----------------------+--
>         > ---------------------+-------------------+------------+-----
>         > --------------+-----------------+--------------------+------
>         > ------+----------------+--------+-------------+------+------
>         > -+-------------------------------------+---------+----------
>         > -------+--------------------+------------+----------+-------
>         > ---+--------+------------+--------------+-------------------
>         > --------------------------------------------+-----------+---
>         > ----+-------------+------------+----------------+-----------
>         > ----------+---------------------+---------+--------------+--
>         > --------------+-------+-------------+--------------+
>         >     5 rows in set (0.00 sec)
>         >
>         >
>         >
>         >     and you can see that it says the storage IP address is the
> same as the
>         > private IP address (the management network).
>         >
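
The wide table is easier to digest if only the relevant columns are pulled out
(a sketch against the same cloud schema). When no dedicated storage network is
bound to a host, CloudStack appears to fall back to recording the management
address as the host's storage address, which is exactly what the output above
shows:

    mysql -u cloud -p -e "SELECT id, name, private_ip_address, storage_ip_address, cluster_id
                          FROM cloud.host
                          WHERE removed IS NULL AND type = 'Routing';"
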
>         >
>         >     I also ran the command you provided using the Cluster ID
> number from
>         > the table above -
>         >
>         >
>         >
>         >     mysql> select * from cloud.storage_pool where cluster_id = 1
> and
>         > removed is not null;
>         >     Empty set (0.00 sec)
>         >
>         >     mysql>
>         >
>         >     So assuming I am reading this correctly that seems to be the
> issue.
>         >
>         >
>         >     I am at a loss as to why though.
>         >
>         >
>         >     I have a separate NIC for storage as described. When I add
> the zone
>         > and get to the storage web page I exclude the IPs already used
> for the
>         > compute node NICs and the NFS server itself. I do this because
> initially I
>         > didn't and the SSVM started using the IP address of the NFS
> server.
>         >
>         >
>         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
>         > 172.30.5.10 -> 172.30.5.14.
>         >
>         >
>         >     And I used the label "cloudbr2" for storage.
>         >
>         >
>         >     I must be doing this wrong somehow.
>         >
>         >
>         >     Any pointers would be much appreciated.
>         >
>         >
>         >
>         >
>         >     ________________________________
>         >     From: Rafael Weingärtner <ra...@gmail.com>
>         >     Sent: 05 June 2018 16:13
>         >     To: users
>         >     Subject: Re: advanced networking with public IPs direct to
> VMs
>         >
>         >     That is interesting. Let's see the source of all truth...
>         >     This is the code that is generating that odd message.
>         >
>         >     > List<StoragePoolVO> clusterPools =
>         >     >         _storagePoolDao.listPoolsByCluster(agent.getClusterId());
>         >     > boolean hasNfs = false;
>         >     > for (StoragePoolVO pool : clusterPools) {
>         >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
>         >     >         hasNfs = true;
>         >     >         break;
>         >     >     }
>         >     > }
>         >     > if (!hasNfs) {
>         >     >     s_logger.warn("Agent investigation was requested on host " + agent
>         >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
>         >     >     return Status.Disconnected;
>         >     > }
>         >     >
>         >
>         >     There are two possibilities here. You do not have any NFS
> storage? Is
>         > that
>         >     the case? Or maybe, for some reason, the call
>         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())"
> is not
>         > returning
>         >     any NFS storage pools. Looking at the "listPoolsByCluster "
> we will see
>         >     that the following SQL is used:
>         >
>         >     Select * from storage_pool where cluster_id =
> <host'sClusterId> and
>         > removed
>         >     > is not null
>         >     >
>         >
>         >     Can you run that SQL to see its return when your hosts
> are marked
>         > as
>         >     disconnected?
>         >
>         >
>         >
>         >
>         >
>         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> jms.123@hotmail.co.uk>
>         > wrote:
>         >
>         >     > I reran the tests with the 3 NIC setup. When I configured
> the zone
>         > through
>         >     > the UI I used the labels cloudbr0 for management, cloudbr1
> for guest
>         >     > traffic and cloudbr2 for NFS as per my original response
> to you.
>         >     >
>         >     >
>         >     > When I pull the power to the node (dcp-cscn2.local) after
> about 5
>         > mins
>         >     > the host status goes to "Alert" but never to "Down"
>         >     >
>         >     >
>         >     > I get this in the logs -
>         >     >
>         >     >
>         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
>         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
>         >     > requested on host Host[-4-Routing], but host does not support investigation
>         >     > because it has no NFS storage. Skipping investigation.
>         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
>         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
>         >     > determine host 4 is in Disconnected
>         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
>         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
>         >     > determined is Disconnected
>         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
>         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
>         >     > the host is still up: 4-dcp-cscn2.local
>         >     >
>         >     > I don't understand why it thinks there is no NFS storage
> as each
>         > compute
>         >     > node has a dedicated storage NIC.
>         >     >
>         >     >
>         >     > I also don't understand why it thinks the host is still up
> ie. what
>         > test
>         >     > is it doing to determine that ?
>         >     >
>         >     >
>         >     > Am I just trying to get something working that is not
> supported ?
>         >     >
>         >     >
>         >     > ________________________________
>         >     > From: Rafael Weingärtner <ra...@gmail.com>
>         >     > Sent: 04 June 2018 15:31
>         >     > To: users
>         >     > Subject: Re: advanced networking with public IPs direct to
> VMs
>         >     >
>         >     > What type of failover are you talking about?
>         >     > What ACS version are you using?
>         >     > What hypervisor are you using?
>         >     > How are you configuring your NICs in the hypervisor?
>         >     > How are you configuring the traffic labels in ACS?
>         >     >
>         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> jms.123@hotmail.co.uk
>         > >
>         >     > wrote:
>         >     >
>         >     > > Hi all
>         >     > >
>         >     > >
>         >     > > I am close to giving up on basic networking as I just
> cannot get
>         > failover
>         >     > > working with multiple NICs (I am not even sure it is
> supported).
>         >     > >
>         >     > >
>         >     > > What I would like is to use 3 NICs for management,
> storage and
>         > guest
>         >     > > traffic. I would like to assign public IPs direct to the
> VMs which
>         > is
>         >     > why I
>         >     > > originally chose basic.
>         >     > >
>         >     > >
>         >     > > If I switch to advanced networking do I just configure a
> guest VM
>         > with
>         >     > > public IPs on one NIC and not both with the public
> traffic -
>         >     > >
>         >     > >
>         >     > > would this work ?
>         >     > >
>         >     >
>         >     >
>         >     >
>         >     > --
>         >     > Rafael Weingärtner
>         >     >
>         >
>         >
>         >
>         >     --
>         >     Rafael Weingärtner
>         >
>         >
>         >
>
>
>         --
>         Rafael Weingärtner
>
>
>
>
>
>
>
>
>
>
>
>
>


--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
Can you also post the result of:
select * from cloud.storage_pool where removed is null

On Wed, Jun 6, 2018 at 3:06 PM, Dag Sonstebo <Da...@shapeblue.com>
wrote:

> Hi Jon,
>
> Still confused where your primary storage pools are – are you sure your
> hosts are in cluster 1?
>
> Quick question just to make sure - assuming management/storage is on the
> same NIC when I setup basic networking the physical network has the
> management and guest icons already there and I just edit the KVM labels. If
> I am running storage over management do I need to drag the storage icon to
> the physical network and use the same KVM label (cloudbr0) as the
> management or does CS automatically just use the management NIC ie. I would
> only need to drag the storage icon across in basic setup if I wanted it on
> a different NIC/IP subnet ?  (hope that makes sense !)
>
> >> I would do both – set up your 2/3 physical networks, name isn’t that
> important – but then drag the traffic types to the correct one and make
> sure the labels are correct.
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
>
>     Dag
>
>
>     Do you mean  check the pools with "Infrastructure -> Primary Storage"
> and "Infrastructure -> Secondary Storage" within the UI ?
>
>
>     If so Primary Storage has a state of UP, secondary storage does not
> show a state as such so not sure where else to check it ?
>
>
>     Rerun of the command -
>
>     mysql> select * from cloud.storage_pool where cluster_id = 1;
>     Empty set (0.00 sec)
>
>     mysql>
>
>     I think it is something to do with my zone creation rather than the
> NIC, bridge setup although I can post those if needed.
>
>     I may try to set up just the 2 NIC solution you mentioned although as I
> say I had the same issue with that ie. host goes to "Alert" state and same
> error messages.  The only time I can get it to go to "Down" state is when
> it is all on the single NIC.
>
>     Quick question just to make sure - assuming management/storage is on
> the same NIC when I setup basic networking the physical network has the
> management and guest icons already there and I just edit the KVM labels. If
> I am running storage over management do I need to drag the storage icon to
> the physical network and use the same KVM label (cloudbr0) as the
> management or does CS automatically just use the management NIC ie. I would
> only need to drag the storage icon across in basic setup if I wanted it on
> a different NIC/IP subnet ?  (hope that makes sense !)
>
>     On the plus side I have been at this for so long now and done so many
> rebuilds I could do it in my sleep now 😊
>
>
>     ________________________________
>     From: Dag Sonstebo <Da...@shapeblue.com>
>     Sent: 06 June 2018 12:28
>     To: users@cloudstack.apache.org
>     Subject: Re: advanced networking with public IPs direct to VMs
>
>     Looks OK to me Jon.
>
>     The one thing that throws me is your storage pools – can you rerun
> your query: select * from cloud.storage_pool where cluster_id = 1;
>
>     Do the pools show up as online in the CloudStack GUI?
>
>     Regards,
>     Dag Sonstebo
>     Cloud Architect
>     ShapeBlue
>
>     On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
>
>         Don't know whether this helps or not but I logged into the SSVM
> and ran an ifconfig -
>
>
>         eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>                 inet 169.254.3.35  netmask 255.255.0.0  broadcast
> 169.254.255.255
>                 ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
>                 RX packets 141  bytes 20249 (19.7 KiB)
>                 RX errors 0  dropped 0  overruns 0  frame 0
>                 TX packets 108  bytes 16287 (15.9 KiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>         eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>                 inet 172.30.3.34  netmask 255.255.255.192  broadcast
> 172.30.3.63
>                 ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
>                 RX packets 56722  bytes 4953133 (4.7 MiB)
>                 RX errors 0  dropped 44573  overruns 0  frame 0
>                 TX packets 11224  bytes 1234932 (1.1 MiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>         eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>                 inet 172.30.4.86  netmask 255.255.255.128  broadcast
> 172.30.4.127
>                 ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
>                 RX packets 366191  bytes 435300557 (415.1 MiB)
>                 RX errors 0  dropped 39456  overruns 0  frame 0
>                 TX packets 145065  bytes 7978602 (7.6 MiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>         eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>                 inet 172.30.5.14  netmask 255.255.255.240  broadcast
> 172.30.5.15
>                 ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
>                 RX packets 132440  bytes 426362982 (406.6 MiB)
>                 RX errors 0  dropped 39446  overruns 0  frame 0
>                 TX packets 67443  bytes 423670834 (404.0 MiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>         lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
>                 inet 127.0.0.1  netmask 255.0.0.0
>                 loop  txqueuelen 1  (Local Loopback)
>                 RX packets 18  bytes 1440 (1.4 KiB)
>                 RX errors 0  dropped 0  overruns 0  frame 0
>                 TX packets 18  bytes 1440 (1.4 KiB)
>                 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
>
>         so it has interfaces in both the management and the storage
> subnets (as well as guest).
>
>
>
>         ________________________________
>         From: Jon Marshall <jm...@hotmail.co.uk>
>         Sent: 06 June 2018 11:08
>         To: users@cloudstack.apache.org
>         Subject: Re: advanced networking with public IPs direct to VMs
>
>         Hi Rafael
>
>
>         Thanks for the help, really appreciate it.
>
>
>         So rerunning that command with all servers up -
>
>
>
>         mysql> select * from cloud.storage_pool where cluster_id = 1 and
> removed is null;
>         Empty set (0.00 sec)
>
>         mysql>
>
>
>         As for the storage IP no I'm not setting it to be the management
> IP when I setup the zone but the output of the SQL command suggests that is
> what has happened.
>
>         As I said to Dag I am using a different subnet for storage ie.
>
>         172.30.3.0/26  - management subnet
>         172.30.4.0/25 -  guest VM subnet
>         172.30.5.0/28 - storage
>
>         the NFS server IP is 172.30.5.2
>
>         each compute node has 3 NICs with an IP from each subnet (i am
> assuming the management node only needs an IP in the management network ?)
>
>         When I add the zone in the UI I have one physical network with
> management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
>         When I fill in the storage traffic page I use the range
> 172.30.5.10 - 14 as free IPs as I exclude the ones already allocated to the
> compute nodes and the NFS server.
>
>         I think maybe I am doing something wrong in the UI setup but it is
> not obvious to me what it is.
>
>         What I might try today unless you want me to keep the setup I have
> for more outputs is to go back to 2 NICs, one for storage/management and
> one for guest VMs.
>
>         I think with the 2 NICs setup the mistake I made last time when
> adding the zone was to assume storage would just run over management so I
> did not drag and drop the storage icon and assign it to cloudbr0 as with
> the management which I think is what I should do ?
>
>
>
>
>
>         ________________________________
>         From: Rafael Weingärtner <ra...@gmail.com>
>         Sent: 06 June 2018 10:54
>         To: users
>         Subject: Re: advanced networking with public IPs direct to VMs
>
>         Jon, do not panic we are here to help you :)
>         So, I might have mistyped the SQL query. If you use "select * from
>         cloud.storage_pool where cluster_id = 1 and removed is not null",
>         you are listing the storage pools that have been removed. Therefore,
>         the right query would be "select * from cloud.storage_pool where
>         cluster_id = 1 and removed is null".
>
>         There is also something else I do not understand. You are setting the
>         storage IP in the management subnet? I am not sure if you should be
>         doing this. Normally, I set all my storages (primary [when working
>         with NFS] and secondary) to IPs in the storage subnet.
>
>         On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <
> Dag.Sonstebo@shapeblue.com>
>         wrote:
>
>         > Hi John,
>         >
>         > I’m late to this thread and have possibly missed some things –
> but a
>         > couple of observations:
>         >
>         > “When I add the zone and get to the storage web page I exclude
> the IPs
>         > already used for the compute node NICs and the NFS server
> itself. …..”
>         > “So the range is 172.30.5.1 -> 15 and the range I fill in is
> 172.30.5.10
>         > -> 172.30.5.14.”
>         >
>         > I think you may have some confusion around the use of the
> storage network.
>         > The important part here is to understand this is for *secondary
> storage*
>         > use only – it has nothing to do with primary storage. This means
> this
>         > storage network needs to be accessible to the SSVM, to the
> hypervisors, and
>         > secondary storage NFS pools need to be accessible on this
> network.
>         >
>         > The important part – this also means you *can not use the same
> IP ranges
>         > for management and storage networks* - doing so means you will
> have issues
>         > where effectively both hypervisors and SSVM can see the same
> subnet on two
>         > NICs – and you end up in a routing black hole.
>         >
>         > So – you need to either:
>         >
>         > 1) Use different IP subnets on management and storage, or
>         > 2) preferably just simplify your setup – stop using a secondary
> storage
>         > network altogether and just allow secondary storage to use the
> management
>         > network (which is default). Unless you have a very high I/O
> environment in
>         > production you are just adding complexity by running separate
> management
>         > and storage.
>         >
>         > Regards,
>         > Dag Sonstebo
>         > Cloud Architect
>         > ShapeBlue
>         >
>         > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk>
> wrote:
>         >
>         >     I will disconnect the host this morning and test but before
> I do that
>         > I ran this command when all hosts are up -
>         >
>         >
>         >
>         >
>         >
>         >      select * from cloud.host;
>         >     +----+-----------------+------------------------------------
>         > --+--------+--------------------+--------------------+------
>         > -----------+---------------------+--------------------+-----
>         > ------------+---------------------+----------------------+--
>         > ---------------------+-------------------+------------+-----
>         > --------------+-----------------+--------------------+------
>         > ------+----------------+--------+-------------+------+------
>         > -+-------------------------------------+---------+----------
>         > -------+--------------------+------------+----------+-------
>         > ---+--------+------------+--------------+-------------------
>         > --------------------------------------------+-----------+---
>         > ----+-------------+------------+----------------+-----------
>         > ----------+---------------------+---------+--------------+--
>         > --------------+-------+-------------+--------------+
>         >     | id | name            | uuid
>  | status
>         > | type               | private_ip_address | private_netmask |
>         > private_mac_address | storage_ip_address | storage_netmask |
>         > storage_mac_address | storage_ip_address_2 |
> storage_mac_address_2 |
>         > storage_netmask_2 | cluster_id | public_ip_address |
> public_netmask  |
>         > public_mac_address | proxy_port | data_center_id | pod_id |
> cpu_sockets |
>         > cpus | speed | url                                 | fs_type |
>         > hypervisor_type | hypervisor_version | ram        | resource |
> version  |
>         > parent | total_size | capabilities | guid
>         >                         | available | setup | dom0_memory |
> last_ping  |
>         > mgmt_server_id | disconnected        | created             |
> removed |
>         > update_count | resource_state | owner | lastUpdated |
> engine_state |
>         >     +----+-----------------+------------------------------------
>         > --+--------+--------------------+--------------------+------
>         > -----------+---------------------+--------------------+-----
>         > ------------+---------------------+----------------------+--
>         > ---------------------+-------------------+------------+-----
>         > --------------+-----------------+--------------------+------
>         > ------+----------------+--------+-------------+------+------
>         > -+-------------------------------------+---------+----------
>         > -------+--------------------+------------+----------+-------
>         > ---+--------+------------+--------------+-------------------
>         > --------------------------------------------+-----------+---
>         > ----+-------------+------------+----------------+-----------
>         > ----------+---------------------+---------+--------------+--
>         > --------------+-------+-------------+--------------+
>         >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284
> | Up
>         >  | Routing            | 172.30.3.3         | 255.255.255.192 |
>         > 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
>         > 00:22:19:92:4e:34   | NULL                 | NULL
>   | NULL
>         >             |          1 | 172.30.4.3        | 255.255.255.128 |
>         > 00:22:19:92:4e:35  |       NULL |              1 |      1 |
>      1 |
>         >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    |
> KVM
>         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> NULL   |
>         >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-
> LibvirtComputingResource
>         > |         1 |     0 |           0 | 1492390408 |   146457912294 |
>         > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |
> 4 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0
> | Up
>         >  | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 |
>         > 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 |
>         > 1e:00:80:00:00:14   | NULL                 | NULL
>   | NULL
>         >             |       NULL | 172.30.4.98       | 255.255.255.128 |
>         > 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |
>   NULL |
>         > NULL |  NULL | NoIqn                               | NULL    |
> NULL
>         >     | NULL               |          0 | NULL     | 4.11.0.0 |
> NULL   |
>         >  NULL | NULL         | Proxy.2-ConsoleProxyResource
>         >           |         1 |     0 |           0 | 1492390409 |
>  146457912294 |
>         > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |
> 7 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c
> | Up
>         >  | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 |
>         > 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 |
>         > 1e:00:3b:00:00:05   | NULL                 | NULL
>   | NULL
>         >             |       NULL | 172.30.4.86       | 255.255.255.128 |
>         > 1e:00:d9:00:00:53  |       NULL |              1 |      1 |
>   NULL |
>         > NULL |  NULL | NoIqn                               | NULL    |
> NULL
>         >     | NULL               |          0 | NULL     | 4.11.0.0 |
> NULL   |
>         >  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
>         >             |         1 |     0 |           0 | 1492390407 |
>  146457912294
>         > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |
>   7 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794
> | Up
>         >  | Routing            | 172.30.3.4         | 255.255.255.192 |
>         > 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
>         > 00:26:b9:4a:97:7d   | NULL                 | NULL
>   | NULL
>         >             |          1 | 172.30.4.4        | 255.255.255.128 |
>         > 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |
>      1 |
>         >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    |
> KVM
>         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> NULL   |
>         >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-
> LibvirtComputingResource
>         > |         1 |     0 |           0 | 1492450882 |   146457912294 |
>         > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |
> 8 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274
> | Up
>         >  | Routing            | 172.30.3.5         | 255.255.255.192 |
>         > 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
>         > 00:24:e8:73:6a:b2   | NULL                 | NULL
>   | NULL
>         >             |          1 | 172.30.4.5        | 255.255.255.128 |
>         > 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |
>      1 |
>         >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    |
> KVM
>         >      | NULL               | 7510159360 | NULL     | 4.11.0.0 |
> NULL   |
>         >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-
> LibvirtComputingResource
>         > |         1 |     0 |           0 | 1492390408 |   146457912294 |
>         > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |
> 6 |
>         > Enabled        | NULL  | NULL        | Disabled     |
>         >     +----+-----------------+------------------------------------
>         > --+--------+--------------------+--------------------+------
>         > -----------+---------------------+--------------------+-----
>         > ------------+---------------------+----------------------+--
>         > ---------------------+-------------------+------------+-----
>         > --------------+-----------------+--------------------+------
>         > ------+----------------+--------+-------------+------+------
>         > -+-------------------------------------+---------+----------
>         > -------+--------------------+------------+----------+-------
>         > ---+--------+------------+--------------+-------------------
>         > --------------------------------------------+-----------+---
>         > ----+-------------+------------+----------------+-----------
>         > ----------+---------------------+---------+--------------+--
>         > --------------+-------+-------------+--------------+
>         >     5 rows in set (0.00 sec)
>         >
>         >
>         >
>         >     and you can see that it says the storage IP address is the
> same as the
>         > private IP address (the management network).
>         >
>         >
>         >     I also ran the command you provided using the Cluster ID
> number from
>         > the table above -
>         >
>         >
>         >
>         >     mysql> select * from cloud.storage_pool where cluster_id = 1
> and
>         > removed is not null;
>         >     Empty set (0.00 sec)
>         >
>         >     mysql>
>         >
>         >     So assuming I am reading this correctly that seems to be the
> issue.
>         >
>         >
>         >     I am at a loss as to why though.
>         >
>         >
>         >     I have a separate NIC for storage as described. When I add
> the zone
>         > and get to the storage web page I exclude the IPs already used
> for the
>         > compute node NICs and the NFS server itself. I do this because
> initially I
>         > didn't and the SSVM started using the IP address of the NFS
> server.
>         >
>         >
>         >     So the range is 172.30.5.1 -> 15 and the range I fill in is
>         > 172.30.5.10 -> 172.30.5.14.
>         >
>         >
>         >     And I used the label "cloudbr2" for storage.
>         >
>         >
>         >     I must be doing this wrong somehow.
>         >
>         >
>         >     Any pointers would be much appreciated.
>         >
>         >
>         >
>         >
>         >     ________________________________
>         >     From: Rafael Weingärtner <ra...@gmail.com>
>         >     Sent: 05 June 2018 16:13
>         >     To: users
>         >     Subject: Re: advanced networking with public IPs direct to
> VMs
>         >
>         >     That is interesting. Let's see the source of all truth...
>         >     This is the code that is generating that odd message.
>         >
>         >     > List<StoragePoolVO> clusterPools =
>         >     >         _storagePoolDao.listPoolsByCluster(agent.getClusterId());
>         >     > boolean hasNfs = false;
>         >     > for (StoragePoolVO pool : clusterPools) {
>         >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
>         >     >         hasNfs = true;
>         >     >         break;
>         >     >     }
>         >     > }
>         >     > if (!hasNfs) {
>         >     >     s_logger.warn("Agent investigation was requested on host " + agent
>         >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
>         >     >     return Status.Disconnected;
>         >     > }
>         >     >
>         >
>         >     There are two possibilities here. You do not have any NFS
> storage? Is
>         > that
>         >     the case? Or maybe, for some reason, the call
>         >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())"
> is not
>         > returning
>         >     any NFS storage pools. Looking at the "listPoolsByCluster "
> we will see
>         >     that the following SQL is used:
>         >
>         >     Select * from storage_pool where cluster_id =
> <host'sClusterId> and
>         > removed
>         >     > is not null
>         >     >
>         >
>         >     Can you run that SQL to see its return when your hosts
> are marked
>         > as
>         >     disconnected?
>         >
>         >
>         >
>         >
>         >
>         > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <
> jms.123@hotmail.co.uk>
>         > wrote:
>         >
>         >     > I reran the tests with the 3 NIC setup. When I configured
> the zone
>         > through
>         >     > the UI I used the labels cloudbr0 for management, cloudbr1
> for guest
>         >     > traffic and cloudbr2 for NFS as per my original response
> to you.
>         >     >
>         >     >
>         >     > When I pull the power to the node (dcp-cscn2.local) after
> about 5
>         > mins
>         >     > the host status goes to "Alert" but never to "Down"
>         >     >
>         >     >
>         >     > I get this in the logs -
>         >     >
>         >     >
>         >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
>         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
>         >     > requested on host Host[-4-Routing], but host does not support investigation
>         >     > because it has no NFS storage. Skipping investigation.
>         >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
>         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
>         >     > determine host 4 is in Disconnected
>         >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
>         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
>         >     > determined is Disconnected
>         >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
>         >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
>         >     > the host is still up: 4-dcp-cscn2.local
>         >     >
>         >     > I don't understand why it thinks there is no NFS storage
> as each
>         > compute
>         >     > node has a dedicated storage NIC.
>         >     >
>         >     >
>         >     > I also don't understand why it thinks the host is still up
> ie. what
>         > test
>         >     > is it doing to determine that ?
>         >     >
>         >     >
>         >     > Am I just trying to get something working that is not
> supported ?
>         >     >
>         >     >
>         >     > ________________________________
>         >     > From: Rafael Weingärtner <ra...@gmail.com>
>         >     > Sent: 04 June 2018 15:31
>         >     > To: users
>         >     > Subject: Re: advanced networking with public IPs direct to
> VMs
>         >     >
>         >     > What type of failover are you talking about?
>         >     > What ACS version are you using?
>         >     > What hypervisor are you using?
>         >     > How are you configuring your NICs in the hypervisor?
>         >     > How are you configuring the traffic labels in ACS?
>         >     >
>         >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <
> jms.123@hotmail.co.uk
>         > >
>         >     > wrote:
>         >     >
>         >     > > Hi all
>         >     > >
>         >     > >
>         >     > > I am close to giving up on basic networking as I just
> cannot get
>         > failover
>         >     > > working with multiple NICs (I am not even sure it is
> supported).
>         >     > >
>         >     > >
>         >     > > What I would like is to use 3 NICs for management,
> storage and
>         > guest
>         >     > > traffic. I would like to assign public IPs direct to the
> VMs which
>         > is
>         >     > why I
>         >     > > originally chose basic.
>         >     > >
>         >     > >
>         >     > > If I switch to advanced networking do I just configure a
> guest VM
>         > with
>         >     > > public IPs on one NIC and not both with the public
> traffic -
>         >     > >
>         >     > >
>         >     > > would this work ?
>         >     > >
>         >     >
>         >     >
>         >     >
>         >     > --
>         >     > Rafael Weingärtner
>         >     >
>         >
>         >
>         >
>         >     --
>         >     Rafael Weingärtner
>         >
>         >
>         >
>
>
>         --
>         Rafael Weingärtner
>
>
>
>     Dag.Sonstebo@shapeblue.com
>     www.shapeblue.com
>     53 Chandos Place, Covent Garden, London WC2N 4HS, UK
>     @shapeblue
>
>
>
>
>
>
> Dag.Sonstebo@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London WC2N 4HS, UK
> @shapeblue
>
>
>
>


-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Dag Sonstebo <Da...@shapeblue.com>.
Hi Jon,

Still confused where your primary storage pools are – are you sure your hosts are in cluster 1?

Quick question just to make sure - assuming management/storage share the same NIC: when I set up basic networking, the physical network already has the management and guest icons there and I just edit the KVM labels. If I am running storage over management, do I need to drag the storage icon to the physical network and use the same KVM label (cloudbr0) as the management, or does CS automatically just use the management NIC? i.e. would I only need to drag the storage icon across in basic setup if I wanted it on a different NIC/IP subnet? (hope that makes sense!)

>> I would do both – set up your 2/3 physical networks, name isn’t that important – but then drag the traffic types to the correct one and make sure the labels are correct.
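
For completeness, two quick checks against the database should confirm both points. This is only a sketch from memory of the 4.11 schema - the cloud.host columns appear in your earlier select output, but the physical_network_traffic_types table and its kvm_network_label column are my assumption, so adjust the names if they differ on your install:

    -- which cluster each hypervisor host actually sits in
    select id, name, cluster_id, status
    from cloud.host
    where type = 'Routing' and removed is null;

    -- which KVM label is bound to each traffic type
    select physical_network_id, traffic_type, kvm_network_label
    from cloud.physical_network_traffic_types;

If the hosts report a cluster_id other than 1, the earlier storage_pool query was simply filtering on the wrong cluster; if a label looks wrong here, that points back at the zone wizard instead.
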
Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 12:39, "Jon Marshall" <jm...@hotmail.co.uk> wrote:

    Dag
    
    
    Do you mean check the pools with "Infrastructure -> Primary Storage" and "Infrastructure -> Secondary Storage" within the UI ?
    
    
    If so, Primary Storage has a state of UP; secondary storage does not show a state as such, so I'm not sure where else to check it ?
    
    
    Rerun of the command -
    
    mysql> select * from cloud.storage_pool where cluster_id = 1;
    Empty set (0.00 sec)
    
    mysql>
    
    I think it is something to do with my zone creation rather than the NIC/bridge setup, although I can post those if needed.
    
    I may try to set up just the 2 NIC solution you mentioned, although as I say I had the same issue with that, ie. host goes to "Alert" state with the same error messages.  The only time I can get it to go to "Down" state is when it is all on the single NIC.
    
    Quick question just to make sure - assuming management/storage share the same NIC: when I set up basic networking, the physical network already has the management and guest icons there and I just edit the KVM labels. If I am running storage over management, do I need to drag the storage icon to the physical network and use the same KVM label (cloudbr0) as the management, or does CS automatically just use the management NIC? i.e. would I only need to drag the storage icon across in basic setup if I wanted it on a different NIC/IP subnet? (hope that makes sense!)
    
    On the plus side I have been at this for so long now and done so many rebuilds I could do it in my sleep now 😊
    
    
    ________________________________
    From: Dag Sonstebo <Da...@shapeblue.com>
    Sent: 06 June 2018 12:28
    To: users@cloudstack.apache.org
    Subject: Re: advanced networking with public IPs direct to VMs
    
    Looks OK to me Jon.
    
    The one thing that throws me is your storage pools – can you rerun your query: select * from cloud.storage_pool where cluster_id = 1;
    
    Do the pools show up as online in the CloudStack GUI?
    
    Regards,
    Dag Sonstebo
    Cloud Architect
    ShapeBlue
    
    On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
    
        Don't know whether this helps or not but I logged into the SSVM and ran an ifconfig -
    
    
        eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
                ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
                RX packets 141  bytes 20249 (19.7 KiB)
                RX errors 0  dropped 0  overruns 0  frame 0
                TX packets 108  bytes 16287 (15.9 KiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
        eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
                ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
                RX packets 56722  bytes 4953133 (4.7 MiB)
                RX errors 0  dropped 44573  overruns 0  frame 0
                TX packets 11224  bytes 1234932 (1.1 MiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
        eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
                ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
                RX packets 366191  bytes 435300557 (415.1 MiB)
                RX errors 0  dropped 39456  overruns 0  frame 0
                TX packets 145065  bytes 7978602 (7.6 MiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
        eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
                inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
                ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
                RX packets 132440  bytes 426362982 (406.6 MiB)
                RX errors 0  dropped 39446  overruns 0  frame 0
                TX packets 67443  bytes 423670834 (404.0 MiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
        lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
                inet 127.0.0.1  netmask 255.0.0.0
                loop  txqueuelen 1  (Local Loopback)
                RX packets 18  bytes 1440 (1.4 KiB)
                RX errors 0  dropped 0  overruns 0  frame 0
                TX packets 18  bytes 1440 (1.4 KiB)
                TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    
        so it has interfaces in both the management and the storage subnets (as well as guest).
    
    
    
        ________________________________
        From: Jon Marshall <jm...@hotmail.co.uk>
        Sent: 06 June 2018 11:08
        To: users@cloudstack.apache.org
        Subject: Re: advanced networking with public IPs direct to VMs
    
        Hi Rafael
    
    
        Thanks for the help, really appreciate it.
    
    
        So rerunning that command with all servers up -
    
    
    
        mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
        Empty set (0.00 sec)
    
        mysql>
    
    
        As for the storage IP: no, I'm not setting it to the management IP when I set up the zone, but the output of the SQL command suggests that is what has happened.
    
        As I said to Dag I am using a different subnet for storage ie.
    
        172.30.3.0/26  - management subnet
        172.30.4.0/25 -  guest VM subnet
        172.30.5.0/28 - storage
    
        the NFS server IP is 172.30.5.2
    
        Each compute node has 3 NICs with an IP from each subnet (I am assuming the management node only needs an IP in the management network ?)
    
        When I add the zone in the UI I have one physical network with management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
        When I fill in the storage traffic page I use the range 172.30.5.10 - 14 as free IPs, as I exclude the ones already allocated to the compute nodes and the NFS server.
    
        I think maybe I am doing something wrong in the UI setup but it is not obvious to me what it is.
    
        What I might try today, unless you want me to keep the setup I have for more outputs, is to go back to 2 NICs, one for storage/management and one for guest VMs.
    
        I think with the 2 NIC setup the mistake I made last time, when adding the zone, was to assume storage would just run over management, so I did not drag and drop the storage icon and assign it to cloudbr0 as with the management - which I think is what I should do ?
    
    
    
    
    
        ________________________________
        From: Rafael Weingärtner <ra...@gmail.com>
        Sent: 06 June 2018 10:54
        To: users
        Subject: Re: advanced networking with public IPs direct to VMs
    
        Jon, do not panic, we are here to help you :)
        So, I might have mistyped the SQL query. If you use "select * from
        cloud.storage_pool where cluster_id = 1 and removed is not null", you are
        listing the removed storage pools. Therefore, the right query would be
        "select * from cloud.storage_pool where cluster_id = 1 and removed is null".
    
        There is also something else I do not understand. You are setting the
        storage IP in the management subnet? I am not sure if you should be doing
        it like this. Normally, I set all my storage (primary [when working with NFS]
        and secondary) to IPs in the storage subnet.
    
        On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Da...@shapeblue.com>
        wrote:
    
        > Hi Jon,
        >
        > I’m late to this thread and have possibly missed some things – but a
        > couple of observations:
        >
        > “When I add the zone and get to the storage web page I exclude the IPs
        > already used for the compute node NICs and the NFS server itself. …..”
        > “So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10
        > -> 172.30.5.14.”
        >
        > I think you may have some confusion around the use of the storage network.
        > The important part here is to understand this is for *secondary storage*
        > use only – it has nothing to do with primary storage. This means this
        > storage network needs to be accessible to the SSVM, to the hypervisors, and
        > secondary storage NFS pools need to be accessible on this network.
        >
        > The important part – this also means you *can not use the same IP ranges
        > for management and storage networks* - doing so means you will have issues
        > where effectively both hypervisors and SSVM can see the same subnet on two
        > NICs – and you end up in a routing black hole.
        >
        > So – you need to either:
        >
        > 1) Use different IP subnets on management and storage, or
        > 2) preferably just simplify your setup – stop using a secondary storage
        > network altogether and just allow secondary storage to use the management
        > network (which is default). Unless you have a very high I/O environment in
        > production you are just adding complexity by running separate management
        > and storage.
        >
        > Regards,
        > Dag Sonstebo
        > Cloud Architect
        > ShapeBlue
        >
        > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
        >
        >     I will disconnect the host this morning and test but before I do that
        > I ran this command when all hosts are up -
        >
        >
        >
        >
        >
        >      select * from cloud.host;
        >     +----+-----------------+------------------------------------
        > --+--------+--------------------+--------------------+------
        > -----------+---------------------+--------------------+-----
        > ------------+---------------------+----------------------+--
        > ---------------------+-------------------+------------+-----
        > --------------+-----------------+--------------------+------
        > ------+----------------+--------+-------------+------+------
        > -+-------------------------------------+---------+----------
        > -------+--------------------+------------+----------+-------
        > ---+--------+------------+--------------+-------------------
        > --------------------------------------------+-----------+---
        > ----+-------------+------------+----------------+-----------
        > ----------+---------------------+---------+--------------+--
        > --------------+-------+-------------+--------------+
        >     | id | name            | uuid                                 | status
        > | type               | private_ip_address | private_netmask |
        > private_mac_address | storage_ip_address | storage_netmask |
        > storage_mac_address | storage_ip_address_2 | storage_mac_address_2 |
        > storage_netmask_2 | cluster_id | public_ip_address | public_netmask  |
        > public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets |
        > cpus | speed | url                                 | fs_type |
        > hypervisor_type | hypervisor_version | ram        | resource | version  |
        > parent | total_size | capabilities | guid
        >                         | available | setup | dom0_memory | last_ping  |
        > mgmt_server_id | disconnected        | created             | removed |
        > update_count | resource_state | owner | lastUpdated | engine_state |
        >     +----+-----------------+------------------------------------
        > --+--------+--------------------+--------------------+------
        > -----------+---------------------+--------------------+-----
        > ------------+---------------------+----------------------+--
        > ---------------------+-------------------+------------+-----
        > --------------+-----------------+--------------------+------
        > ------+----------------+--------+-------------+------+------
        > -+-------------------------------------+---------+----------
        > -------+--------------------+------------+----------+-------
        > ---+--------+------------+--------------+-------------------
        > --------------------------------------------+-----------+---
        > ----+-------------+------------+----------------+-----------
        > ----------+---------------------+---------+--------------+--
        > --------------+-------+-------------+--------------+
        >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up
        >  | Routing            | 172.30.3.3         | 255.255.255.192 |
        > 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
        > 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL
        >             |          1 | 172.30.4.3        | 255.255.255.128 |
        > 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |
        >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM
        >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
        >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
        > |         1 |     0 |           0 | 1492390408 |   146457912294 |
        > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 |
        > Enabled        | NULL  | NULL        | Disabled     |
        >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up
        >  | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 |
        > 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 |
        > 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL
        >             |       NULL | 172.30.4.98       | 255.255.255.128 |
        > 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL |
        > NULL |  NULL | NoIqn                               | NULL    | NULL
        >     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
        >  NULL | NULL         | Proxy.2-ConsoleProxyResource
        >           |         1 |     0 |           0 | 1492390409 |   146457912294 |
        > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 |
        > Enabled        | NULL  | NULL        | Disabled     |
        >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up
        >  | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 |
        > 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 |
        > 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL
        >             |       NULL | 172.30.4.86       | 255.255.255.128 |
        > 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL |
        > NULL |  NULL | NoIqn                               | NULL    | NULL
        >     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
        >  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
        >             |         1 |     0 |           0 | 1492390407 |   146457912294
        > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 |
        > Enabled        | NULL  | NULL        | Disabled     |
        >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up
        >  | Routing            | 172.30.3.4         | 255.255.255.192 |
        > 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
        > 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL
        >             |          1 | 172.30.4.4        | 255.255.255.128 |
        > 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |
        >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM
        >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
        >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
        > |         1 |     0 |           0 | 1492450882 |   146457912294 |
        > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 |
        > Enabled        | NULL  | NULL        | Disabled     |
        >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up
        >  | Routing            | 172.30.3.5         | 255.255.255.192 |
        > 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
        > 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL
        >             |          1 | 172.30.4.5        | 255.255.255.128 |
        > 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |
        >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM
        >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
        >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
        > |         1 |     0 |           0 | 1492390408 |   146457912294 |
        > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 |
        > Enabled        | NULL  | NULL        | Disabled     |
        >     +----+-----------------+------------------------------------
        > --+--------+--------------------+--------------------+------
        > -----------+---------------------+--------------------+-----
        > ------------+---------------------+----------------------+--
        > ---------------------+-------------------+------------+-----
        > --------------+-----------------+--------------------+------
        > ------+----------------+--------+-------------+------+------
        > -+-------------------------------------+---------+----------
        > -------+--------------------+------------+----------+-------
        > ---+--------+------------+--------------+-------------------
        > --------------------------------------------+-----------+---
        > ----+-------------+------------+----------------+-----------
        > ----------+---------------------+---------+--------------+--
        > --------------+-------+-------------+--------------+
        >     5 rows in set (0.00 sec)
        >
        >
        >
        >     and you can see that it says the storage IP address is the same as the
        > private IP address (the management network).
        >
        >
        >     I also ran the command you provided using the Cluster ID number from
        > the table above -
        >
        >
        >
        >     mysql> select * from cloud.storage_pool where cluster_id = 1 and
        > removed is not null;
        >     Empty set (0.00 sec)
        >
        >     mysql>
        >
        >     So assuming I am reading this correctly that seems to be the issue.
        >
        >
        >     I am at a loss as to why though.
        >
        >
        >     I have a separate NIC for storage as described. When I add the zone
        > and get to the storage web page I exclude the IPs already used for the
        > compute node NICs and the NFS server itself. I do this because initially I
        > didn't and the SSVM started using the IP address of the NFS server.
        >
        >
        >     So the range is 172.30.5.1 -> 15 and the range I fill in is
        > 172.30.5.10 -> 172.30.5.14.
        >
        >
        >     And I used the label "cloudbr2" for storage.
        >
        >
        >     I must be doing this wrong somehow.
        >
        >
        >     Any pointers would be much appreciated.
        >
        >
        >
        >
        >     ________________________________
        >     From: Rafael Weingärtner <ra...@gmail.com>
        >     Sent: 05 June 2018 16:13
        >     To: users
        >     Subject: Re: advanced networking with public IPs direct to VMs
        >
        >     That is interesting. Let's see the source of all truth...
        >     This is the code that is generating that odd message.
        >
        >     > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
        >     > boolean hasNfs = false;
        >     > for (StoragePoolVO pool : clusterPools) {
        >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
        >     >         hasNfs = true;
        >     >         break;
        >     >     }
        >     > }
        >     > if (!hasNfs) {
        >     >     s_logger.warn("Agent investigation was requested on host " + agent
        >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
        >     >     return Status.Disconnected;
        >     > }
        >     >
        >
        >     There are two possibilities here. You do not have any NFS storage? Is that
        >     the case? Or maybe, for some reason, the call
        >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
        >     any NFS storage pools. Looking at the "listPoolsByCluster" we will see
        >     that the following SQL is used:
        >
        >     > Select * from storage_pool where cluster_id = <host'sClusterId>
        >     > and removed is not null
        >     >
        >
        >     Can you run that SQL to see its return when your hosts are marked
        >     as disconnected?
        >
        >
        > Dag.Sonstebo@shapeblue.com
        > www.shapeblue.com
    
    
    
        > 53 Chandos Place, Covent Garden, London WC2N 4HS, UK
        > @shapeblue
        >
        >
        >
        > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk>
        > wrote:
        >
        >     > I reran the tests with the 3 NIC setup. When I configured the zone
        > through
        >     > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
        >     > traffic and cloudbr2 for NFS as per my original response to you.
        >     >
        >     >
        >     > When I pull the power to the node (dcp-cscn2.local) after about 5
        > mins
        >     > the  host status goes to "Alert" but never to "Down"
        >     >
        >     >
        >     > I get this in the logs -
        >     >
        >     >
        >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
        >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
        >     > requested on host Host[-4-Routing], but host does not support investigation
        >     > because it has no NFS storage. Skipping investigation.
        >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
        >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
        >     > determine host 4 is in Disconnected
        >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
        >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
        >     > determined is Disconnected
        >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
        >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
        >     > the host is still up: 4-dcp-cscn2.local
        >     >
        >     > I don't understand why it thinks there is no NFS storage as each
        > compute
        >     > node has a dedicated storage NIC.
        >     >
        >     >
        >     > I also don't understand why it thinks the host is still up ie. what
        > test
        >     > is it doing to determine that ?
        >     >
        >     >
        >     > Am I just trying to get something working that is not supported ?
        >     >
        >     >
        >     > ________________________________
        >     > From: Rafael Weingärtner <ra...@gmail.com>
        >     > Sent: 04 June 2018 15:31
        >     > To: users
        >     > Subject: Re: advanced networking with public IPs direct to VMs
        >     >
        >     > What type of failover are you talking about?
        >     > What ACS version are you using?
        >     > What hypervisor are you using?
        >     > How are you configuring your NICs in the hypervisor?
        >     > How are you configuring the traffic labels in ACS?
        >     >
        >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk
        > >
        >     > wrote:
        >     >
        >     > > Hi all
        >     > >
        >     > >
        >     > > I am close to giving up on basic networking as I just cannot get
        > failover
        >     > > working with multiple NICs (I am not even sure it is supported).
        >     > >
        >     > >
        >     > > What I would like is to use 3 NICs for management, storage and
        > guest
        >     > > traffic. I would like to assign public IPs direct to the VMs which
        > is
        >     > why I
        >     > > originally chose basic.
        >     > >
        >     > >
        >     > > If I switch to advanced networking do I just configure a guest VM
        > with
        >     > > public IPs on one NIC and not both with the public traffic -
        >     > >
        >     > >
        >     > > would this work ?
        >     > >
        >     >
        >     >
        >     >
        >     > --
        >     > Rafael Weingärtner
        >     >
        >
        >
        >
        >     --
        >     Rafael Weingärtner
        >
        >
        >
    
    
        --
        Rafael Weingärtner
    
    
    
    Dag.Sonstebo@shapeblue.com
    www.shapeblue.com
    53 Chandos Place, Covent Garden, London WC2N 4HS, UK
    @shapeblue
    
    
    
    


Dag.Sonstebo@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS, UK
@shapeblue
  
 


Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Dag


Do you mean check the pools with "Infrastructure -> Primary Storage" and "Infrastructure -> Secondary Storage" within the UI ?


If so, Primary Storage has a state of UP; secondary storage does not show a state as such, so I'm not sure where else to check it ?


Rerun of the command -

mysql> select * from cloud.storage_pool where cluster_id = 1;
Empty set (0.00 sec)

mysql>
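
One more thing I plan to check (just a sketch - I'm assuming the storage_pool table carries pool_type and scope columns in 4.11, so the names may need adjusting): whether the NFS pool got registered zone-wide instead of against the cluster, since a zone-scoped pool would have cluster_id NULL and so would never match cluster_id = 1 -

    select id, name, pool_type, scope, cluster_id, data_center_id, removed
    from cloud.storage_pool;

If that shows the pool with scope 'ZONE' and cluster_id NULL, it would explain the empty result above even though Primary Storage shows as UP.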

I think it is something to do with my zone creation rather than the NIC/bridge setup, although I can post those if needed.

I may try to set up just the 2 NIC solution you mentioned, although as I say I had the same issue with that, ie. host goes to "Alert" state with the same error messages.  The only time I can get it to go to "Down" state is when it is all on the single NIC.

Quick question just to make sure - assuming management/storage share the same NIC: when I set up basic networking, the physical network already has the management and guest icons there and I just edit the KVM labels. If I am running storage over management, do I need to drag the storage icon to the physical network and use the same KVM label (cloudbr0) as the management, or does CS automatically just use the management NIC? i.e. would I only need to drag the storage icon across in basic setup if I wanted it on a different NIC/IP subnet? (hope that makes sense!)

On the plus side I have been at this for so long now and done so many rebuilds I could do it in my sleep now 😊


________________________________
From: Dag Sonstebo <Da...@shapeblue.com>
Sent: 06 June 2018 12:28
To: users@cloudstack.apache.org
Subject: Re: advanced networking with public IPs direct to VMs

Looks OK to me Jon.

The one thing that throws me is your storage pools – can you rerun your query: select * from cloud.storage_pool where cluster_id = 1;

Do the pools show up as online in the CloudStack GUI?

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:

    Don't know whether this helps or not but I logged into the SSVM and ran an ifconfig -


    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
            ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
            RX packets 141  bytes 20249 (19.7 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 108  bytes 16287 (15.9 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
            ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
            RX packets 56722  bytes 4953133 (4.7 MiB)
            RX errors 0  dropped 44573  overruns 0  frame 0
            TX packets 11224  bytes 1234932 (1.1 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
            ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
            RX packets 366191  bytes 435300557 (415.1 MiB)
            RX errors 0  dropped 39456  overruns 0  frame 0
            TX packets 145065  bytes 7978602 (7.6 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
            ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
            RX packets 132440  bytes 426362982 (406.6 MiB)
            RX errors 0  dropped 39446  overruns 0  frame 0
            TX packets 67443  bytes 423670834 (404.0 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
            inet 127.0.0.1  netmask 255.0.0.0
            loop  txqueuelen 1  (Local Loopback)
            RX packets 18  bytes 1440 (1.4 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 18  bytes 1440 (1.4 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


    so it has interfaces in both the management and the storage subnets (as well as guest).



    ________________________________
    From: Jon Marshall <jm...@hotmail.co.uk>
    Sent: 06 June 2018 11:08
    To: users@cloudstack.apache.org
    Subject: Re: advanced networking with public IPs direct to VMs

    Hi Rafael


    Thanks for the help, really appreciate it.


    So rerunning that command with all servers up -



    mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
    Empty set (0.00 sec)

    mysql>


    As for the storage IP: no, I'm not setting it to the management IP when I set up the zone, but the output of the SQL command suggests that is what has happened.

    As I said to Dag I am using a different subnet for storage ie.

    172.30.3.0/26  - management subnet
    172.30.4.0/25 -  guest VM subnet
    172.30.5.0/28 - storage

    the NFS server IP is 172.30.5.2

    Each compute node has 3 NICs with an IP from each subnet (I am assuming the management node only needs an IP in the management network ?)

    When I add the zone in the UI I have one physical network with management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
    When I fill in the storage traffic page I use the range 172.30.5.10 - 14 as free IPs, as I exclude the ones already allocated to the compute nodes and the NFS server.

    I think maybe I am doing something wrong in the UI setup but it is not obvious to me what it is.

    What I might try today, unless you want me to keep the setup I have for more outputs, is to go back to 2 NICs, one for storage/management and one for guest VMs.

    I think with the 2 NIC setup the mistake I made last time, when adding the zone, was to assume storage would just run over management, so I did not drag and drop the storage icon and assign it to cloudbr0 as with the management - which I think is what I should do ?





    ________________________________
    From: Rafael Weingärtner <ra...@gmail.com>
    Sent: 06 June 2018 10:54
    To: users
    Subject: Re: advanced networking with public IPs direct to VMs

    Jon, do not panic, we are here to help you :)
    So, I might have mistyped the SQL query. If you use "select * from
    cloud.storage_pool where cluster_id = 1 and removed is not null", you are
    listing the removed storage pools. Therefore, the right query would be
    "select * from cloud.storage_pool where cluster_id = 1 and removed is null".

    There is also something else I do not understand. You are setting the
    storage IP in the management subnet? I am not sure if you should be doing
    it like this. Normally, I set all my storage (primary [when working with NFS]
    and secondary) to IPs in the storage subnet.

    On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Da...@shapeblue.com>
    wrote:

    > Hi Jon,
    >
    > I’m late to this thread and have possibly missed some things – but a
    > couple of observations:
    >
    > “When I add the zone and get to the storage web page I exclude the IPs
    > already used for the compute node NICs and the NFS server itself. …..”
    > “So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10
    > -> 172.30.5.14.”
    >
    > I think you may have some confusion around the use of the storage network.
    > The important part here is to understand this is for *secondary storage*
    > use only – it has nothing to do with primary storage. This means this
    > storage network needs to be accessible to the SSVM, to the hypervisors, and
    > secondary storage NFS pools need to be accessible on this network.
    >
    > The important part – this also means you *can not use the same IP ranges
    > for management and storage networks* - doing so means you will have issues
    > where effectively both hypervisors and SSVM can see the same subnet on two
    > NICs – and you end up in a routing black hole.
    >
    > So – you need to either:
    >
    > 1) Use different IP subnets on management and storage, or
    > 2) preferably just simplify your setup – stop using a secondary storage
    > network altogether and just allow secondary storage to use the management
    > network (which is default). Unless you have a very high I/O environment in
    > production you are just adding complexity by running separate management
    > and storage.
    >
    > Regards,
    > Dag Sonstebo
    > Cloud Architect
    > ShapeBlue
    >
    > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
    >
    >     I will disconnect the host this morning and test but before I do that
    > I ran this command when all hosts are up -
    >
    >
    >
    >
    >
    >      select * from cloud.host;
    >     +----+-----------------+------------------------------------
    > --+--------+--------------------+--------------------+------
    > -----------+---------------------+--------------------+-----
    > ------------+---------------------+----------------------+--
    > ---------------------+-------------------+------------+-----
    > --------------+-----------------+--------------------+------
    > ------+----------------+--------+-------------+------+------
    > -+-------------------------------------+---------+----------
    > -------+--------------------+------------+----------+-------
    > ---+--------+------------+--------------+-------------------
    > --------------------------------------------+-----------+---
    > ----+-------------+------------+----------------+-----------
    > ----------+---------------------+---------+--------------+--
    > --------------+-------+-------------+--------------+
    >     | id | name            | uuid                                 | status
    > | type               | private_ip_address | private_netmask |
    > private_mac_address | storage_ip_address | storage_netmask |
    > storage_mac_address | storage_ip_address_2 | storage_mac_address_2 |
    > storage_netmask_2 | cluster_id | public_ip_address | public_netmask  |
    > public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets |
    > cpus | speed | url                                 | fs_type |
    > hypervisor_type | hypervisor_version | ram        | resource | version  |
    > parent | total_size | capabilities | guid
    >                         | available | setup | dom0_memory | last_ping  |
    > mgmt_server_id | disconnected        | created             | removed |
    > update_count | resource_state | owner | lastUpdated | engine_state |
    >     +----+-----------------+------------------------------------
    > --+--------+--------------------+--------------------+------
    > -----------+---------------------+--------------------+-----
    > ------------+---------------------+----------------------+--
    > ---------------------+-------------------+------------+-----
    > --------------+-----------------+--------------------+------
    > ------+----------------+--------+-------------+------+------
    > -+-------------------------------------+---------+----------
    > -------+--------------------+------------+----------+-------
    > ---+--------+------------+--------------+-------------------
    > --------------------------------------------+-----------+---
    > ----+-------------+------------+----------------+-----------
    > ----------+---------------------+---------+--------------+--
    > --------------+-------+-------------+--------------+
    >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up
    >  | Routing            | 172.30.3.3         | 255.255.255.192 |
    > 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
    > 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL
    >             |          1 | 172.30.4.3        | 255.255.255.128 |
    > 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |
    >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM
    >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
    >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
    > |         1 |     0 |           0 | 1492390408 |   146457912294 |
    > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up
    >  | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 |
    > 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 |
    > 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL
    >             |       NULL | 172.30.4.98       | 255.255.255.128 |
    > 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL |
    > NULL |  NULL | NoIqn                               | NULL    | NULL
    >     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
    >  NULL | NULL         | Proxy.2-ConsoleProxyResource
    >           |         1 |     0 |           0 | 1492390409 |   146457912294 |
    > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up
    >  | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 |
    > 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 |
    > 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL
    >             |       NULL | 172.30.4.86       | 255.255.255.128 |
    > 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL |
    > NULL |  NULL | NoIqn                               | NULL    | NULL
    >     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
    >  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
    >             |         1 |     0 |           0 | 1492390407 |   146457912294
    > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up
    >  | Routing            | 172.30.3.4         | 255.255.255.192 |
    > 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
    > 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL
    >             |          1 | 172.30.4.4        | 255.255.255.128 |
    > 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |
    >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM
    >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
    >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
    > |         1 |     0 |           0 | 1492450882 |   146457912294 |
    > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up
    >  | Routing            | 172.30.3.5         | 255.255.255.192 |
    > 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
    > 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL
    >             |          1 | 172.30.4.5        | 255.255.255.128 |
    > 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |
    >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM
    >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
    >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
    > |         1 |     0 |           0 | 1492390408 |   146457912294 |
    > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     +----+-----------------+------------------------------------
    > --+--------+--------------------+--------------------+------
    > -----------+---------------------+--------------------+-----
    > ------------+---------------------+----------------------+--
    > ---------------------+-------------------+------------+-----
    > --------------+-----------------+--------------------+------
    > ------+----------------+--------+-------------+------+------
    > -+-------------------------------------+---------+----------
    > -------+--------------------+------------+----------+-------
    > ---+--------+------------+--------------+-------------------
    > --------------------------------------------+-----------+---
    > ----+-------------+------------+----------------+-----------
    > ----------+---------------------+---------+--------------+--
    > --------------+-------+-------------+--------------+
    >     5 rows in set (0.00 sec)
    >
    >
    >
    >     and you can see that it says the storage IP address is the same as the
    > private IP address (the management network).
    >
    >
    >     I also ran the command you provided using the Cluster ID number from
    > the table above -
    >
    >
    >
    >     mysql> select * from cloud.storage_pool where cluster_id = 1 and
    > removed is not null;
    >     Empty set (0.00 sec)
    >
    >     mysql>
    >
    >     So assuming I am reading this correctly that seems to be the issue.
    >
    >
    >     I am at a loss as to why though.
    >
    >
    >     I have a separate NIC for storage as described. When I add the zone
    > and get to the storage web page I exclude the IPs already used for the
    > compute node NICs and the NFS server itself. I do this because initially I
    > didn't and the SSVM started using the IP address of the NFS server.
    >
    >
    >     So the range is 172.30.5.1 -> 15 and the range I fill in is
    > 172.30.5.10 -> 172.30.5.14.
    >
    >
    >     And I used the label "cloudbr2" for storage.
    >
    >
    >     I must be doing this wrong somehow.
    >
    >
    >     Any pointers would be much appreciated.
    >
    >
    >
    >
    >     ________________________________
    >     From: Rafael Weingärtner <ra...@gmail.com>
    >     Sent: 05 June 2018 16:13
    >     To: users
    >     Subject: Re: advanced networking with public IPs direct to VMs
    >
    >     That is interesting. Let's see the source of all truth...
    >     This is the code that is generating that odd message.
    >
    >     > List<StoragePoolVO> clusterPools = _storagePoolDao.listPoolsByCluster(agent.getClusterId());
    >     > boolean hasNfs = false;
    >     > for (StoragePoolVO pool : clusterPools) {
    >     >     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
    >     >         hasNfs = true;
    >     >         break;
    >     >     }
    >     > }
    >     > if (!hasNfs) {
    >     >     s_logger.warn("Agent investigation was requested on host " + agent
    >     >             + ", but host does not support investigation because it has no NFS storage. Skipping investigation.");
    >     >     return Status.Disconnected;
    >     > }
    >     >
    >
    >     There are two possibilities here. You do not have any NFS storage? Is that
    >     the case? Or maybe, for some reason, the call
    >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
    >     any NFS storage pools. Looking at the "listPoolsByCluster" we will see
    >     that the following SQL is used:
    >
    >     > Select * from storage_pool where cluster_id = <host'sClusterId>
    >     > and removed is not null
    >     >
    >
    >     Can you run that SQL to see its return when your hosts are marked
    >     as disconnected?
    >
    >
    > Dag.Sonstebo@shapeblue.com
    > www.shapeblue.com



    > 53 Chandos Place, Covent Garden, London WC2N 4HS, UK
    > @shapeblue
    >
    >
    >
    > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk>
    > wrote:
    >
    >     > I reran the tests with the 3 NIC setup. When I configured the zone
    > through
    >     > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
    >     > traffic and cloudbr2 for NFS as per my original response to you.
    >     >
    >     >
    >     > When I pull the power to the node (dcp-cscn2.local) after about 5
    > mins
    >     > the  host status goes to "Alert" but never to "Down"
    >     >
    >     >
    >     > I get this in the logs -
    >     >
    >     >
    >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
    >     > requested on host Host[-4-Routing], but host does not support investigation
    >     > because it has no NFS storage. Skipping investigation.
    >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
    >     > determine host 4 is in Disconnected
    >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
    >     > determined is Disconnected
    >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
    >     > the host is still up: 4-dcp-cscn2.local
    >     >
    >     > I don't understand why it thinks there is no NFS storage as each
    > compute
    >     > node has a dedicated storage NIC.
    >     >
    >     >
    >     > I also don't understand why it thinks the host is still up ie. what
    > test
    >     > is it doing to determine that ?
    >     >
    >     >
    >     > Am I just trying to get something working that is not supported ?
    >     >
    >     >
    >     > ________________________________
    >     > From: Rafael Weingärtner <ra...@gmail.com>
    >     > Sent: 04 June 2018 15:31
    >     > To: users
    >     > Subject: Re: advanced networking with public IPs direct to VMs
    >     >
    >     > What type of failover are you talking about?
    >     > What ACS version are you using?
    >     > What hypervisor are you using?
    >     > How are you configuring your NICs in the hypervisor?
    >     > How are you configuring the traffic labels in ACS?
    >     >
    >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk
    > >
    >     > wrote:
    >     >
    >     > > Hi all
    >     > >
    >     > >
    >     > > I am close to giving up on basic networking as I just cannot get
    > failover
    >     > > working with multiple NICs (I am not even sure it is supported).
    >     > >
    >     > >
    >     > > What I would like is to use 3 NICs for management, storage and
    > guest
    >     > > traffic. I would like to assign public IPs direct to the VMs which
    > is
    >     > why I
    >     > > originally chose basic.
    >     > >
    >     > >
    >     > > If I switch to advanced networking do I just configure a guest VM
    > with
    >     > > public IPs on one NIC and not both with the public traffic -
    >     > >
    >     > >
    >     > > would this work ?
    >     > >
    >     >
    >     >
    >     >
    >     > --
    >     > Rafael Weingärtner
    >     >
    >
    >
    >
    >     --
    >     Rafael Weingärtner
    >
    >
    >


    --
    Rafael Weingärtner



Dag.Sonstebo@shapeblue.com
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS, UK
@shapeblue




Re: advanced networking with public IPs direct to VMs

Posted by Dag Sonstebo <Da...@shapeblue.com>.
Looks OK to me Jon. 

The one thing that throws me is your storage pools – can you rerun your query: select * from cloud.storage_pool where cluster_id = 1;

Do the pools show up as online in the CloudStack GUI?
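
If they do, it may be worth dropping the cluster filter to see where the pool actually landed - a rough sketch, assuming the status and scope columns exist under these names in your schema:

    select id, name, pool_type, status, scope, cluster_id
    from cloud.storage_pool
    where removed is null;

A pool that is Up but registered with scope 'ZONE' (cluster_id NULL) would still leave the per-cluster NFS check in KVMInvestigator empty-handed.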

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 12:08, "Jon Marshall" <jm...@hotmail.co.uk> wrote:

    Don't know whether this helps or not but I logged into the SSVM and ran an ifconfig -
    
    
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
            ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
            RX packets 141  bytes 20249 (19.7 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 108  bytes 16287 (15.9 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
            ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
            RX packets 56722  bytes 4953133 (4.7 MiB)
            RX errors 0  dropped 44573  overruns 0  frame 0
            TX packets 11224  bytes 1234932 (1.1 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
            ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
            RX packets 366191  bytes 435300557 (415.1 MiB)
            RX errors 0  dropped 39456  overruns 0  frame 0
            TX packets 145065  bytes 7978602 (7.6 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
            ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
            RX packets 132440  bytes 426362982 (406.6 MiB)
            RX errors 0  dropped 39446  overruns 0  frame 0
            TX packets 67443  bytes 423670834 (404.0 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
            inet 127.0.0.1  netmask 255.0.0.0
            loop  txqueuelen 1  (Local Loopback)
            RX packets 18  bytes 1440 (1.4 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 18  bytes 1440 (1.4 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    
    so it has interfaces in both the management and the storage subnets (as well as guest).
    
    
    
    ________________________________
    From: Jon Marshall <jm...@hotmail.co.uk>
    Sent: 06 June 2018 11:08
    To: users@cloudstack.apache.org
    Subject: Re: advanced networking with public IPs direct to VMs
    
    Hi Rafael
    
    
    Thanks for the help, really appreciate it.
    
    
    So rerunning that command with all servers up -
    
    
    
    mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
    Empty set (0.00 sec)
    
    mysql>
    
    
    As for the storage IP, no, I'm not setting it to be the management IP when I set up the zone, but the output of the SQL command suggests that is what has happened.
    
    As I said to Dag I am using a different subnet for storage ie.
    
    172.30.3.0/26  - management subnet
    172.30.4.0/25 -  guest VM subnet
    172.30.5.0/28 - storage
    
    the NFS server IP is 172.30.5.2
    
    each compute node has 3 NICs with an IP from each subnet (I am assuming the management node only needs an IP in the management network ?)
    
    When I add the zone in the UI I have one physical network with management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
    When I fill in the storage traffic page I use the range 172.30.5.10 - 14 as free IPs, excluding the ones already allocated to the compute nodes and the NFS server.
    
    I think maybe I am doing something wrong in the UI setup but it is not obvious to me what it is.
    
    What I might try today unless you want me to keep the setup I have for more outputs is to go back to 2 NICs, one for storage/management and one for guest VMs.
    
    I think with the 2-NIC setup the mistake I made last time when adding the zone was to assume storage would just run over management, so I did not drag and drop the storage icon and assign it to cloudbr0 as with the management, which I think is what I should do ?
    
    
    
    
    
    ________________________________
    From: Rafael Weingärtner <ra...@gmail.com>
    Sent: 06 June 2018 10:54
    To: users
    Subject: Re: advanced networking with public IPs direct to VMs
    
    Jon, do not panic, we are here to help you :)
    So, I might have mistyped the SQL query. If you use "select * from
    cloud.storage_pool where cluster_id = 1 and removed is not null", you are
    listing the storage pools that were removed. Therefore, the right query would
    be "select * from cloud.storage_pool where cluster_id = 1 and removed is null".
    
    There is also something else I do not understand. Are you setting the
    storage IP in the management subnet? I am not sure you should be doing it
    like this. Normally, I set all my storages (primary [when working with NFS]
    and secondary) to IPs in the storage subnet.
    
    On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Da...@shapeblue.com>
    wrote:
    
    > Hi Jon,
    >
    > I’m late to this thread and have possibly missed some things – but a
    > couple of observations:
    >
    > “When I add the zone and get to the storage web page I exclude the IPs
    > already used for the compute node NICs and the NFS server itself. …..”
    > “So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10
    > -> 172.30.5.14.”
    >
    > I think you may have some confusion around the use of the storage network.
    > The important part here is to understand this is for *secondary storage*
    > use only – it has nothing to do with primary storage. This means this
    > storage network needs to be accessible to the SSVM, to the hypervisors, and
    > secondary storage NFS pools need to be accessible on this network.
    >
    > The important part – this also means you *can not use the same IP ranges
    > for management and storage networks* - doing so means you will have issues
    > where effectively both hypervisors and SSVM can see the same subnet on two
    > NICs – and you end up in a routing black hole.
    >
    > So – you need to either:
    >
    > 1) Use different IP subnets on management and storage, or
    > 2) preferably just simplify your setup – stop using a secondary storage
    > network altogether and just allow secondary storage to use the management
    > network (which is default). Unless you have a very high I/O environment in
    > production you are just adding complexity by running separate management
    > and storage.
    >
    > Regards,
    > Dag Sonstebo
    > Cloud Architect
    > ShapeBlue
    >
    > On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
    >
    >     I will disconnect the host this morning and test but before I do that
    > I ran this command when all hosts are up -
    >
    >
    >
    >
    >
    >      select * from cloud.host;
    >     +----+-----------------+------------------------------------
    > --+--------+--------------------+--------------------+------
    > -----------+---------------------+--------------------+-----
    > ------------+---------------------+----------------------+--
    > ---------------------+-------------------+------------+-----
    > --------------+-----------------+--------------------+------
    > ------+----------------+--------+-------------+------+------
    > -+-------------------------------------+---------+----------
    > -------+--------------------+------------+----------+-------
    > ---+--------+------------+--------------+-------------------
    > --------------------------------------------+-----------+---
    > ----+-------------+------------+----------------+-----------
    > ----------+---------------------+---------+--------------+--
    > --------------+-------+-------------+--------------+
    >     | id | name            | uuid                                 | status
    > | type               | private_ip_address | private_netmask |
    > private_mac_address | storage_ip_address | storage_netmask |
    > storage_mac_address | storage_ip_address_2 | storage_mac_address_2 |
    > storage_netmask_2 | cluster_id | public_ip_address | public_netmask  |
    > public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets |
    > cpus | speed | url                                 | fs_type |
    > hypervisor_type | hypervisor_version | ram        | resource | version  |
    > parent | total_size | capabilities | guid
    >                         | available | setup | dom0_memory | last_ping  |
    > mgmt_server_id | disconnected        | created             | removed |
    > update_count | resource_state | owner | lastUpdated | engine_state |
    >     +----+-----------------+------------------------------------
    > --+--------+--------------------+--------------------+------
    > -----------+---------------------+--------------------+-----
    > ------------+---------------------+----------------------+--
    > ---------------------+-------------------+------------+-----
    > --------------+-----------------+--------------------+------
    > ------+----------------+--------+-------------+------+------
    > -+-------------------------------------+---------+----------
    > -------+--------------------+------------+----------+-------
    > ---+--------+------------+--------------+-------------------
    > --------------------------------------------+-----------+---
    > ----+-------------+------------+----------------+-----------
    > ----------+---------------------+---------+--------------+--
    > --------------+-------+-------------+--------------+
    >     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up
    >  | Routing            | 172.30.3.3         | 255.255.255.192 |
    > 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
    > 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL
    >             |          1 | 172.30.4.3        | 255.255.255.128 |
    > 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |
    >   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM
    >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
    >    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
    > |         1 |     0 |           0 | 1492390408 |   146457912294 |
    > 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up
    >  | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 |
    > 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 |
    > 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL
    >             |       NULL | 172.30.4.98       | 255.255.255.128 |
    > 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL |
    > NULL |  NULL | NoIqn                               | NULL    | NULL
    >     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
    >  NULL | NULL         | Proxy.2-ConsoleProxyResource
    >           |         1 |     0 |           0 | 1492390409 |   146457912294 |
    > 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up
    >  | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 |
    > 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 |
    > 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL
    >             |       NULL | 172.30.4.86       | 255.255.255.128 |
    > 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL |
    > NULL |  NULL | NoIqn                               | NULL    | NULL
    >     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
    >  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
    >             |         1 |     0 |           0 | 1492390407 |   146457912294
    > | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up
    >  | Routing            | 172.30.3.4         | 255.255.255.192 |
    > 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
    > 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL
    >             |          1 | 172.30.4.4        | 255.255.255.128 |
    > 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |
    >   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM
    >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
    >    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
    > |         1 |     0 |           0 | 1492450882 |   146457912294 |
    > 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up
    >  | Routing            | 172.30.3.5         | 255.255.255.192 |
    > 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
    > 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL
    >             |          1 | 172.30.4.5        | 255.255.255.128 |
    > 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |
    >   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM
    >      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
    >    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
    > |         1 |     0 |           0 | 1492390408 |   146457912294 |
    > 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 |
    > Enabled        | NULL  | NULL        | Disabled     |
    >     +----+-----------------+------------------------------------
    > --+--------+--------------------+--------------------+------
    > -----------+---------------------+--------------------+-----
    > ------------+---------------------+----------------------+--
    > ---------------------+-------------------+------------+-----
    > --------------+-----------------+--------------------+------
    > ------+----------------+--------+-------------+------+------
    > -+-------------------------------------+---------+----------
    > -------+--------------------+------------+----------+-------
    > ---+--------+------------+--------------+-------------------
    > --------------------------------------------+-----------+---
    > ----+-------------+------------+----------------+-----------
    > ----------+---------------------+---------+--------------+--
    > --------------+-------+-------------+--------------+
    >     5 rows in set (0.00 sec)
    >
    >
    >
    >     and you can see that it says the storage IP address is the same as the
    > private IP address (the management network).
    >
    >
    >     I also ran the command you provided using the Cluster ID number from
    > the table above -
    >
    >
    >
    >     mysql> select * from cloud.storage_pool where cluster_id = 1 and
    > removed is not null;
    >     Empty set (0.00 sec)
    >
    >     mysql>
    >
    >     So assuming I am reading this correctly that seems to be the issue.
    >
    >
    >     I am at a loss as to why though.
    >
    >
    >     I have a separate NIC for storage as described. When I add the zone
    > and get to the storage web page I exclude the IPs already used for the
    > compute node NICs and the NFS server itself. I do this because initially I
    > didn't and the SSVM started using the IP address of the NFS server.
    >
    >
    >     So the range is 172.30.5.1 -> 15 and the range I fill in is
    > 172.30.5.10 -> 172.30.5.14.
    >
    >
    >     And I used the label "cloudbr2" for storage.
    >
    >
    >     I must be doing this wrong somehow.
    >
    >
    >     Any pointers would be much appreciated.
    >
    >
    >
    >
    >     ________________________________
    >     From: Rafael Weingärtner <ra...@gmail.com>
    >     Sent: 05 June 2018 16:13
    >     To: users
    >     Subject: Re: advanced networking with public IPs direct to VMs
    >
    >     That is interesting. Let's see the source of all truth...
    >     This is the code that is generating that odd message.
    >
    >     >     List<StoragePoolVO> clusterPools =
    >     > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
    >     >         boolean hasNfs = false;
    >     >         for (StoragePoolVO pool : clusterPools) {
    >     >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem)
    > {
    >     >                 hasNfs = true;
    >     >                 break;
    >     >             }
    >     >         }
    >     >         if (!hasNfs) {
    >     >             s_logger.warn(
    >     >                     "Agent investigation was requested on host " +
    > agent +
    >     > ", but host does not support investigation because it has no NFS
    > storage.
    >     > Skipping investigation.");
    >     >             return Status.Disconnected;
    >     >         }
    >     >
    >
    >     There are two possibilities here. You do not have any NFS storage? Is
    > that
    >     the case? Or maybe, for some reason, the call
    >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
    > returning
    >     any NFS storage pools. Looking at the "listPoolsByCluster" we will see
    >     that the following SQL is used:
    >
    >     Select * from storage_pool where cluster_id = <host'sClusterId> and
    > removed
    >     > is not null
    >     >
    >
    >     Can you run that SQL to see its return when your hosts are marked
    > as
    >     disconnected?
    >
    >
    > Dag.Sonstebo@shapeblue.com
    > www.shapeblue.com<http://www.shapeblue.com>
    
    
    
    > 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
    > @shapeblue
    >
    >
    >
    > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk>
    > wrote:
    >
    >     > I reran the tests with the 3 NIC setup. When I configured the zone
    > through
    >     > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
    >     > traffic and cloudbr2 for NFS as per my original response to you.
    >     >
    >     >
    >     > When I pull the power to the node (dcp-cscn2.local) after about 5
    > mins
    >     > the  host status goes to "Alert" but never to "Down"
    >     >
    >     >
    >     > I get this in the logs -
    >     >
    >     >
    >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation
    > was
    >     > requested on host Host[-4-Routing], but host does not support
    > investigation
    >     > because it has no NFS storage. Skipping investigation.
    >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was
    > able to
    >     > determine host 4 is in Disconnected
    >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host
    > 4 state
    >     > determined is Disconnected
    >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is
    > disconnected but
    >     > the host is still up: 4-dcp-cscn2.local
    >     >
    >     > I don't understand why it thinks there is no NFS storage as each
    > compute
    >     > node has a dedicated storage NIC.
    >     >
    >     >
    >     > I also don't understand why it thinks the host is still up ie. what
    > test
    >     > is it doing to determine that ?
    >     >
    >     >
    >     > Am I just trying to get something working that is not supported ?
    >     >
    >     >
    >     > ________________________________
    >     > From: Rafael Weingärtner <ra...@gmail.com>
    >     > Sent: 04 June 2018 15:31
    >     > To: users
    >     > Subject: Re: advanced networking with public IPs direct to VMs
    >     >
    >     > What type of failover are you talking about?
    >     > What ACS version are you using?
    >     > What hypervisor are you using?
    >     > How are you configuring your NICs in the hypervisor?
    >     > How are you configuring the traffic labels in ACS?
    >     >
    >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk
    > >
    >     > wrote:
    >     >
    >     > > Hi all
    >     > >
    >     > >
    >     > > I am close to giving up on basic networking as I just cannot get
    > failover
    >     > > working with multiple NICs (I am not even sure it is supported).
    >     > >
    >     > >
    >     > > What I would like is to use 3 NICs for management, storage and
    > guest
    >     > > traffic. I would like to assign public IPs direct to the VMs which
    > is
    >     > why I
    >     > > originally chose basic.
    >     > >
    >     > >
    >     > > If I switch to advanced networking do I just configure a guest VM
    > with
    >     > > public IPs on one NIC and not both with the public traffic -
    >     > >
    >     > >
    >     > > would this work ?
    >     > >
    >     >
    >     >
    >     >
    >     > --
    >     > Rafael Weingärtner
    >     >
    >
    >
    >
    >     --
    >     Rafael Weingärtner
    >
    >
    >
    
    
    --
    Rafael Weingärtner
    


Dag.Sonstebo@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Don't know whether this helps or not but I logged into the SSVM and ran an ifconfig -


eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
        ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
        RX packets 141  bytes 20249 (19.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 108  bytes 16287 (15.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
        ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
        RX packets 56722  bytes 4953133 (4.7 MiB)
        RX errors 0  dropped 44573  overruns 0  frame 0
        TX packets 11224  bytes 1234932 (1.1 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
        ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
        RX packets 366191  bytes 435300557 (415.1 MiB)
        RX errors 0  dropped 39456  overruns 0  frame 0
        TX packets 145065  bytes 7978602 (7.6 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
        ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
        RX packets 132440  bytes 426362982 (406.6 MiB)
        RX errors 0  dropped 39446  overruns 0  frame 0
        TX packets 67443  bytes 423670834 (404.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1  (Local Loopback)
        RX packets 18  bytes 1440 (1.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 18  bytes 1440 (1.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


so it has interfaces in both the management and the storage subnets (as well as guest).
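
To cross-check which traffic type each of those NICs was allocated for, something like this should work against the standard cloud schema (joining on the SSVM's vm_instance name, which here is s-1-VM):

mysql> select vm.name, n.ip4_address, n.netmask, nw.traffic_type
    -> from cloud.nics n
    -> join cloud.vm_instance vm on n.instance_id = vm.id
    -> join cloud.networks nw on n.network_id = nw.id
    -> where vm.name = 's-1-VM';

The 172.30.5.14 NIC should come back as traffic_type 'Storage'; if it shows up as 'Management' instead, that would line up with the storage_ip_address column in cloud.host repeating the management IP.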



________________________________
From: Jon Marshall <jm...@hotmail.co.uk>
Sent: 06 June 2018 11:08
To: users@cloudstack.apache.org
Subject: Re: advanced networking with public IPs direct to VMs

Hi Rafael


Thanks for the help, really appreciate it.


So rerunning that command with all servers up -



mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
Empty set (0.00 sec)

mysql>


As for the storage IP, no, I'm not setting it to be the management IP when I set up the zone, but the output of the SQL command suggests that is what has happened.

As I said to Dag I am using a different subnet for storage ie.

172.30.3.0/26  - management subnet
172.30.4.0/25 -  guest VM subnet
172.30.5.0/28 - storage

the NFS server IP is 172.30.5.2

each compute node has 3 NICs with an IP from each subnet (I am assuming the management node only needs an IP in the management network ?)

When I add the zone in the UI I have one physical network with management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
When I fill in the storage traffic page I use the range 172.30.5.10 - 14 as free IPs, excluding the ones already allocated to the compute nodes and the NFS server.

I think maybe I am doing something wrong in the UI setup but it is not obvious to me what it is.

What I might try today unless you want me to keep the setup I have for more outputs is to go back to 2 NICs, one for storage/management and one for guest VMs.

I think with the 2-NIC setup the mistake I made last time when adding the zone was to assume storage would just run over management, so I did not drag and drop the storage icon and assign it to cloudbr0 as with the management, which I think is what I should do ?





________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 06 June 2018 10:54
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Jon, do not panic, we are here to help you :)
So, I might have mistyped the SQL query. If you use "select * from
cloud.storage_pool where cluster_id = 1 and removed is not null", you are
listing the storage pools that were removed. Therefore, the right query would
be "select * from cloud.storage_pool where cluster_id = 1 and removed is null".

There is also something else I do not understand. Are you setting the
storage IP in the management subnet? I am not sure you should be doing it
like this. Normally, I set all my storages (primary [when working with NFS]
and secondary) to IPs in the storage subnet.

On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Da...@shapeblue.com>
wrote:

> Hi Jon,
>
> I’m late to this thread and have possibly missed some things – but a
> couple of observations:
>
> “When I add the zone and get to the storage web page I exclude the IPs
> already used for the compute node NICs and the NFS server itself. …..”
> “So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10
> -> 172.30.5.14.”
>
> I think you may have some confusion around the use of the storage network.
> The important part here is to understand this is for *secondary storage*
> use only – it has nothing to do with primary storage. This means this
> storage network needs to be accessible to the SSVM, to the hypervisors, and
> secondary storage NFS pools need to be accessible on this network.
>
> The important part – this also means you *can not use the same IP ranges
> for management and storage networks* - doing so means you will have issues
> where effectively both hypervisors and SSVM can see the same subnet on two
> NICs – and you end up in a routing black hole.
>
> So – you need to either:
>
> 1) Use different IP subnets on management and storage, or
> 2) preferably just simplify your setup – stop using a secondary storage
> network altogether and just allow secondary storage to use the management
> network (which is default). Unless you have a very high I/O environment in
> production you are just adding complexity by running separate management
> and storage.
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
>
>     I will disconnect the host this morning and test but before I do that
> I ran this command when all hosts are up -
>
>
>
>
>
>      select * from cloud.host;
>     +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
>     | id | name            | uuid                                 | status
> | type               | private_ip_address | private_netmask |
> private_mac_address | storage_ip_address | storage_netmask |
> storage_mac_address | storage_ip_address_2 | storage_mac_address_2 |
> storage_netmask_2 | cluster_id | public_ip_address | public_netmask  |
> public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets |
> cpus | speed | url                                 | fs_type |
> hypervisor_type | hypervisor_version | ram        | resource | version  |
> parent | total_size | capabilities | guid
>                         | available | setup | dom0_memory | last_ping  |
> mgmt_server_id | disconnected        | created             | removed |
> update_count | resource_state | owner | lastUpdated | engine_state |
>     +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
>     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up
>  | Routing            | 172.30.3.3         | 255.255.255.192 |
> 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
> 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.3        | 255.255.255.128 |
> 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |
>   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
> |         1 |     0 |           0 | 1492390408 |   146457912294 |
> 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 |
> Enabled        | NULL  | NULL        | Disabled     |
>     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up
>  | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 |
> 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 |
> 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL
>             |       NULL | 172.30.4.98       | 255.255.255.128 |
> 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL |
> NULL |  NULL | NoIqn                               | NULL    | NULL
>     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
>  NULL | NULL         | Proxy.2-ConsoleProxyResource
>           |         1 |     0 |           0 | 1492390409 |   146457912294 |
> 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 |
> Enabled        | NULL  | NULL        | Disabled     |
>     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up
>  | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 |
> 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 |
> 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL
>             |       NULL | 172.30.4.86       | 255.255.255.128 |
> 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL |
> NULL |  NULL | NoIqn                               | NULL    | NULL
>     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
>  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
>             |         1 |     0 |           0 | 1492390407 |   146457912294
> | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 |
> Enabled        | NULL  | NULL        | Disabled     |
>     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up
>  | Routing            | 172.30.3.4         | 255.255.255.192 |
> 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
> 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.4        | 255.255.255.128 |
> 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |
>   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
> |         1 |     0 |           0 | 1492450882 |   146457912294 |
> 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 |
> Enabled        | NULL  | NULL        | Disabled     |
>     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up
>  | Routing            | 172.30.3.5         | 255.255.255.192 |
> 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
> 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.5        | 255.255.255.128 |
> 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |
>   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
> |         1 |     0 |           0 | 1492390408 |   146457912294 |
> 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 |
> Enabled        | NULL  | NULL        | Disabled     |
>     +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
>     5 rows in set (0.00 sec)
>
>
>
>     and you can see that it says the storage IP address is the same as the
> private IP address (the management network).
>
>
>     I also ran the command you provided using the Cluster ID number from
> the table above -
>
>
>
>     mysql> select * from cloud.storage_pool where cluster_id = 1 and
> removed is not null;
>     Empty set (0.00 sec)
>
>     mysql>
>
>     So assuming I am reading this correctly that seems to be the issue.
>
>
>     I am at a loss as to why though.
>
>
>     I have a separate NIC for storage as described. When I add the zone
> and get to the storage web page I exclude the IPs already used for the
> compute node NICs and the NFS server itself. I do this because initially I
> didn't and the SSVM started using the IP address of the NFS server.
>
>
>     So the range is 172.30.5.1 -> 15 and the range I fill in is
> 172.30.5.10 -> 172.30.5.14.
>
>
>     And I used the label "cloudbr2" for storage.
>
>
>     I must be doing this wrong somehow.
>
>
>     Any pointers would be much appreciated.
>
>
>
>
>     ________________________________
>     From: Rafael Weingärtner <ra...@gmail.com>
>     Sent: 05 June 2018 16:13
>     To: users
>     Subject: Re: advanced networking with public IPs direct to VMs
>
>     That is interesting. Let's see the source of all truth...
>     This is the code that is generating that odd message.
>
>     >     List<StoragePoolVO> clusterPools =
>     > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
>     >         boolean hasNfs = false;
>     >         for (StoragePoolVO pool : clusterPools) {
>     >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem)
> {
>     >                 hasNfs = true;
>     >                 break;
>     >             }
>     >         }
>     >         if (!hasNfs) {
>     >             s_logger.warn(
>     >                     "Agent investigation was requested on host " +
> agent +
>     > ", but host does not support investigation because it has no NFS
> storage.
>     > Skipping investigation.");
>     >             return Status.Disconnected;
>     >         }
>     >
>
>     There are two possibilities here. You do not have any NFS storage? Is
> that
>     the case? Or maybe, for some reason, the call
>     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> returning
>     any NFS storage pools. Looking at the "listPoolsByCluster" we will see
>     that the following SQL is used:
>
>     Select * from storage_pool where cluster_id = <host'sClusterId> and
> removed
>     > is not null
>     >
>
>     Can you run that SQL to see its return when your hosts are marked
> as
>     disconnected?
>
>
> Dag.Sonstebo@shapeblue.com
> www.shapeblue.com<http://www.shapeblue.com>



> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
> On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
>     > I reran the tests with the 3 NIC setup. When I configured the zone
> through
>     > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
>     > traffic and cloudbr2 for NFS as per my original response to you.
>     >
>     >
>     > When I pull the power to the node (dcp-cscn2.local) after about 5
> mins
>     > the  host status goes to "Alert" but never to "Down"
>     >
>     >
>     > I get this in the logs -
>     >
>     >
>     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation
> was
>     > requested on host Host[-4-Routing], but host does not support
> investigation
>     > because it has no NFS storage. Skipping investigation.
>     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was
> able to
>     > determine host 4 is in Disconnected
>     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host
> 4 state
>     > determined is Disconnected
>     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is
> disconnected but
>     > the host is still up: 4-dcp-cscn2.local
>     >
>     > I don't understand why it thinks there is no NFS storage as each
> compute
>     > node has a dedicated storage NIC.
>     >
>     >
>     > I also don't understand why it thinks the host is still up ie. what
> test
>     > is it doing to determine that ?
>     >
>     >
>     > Am I just trying to get something working that is not supported ?
>     >
>     >
>     > ________________________________
>     > From: Rafael Weingärtner <ra...@gmail.com>
>     > Sent: 04 June 2018 15:31
>     > To: users
>     > Subject: Re: advanced networking with public IPs direct to VMs
>     >
>     > What type of failover are you talking about?
>     > What ACS version are you using?
>     > What hypervisor are you using?
>     > How are you configuring your NICs in the hypervisor?
>     > How are you configuring the traffic labels in ACS?
>     >
>     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk
> >
>     > wrote:
>     >
>     > > Hi all
>     > >
>     > >
>     > > I am close to giving up on basic networking as I just cannot get
> failover
>     > > working with multiple NICs (I am not even sure it is supported).
>     > >
>     > >
>     > > What I would like is to use 3 NICs for management, storage and
> guest
>     > > traffic. I would like to assign public IPs direct to the VMs which
> is
>     > why I
>     > > originally chose basic.
>     > >
>     > >
>     > > If I switch to advanced networking do I just configure a guest VM
> with
>     > > public IPs on one NIC and not both with the public traffic -
>     > >
>     > >
>     > > would this work ?
>     > >
>     >
>     >
>     >
>     > --
>     > Rafael Weingärtner
>     >
>
>
>
>     --
>     Rafael Weingärtner
>
>
>


--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Hi Rafael


Thanks for the help, really appreciate it.


So rerunning that command with all servers up -



mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
Empty set (0.00 sec)

mysql>


As for the storage IP, no, I'm not setting it to be the management IP when I set up the zone, but the output of the SQL command suggests that is what has happened.

As I said to Dag I am using a different subnet for storage ie.

172.30.3.0/26  - management subnet
172.30.4.0/25 -  guest VM subnet
172.30.5.0/28 - storage

the NFS server IP is 172.30.5.2

each compute node has 3 NICs with an IP from each subnet (I am assuming the management node only needs an IP in the management network ?)

When I add the zone in the UI I have one physical network with management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
When I fill in the storage traffic page I use the range 172.30.5.10 - 14 as free IPs, excluding the ones already allocated to the compute nodes and the NFS server.
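
A way to verify what the zone wizard actually recorded for those labels, as a sketch against the standard cloud schema (table and column names taken from 4.x, so double-check on your version):

mysql> select traffic_type, kvm_network_label
    -> from cloud.physical_network_traffic_types;

Management, Guest and Storage should come back mapped to cloudbr0, cloudbr1 and cloudbr2 respectively; a missing or mislabelled Storage row would explain storage traffic falling back onto the management NIC.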

I think maybe I am doing something wrong in the UI setup but it is not obvious to me what it is.

What I might try today unless you want me to keep the setup I have for more outputs is to go back to 2 NICs, one for storage/management and one for guest VMs.

I think with the 2-NIC setup the mistake I made last time when adding the zone was to assume storage would just run over management, so I did not drag and drop the storage icon and assign it to cloudbr0 as with the management, which I think is what I should do ?





________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 06 June 2018 10:54
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Jon, do not panic, we are here to help you :)
So, I might have mistyped the SQL query. If you use "select * from
cloud.storage_pool where cluster_id = 1 and removed is not null", you are
listing the storage pools that were removed. Therefore, the right query would
be "select * from cloud.storage_pool where cluster_id = 1 and removed is null".

There is also something else I do not understand. Are you setting the
storage IP in the management subnet? I am not sure you should be doing it
like this. Normally, I set all my storages (primary [when working with NFS]
and secondary) to IPs in the storage subnet.

On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Da...@shapeblue.com>
wrote:

> Hi Jon,
>
> I’m late to this thread and have possibly missed some things – but a
> couple of observations:
>
> “When I add the zone and get to the storage web page I exclude the IPs
> already used for the compute node NICs and the NFS server itself. …..”
> “So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10
> -> 172.30.5.14.”
>
> I think you may have some confusion around the use of the storage network.
> The important part here is to understand this is for *secondary storage*
> use only – it has nothing to do with primary storage. This means this
> storage network needs to be accessible to the SSVM, to the hypervisors, and
> secondary storage NFS pools need to be accessible on this network.
>
> The important part – this also means you *can not use the same IP ranges
> for management and storage networks* - doing so means you will have issues
> where effectively both hypervisors and SSVM can see the same subnet on two
> NICs – and you end up in a routing black hole.
>
> So – you need to either:
>
> 1) Use different IP subnets on management and storage, or
> 2) preferably just simplify your setup – stop using a secondary storage
> network altogether and just allow secondary storage to use the management
> network (which is default). Unless you have a very high I/O environment in
> production you are just adding complexity by running separate management
> and storage.
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
>
>     I will disconnect the host this morning and test but before I do that
> I ran this command when all hosts are up -
>
>
>
>
>
>      select * from cloud.host;
>     +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
>     | id | name            | uuid                                 | status
> | type               | private_ip_address | private_netmask |
> private_mac_address | storage_ip_address | storage_netmask |
> storage_mac_address | storage_ip_address_2 | storage_mac_address_2 |
> storage_netmask_2 | cluster_id | public_ip_address | public_netmask  |
> public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets |
> cpus | speed | url                                 | fs_type |
> hypervisor_type | hypervisor_version | ram        | resource | version  |
> parent | total_size | capabilities | guid
>                         | available | setup | dom0_memory | last_ping  |
> mgmt_server_id | disconnected        | created             | removed |
> update_count | resource_state | owner | lastUpdated | engine_state |
>     +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
>     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up
>  | Routing            | 172.30.3.3         | 255.255.255.192 |
> 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 |
> 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.3        | 255.255.255.128 |
> 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |
>   2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource
> |         1 |     0 |           0 | 1492390408 |   146457912294 |
> 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 |
> Enabled        | NULL  | NULL        | Disabled     |
>     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up
>  | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 |
> 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 |
> 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL
>             |       NULL | 172.30.4.98       | 255.255.255.128 |
> 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL |
> NULL |  NULL | NoIqn                               | NULL    | NULL
>     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
>  NULL | NULL         | Proxy.2-ConsoleProxyResource
>           |         1 |     0 |           0 | 1492390409 |   146457912294 |
> 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 |
> Enabled        | NULL  | NULL        | Disabled     |
>     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up
>  | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 |
> 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 |
> 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL
>             |       NULL | 172.30.4.86       | 255.255.255.128 |
> 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL |
> NULL |  NULL | NoIqn                               | NULL    | NULL
>     | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |
>  NULL | NULL         | s-1-VM-NfsSecondaryStorageResource
>             |         1 |     0 |           0 | 1492390407 |   146457912294
> | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 |
> Enabled        | NULL  | NULL        | Disabled     |
>     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up
>  | Routing            | 172.30.3.4         | 255.255.255.192 |
> 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 |
> 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.4        | 255.255.255.128 |
> 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |
>   2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource
> |         1 |     0 |           0 | 1492450882 |   146457912294 |
> 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 |
> Enabled        | NULL  | NULL        | Disabled     |
>     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up
>  | Routing            | 172.30.3.5         | 255.255.255.192 |
> 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 |
> 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL
>             |          1 | 172.30.4.5        | 255.255.255.128 |
> 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |
>   2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM
>      | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |
>    NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource
> |         1 |     0 |           0 | 1492390408 |   146457912294 |
> 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 |
> Enabled        | NULL  | NULL        | Disabled     |
>     +----+-----------------+------------------------------------
> --+--------+--------------------+--------------------+------
> -----------+---------------------+--------------------+-----
> ------------+---------------------+----------------------+--
> ---------------------+-------------------+------------+-----
> --------------+-----------------+--------------------+------
> ------+----------------+--------+-------------+------+------
> -+-------------------------------------+---------+----------
> -------+--------------------+------------+----------+-------
> ---+--------+------------+--------------+-------------------
> --------------------------------------------+-----------+---
> ----+-------------+------------+----------------+-----------
> ----------+---------------------+---------+--------------+--
> --------------+-------+-------------+--------------+
>     5 rows in set (0.00 sec)
>
>
>
>     and you can see that it says the storage IP address is the same as the
> private IP address (the management network).
>
>
>     I also ran the command you provided using the Cluster ID number from
> the table above -
>
>
>
>     mysql> select * from cloud.storage_pool where cluster_id = 1 and
> removed is not null;
>     Empty set (0.00 sec)
>
>     mysql>
>
>     So assuming I am reading this correctly that seems to be the issue.
>
>
>     I am at a loss as to why though.
>
>
>     I have a separate NIC for storage as described. When I add the zone
> and get to the storage web page I exclude the IPs already used for the
> compute node NICs and the NFS server itself. I do this because initially I
> didn't and the SSVM started using the IP address of the NFS server.
>
>
>     So the range is 172.30.5.1 -> 15 and the range I fill in is
> 172.30.5.10 -> 172.30.5.14.
>
>
>     And I used the label "cloudbr2" for storage.
>
>
>     I must be doing this wrong somehow.
>
>
>     Any pointers would be much appreciated.
>
>
>
>
>     ________________________________
>     From: Rafael Weingärtner <ra...@gmail.com>
>     Sent: 05 June 2018 16:13
>     To: users
>     Subject: Re: advanced networking with public IPs direct to VMs
>
>     That is interesting. Let's see the source of all truth...
>     This is the code that is generating that odd message.
>
>     >     List<StoragePoolVO> clusterPools =
>     > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
>     >         boolean hasNfs = false;
>     >         for (StoragePoolVO pool : clusterPools) {
>     >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem)
> {
>     >                 hasNfs = true;
>     >                 break;
>     >             }
>     >         }
>     >         if (!hasNfs) {
>     >             s_logger.warn(
>     >                     "Agent investigation was requested on host " +
> agent +
>     > ", but host does not support investigation because it has no NFS
> storage.
>     > Skipping investigation.");
>     >             return Status.Disconnected;
>     >         }
>     >
>
>     There are two possibilities here. You do not have any NFS storage? Is
> that
>     the case? Or maybe, for some reason, the call
>     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
> returning
>     any NFS storage pools. Looking at the "listPoolsByCluster" we will see
>     that the following SQL is used:
>
>     Select * from storage_pool where cluster_id = <host'sClusterId> and
> removed
>     > is not null
>     >
>
>     Can you run that SQL to see the its return when your hosts are marked
> as
>     disconnected?
>
>
> Dag.Sonstebo@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
> On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
>     > I reran the tests with the 3 NIC setup. When I configured the zone
> through
>     > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
>     > traffic and cloudbr2 for NFS as per my original response to you.
>     >
>     >
>     > When I pull the power to the node (dcp-cscn2.local) after about 5
> mins
>     > the  host status goes to "Alert" but never to "Down"
>     >
>     >
>     > I get this in the logs -
>     >
>     >
>     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation
> was
>     > requested on host Host[-4-Routing], but host does not support
> investigation
>     > because it has no NFS storage. Skipping investigation.
>     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was
> able to
>     > determine host 4 is in Disconnected
>     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host
> 4 state
>     > determined is Disconnected
>     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is
> disconnected but
>     > the host is still up: 4-dcp-cscn2.local
>     >
>     > I don't understand why it thinks there is no NFS storage as each
> compute
>     > node has a dedicated storage NIC.
>     >
>     >
>     > I also don't understand why it thinks the host is still up ie. what
> test
>     > is it doing to determine that ?
>     >
>     >
>     > Am I just trying to get something working that is not supported ?
>     >
>     >
>     > ________________________________
>     > From: Rafael Weingärtner <ra...@gmail.com>
>     > Sent: 04 June 2018 15:31
>     > To: users
>     > Subject: Re: advanced networking with public IPs direct to VMs
>     >
>     > What type of failover are you talking about?
>     > What ACS version are you using?
>     > What hypervisor are you using?
>     > How are you configuring your NICs in the hypervisor?
>     > How are you configuring the traffic labels in ACS?
>     >
>     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk
> >
>     > wrote:
>     >
>     > > Hi all
>     > >
>     > >
>     > > I am close to giving up on basic networking as I just cannot get
> failover
>     > > working with multiple NICs (I am not even sure it is supported).
>     > >
>     > >
>     > > What I would like is to use 3 NICs for management, storage and
> guest
>     > > traffic. I would like to assign public IPs direct to the VMs which
> is
>     > why I
>     > > originally chose basic.
>     > >
>     > >
>     > > If I switch to advanced networking do I just configure a guest VM
> with
>     > > public IPs on one NIC and not both with the public traffic -
>     > >
>     > >
>     > > would this work ?
>     > >
>     >
>     >
>     >
>     > --
>     > Rafael Weingärtner
>     >
>
>
>
>     --
>     Rafael Weingärtner
>
>
>


--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
Jon, do not panic, we are here to help you :)
So, I might have mistyped the SQL query. If you use "select * from
cloud.storage_pool where cluster_id = 1 and removed is not null", you are
listing the storage pools that have been removed. Therefore, the right query
would be "select * from cloud.storage_pool where cluster_id = 1 and removed
is null".
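
As a quick illustration (a sketch only; cluster id 1 comes from the
cloud.host output earlier in the thread, and the column list is assumed from
the 4.11 schema):

    -- active (not removed) storage pools of cluster 1; the investigator
    -- needs at least one row with pool_type = 'NetworkFilesystem' here
    SELECT id, name, pool_type, scope, cluster_id
    FROM cloud.storage_pool
    WHERE cluster_id = 1
      AND removed IS NULL;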

There is also something else I do not understand. You are setting the
storage IP in the management subnet? I am not sure you should be doing it
like this. Normally, I set all my storages (primary [when working with NFS]
and secondary) to IPs in the storage subnet.

On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Da...@shapeblue.com>
wrote:

> Hi Jon,
>
> I’m late to this thread and have possibly missed some things – but a
> couple of observations:
>
> “When I add the zone and get to the storage web page I exclude the IPs
> already used for the compute node NICs and the NFS server itself. …..”
> “So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10
> -> 172.30.5.14.”
>
> I think you may have some confusion around the use of the storage network.
> The important part here is to understand this is for *secondary storage*
> use only – it has nothing to do with primary storage. This means this
> storage network needs to be accessible to the SSVM, to the hypervisors, and
> secondary storage NFS pools need to be accessible on this network.
>
> The important part – this also means you *can not use the same IP ranges
> for management and storage networks* - doing so means you will have issues
> where effectively both hypervisors and SSVM can see the same subnet on two
> NICs – and you end up in a routing black hole.
>
> So – you need to either:
>
> 1) Use different IP subnets on management and storage, or
> 2) preferably just simplify your setup – stop using a secondary storage
> network altogether and just allow secondary storage to use the management
> network (which is the default). Unless you have a very high I/O environment in
> production you are just adding complexity by running separate management
> and storage.
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk> wrote:
>
>     I will disconnect the host this morning and test but before I do that
> I ran this command when all hosts are up -
>
>
>
>
>
>      select * from cloud.host;
>     +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
>     | id | name            | uuid                                 | status | type               | private_ip_address | private_netmask | private_mac_address | storage_ip_address | storage_netmask | storage_mac_address | storage_ip_address_2 | storage_mac_address_2 | storage_netmask_2 | cluster_id | public_ip_address | public_netmask  | public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets | cpus | speed | url                                 | fs_type | hypervisor_type | hypervisor_version | ram        | resource | version  | parent | total_size | capabilities | guid                                                          | available | setup | dom0_memory | last_ping  | mgmt_server_id | disconnected        | created             | removed | update_count | resource_state | owner | lastUpdated | engine_state |
>     +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
>     |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up     | Routing            | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.3        | 255.255.255.128 | 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
>     |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up     | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.98       | 255.255.255.128 | 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | Proxy.2-ConsoleProxyResource                                  |         1 |     0 |           0 | 1492390409 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
>     |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up     | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.86       | 255.255.255.128 | 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | s-1-VM-NfsSecondaryStorageResource                            |         1 |     0 |           0 | 1492390407 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
>     |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up     | Routing            | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.4        | 255.255.255.128 | 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource |         1 |     0 |           0 | 1492450882 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 | Enabled        | NULL  | NULL        | Disabled     |
>     |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up     | Routing            | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.5        | 255.255.255.128 | 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 | Enabled        | NULL  | NULL        | Disabled     |
>     +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
>     5 rows in set (0.00 sec)
>
>
>
>     and you can see that it says the storage IP address is the same as the
> private IP address (the management network).
>
>
>     I also ran the command you provided using the Cluster ID number from
> the table above -
>
>
>
>     mysql> select * from cloud.storage_pool where cluster_id = 1 and
> removed is not null;
>     Empty set (0.00 sec)
>
>     mysql>
>
>     So assuming I am reading this correctly that seems to be the issue.
>
>
>     I am at a loss as to why though.
>
>
>     I have a separate NIC for storage as described. When I add the zone
> and get to the storage web page I exclude the IPs already used for the
> compute node NICs and the NFS server itself. I do this because initially I
> didn't and the SSVM started using the IP address of the NFS server.
>
>
>     So the range is 172.30.5.1 -> 15 and the range I fill in is
> 172.30.5.10 -> 172.30.5.14.
>
>
>     And I used the label "cloudbr2" for storage.
>
>
>     I must be doing this wrong somehow.
>
>
>     Any pointers would be much appreciated.
>
>
>
>
>     ________________________________
>     From: Rafael Weingärtner <ra...@gmail.com>
>     Sent: 05 June 2018 16:13
>     To: users
>     Subject: Re: advanced networking with public IPs direct to VMs
>
>     That is interesting. Let's see the source of all truth...
>     This is the code that is generating that odd message.
>
>     >     List<StoragePoolVO> clusterPools =
>     > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
>     >         boolean hasNfs = false;
>     >         for (StoragePoolVO pool : clusterPools) {
>     >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
>     >                 hasNfs = true;
>     >                 break;
>     >             }
>     >         }
>     >         if (!hasNfs) {
>     >             s_logger.warn(
>     >                     "Agent investigation was requested on host " + agent +
>     > ", but host does not support investigation because it has no NFS storage.
>     > Skipping investigation.");
>     >             return Status.Disconnected;
>     >         }
>     >
>
>     There are two possibilities here. You do not have any NFS storage? Is that
>     the case? Or maybe, for some reason, the call
>     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
>     any NFS storage pools. Looking at the "listPoolsByCluster" we will see
>     that the following SQL is used:
>
>     Select * from storage_pool where cluster_id = <host'sClusterId> and removed
>     > is not null
>     >
>
>     Can you run that SQL to see what it returns when your hosts are marked as
>     disconnected?
>
>
> Dag.Sonstebo@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
> On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
>     > I reran the tests with the 3 NIC setup. When I configured the zone
> through
>     > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
>     > traffic and cloudbr2 for NFS as per my original response to you.
>     >
>     >
>     > When I pull the power to the node (dcp-cscn2.local) after about 5
> mins
>     > the  host status goes to "Alert" but never to "Down"
>     >
>     >
>     > I get this in the logs -
>     >
>     >
>     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation
> was
>     > requested on host Host[-4-Routing], but host does not support
> investigation
>     > because it has no NFS storage. Skipping investigation.
>     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was
> able to
>     > determine host 4 is in Disconnected
>     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host
> 4 state
>     > determined is Disconnected
>     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
>     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is
> disconnected but
>     > the host is still up: 4-dcp-cscn2.local
>     >
>     > I don't understand why it thinks there is no NFS storage as each
> compute
>     > node has a dedicated storage NIC.
>     >
>     >
>     > I also don't understand why it thinks the host is still up ie. what
> test
>     > is it doing to determine that ?
>     >
>     >
>     > Am I just trying to get something working that is not supported ?
>     >
>     >
>     > ________________________________
>     > From: Rafael Weingärtner <ra...@gmail.com>
>     > Sent: 04 June 2018 15:31
>     > To: users
>     > Subject: Re: advanced networking with public IPs direct to VMs
>     >
>     > What type of failover are you talking about?
>     > What ACS version are you using?
>     > What hypervisor are you using?
>     > How are you configuring your NICs in the hypervisor?
>     > How are you configuring the traffic labels in ACS?
>     >
>     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@hotmail.co.uk
> >
>     > wrote:
>     >
>     > > Hi all
>     > >
>     > >
>     > > I am close to giving up on basic networking as I just cannot get
> failover
>     > > working with multiple NICs (I am not even sure it is supported).
>     > >
>     > >
>     > > What I would like is to use 3 NICs for management, storage and
> guest
>     > > traffic. I would like to assign public IPs direct to the VMs which
> is
>     > why I
>     > > originally chose basic.
>     > >
>     > >
>     > > If I switch to advanced networking do I just configure a guest VM
> with
>     > > public IPs on one NIC and not both with the public traffic -
>     > >
>     > >
>     > > would this work ?
>     > >
>     >
>     >
>     >
>     > --
>     > Rafael Weingärtner
>     >
>
>
>
>     --
>     Rafael Weingärtner
>
>
>


-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Dag Sonstebo <Da...@shapeblue.com>.
Hi Jon,

I’m late to this thread and have possibly missed some things – but a couple of observations:

“When I add the zone and get to the storage web page I exclude the IPs already used for the compute node NICs and the NFS server itself. …..”
“So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10 -> 172.30.5.14.”

I think you may have some confusion around the use of the storage network. The important part here is to understand this is for *secondary storage* use only – it has nothing to do with primary storage. This means this storage network needs to be accessible to the SSVM, to the hypervisors, and secondary storage NFS pools need to be accessible on this network.
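
If in doubt, one way to sanity check which traffic type each SSVM NIC landed on is a query along these lines (a sketch only – the cloud.nics/cloud.networks join is an assumption, and the instance id is hypothetical; look the SSVM up in cloud.vm_instance first):

    -- IP, netmask and traffic type of every active NIC on the SSVM
    SELECT n.ip4_address, n.netmask, nw.traffic_type
    FROM cloud.nics n
    JOIN cloud.networks nw ON nw.id = n.network_id
    WHERE n.instance_id = 1   -- hypothetical id of s-1-VM
      AND n.removed IS NULL;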

The important part – this also means you *can not use the same IP ranges for management and storage networks* - doing so means you will have issues where effectively both hypervisors and SSVM can see the same subnet on two NICs – and you end up in a routing black hole.

So – you need to either:

1) Use different IP subnets on management and storage, or
2) preferably just simplify your setup – stop using a secondary storage network altogether and just allow secondary storage to use the management network (which is the default). Unless you have a very high I/O environment in production you are just adding complexity by running separate management and storage.

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 10:18, "Jon Marshall" <jm...@hotmail.co.uk> wrote:

    I will disconnect the host this morning and test but before I do that I ran this command when all hosts are up -
    
    
    
    
    
     select * from cloud.host;
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    | id | name            | uuid                                 | status | type               | private_ip_address | private_netmask | private_mac_address | storage_ip_address | storage_netmask | storage_mac_address | storage_ip_address_2 | storage_mac_address_2 | storage_netmask_2 | cluster_id | public_ip_address | public_netmask  | public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets | cpus | speed | url                                 | fs_type | hypervisor_type | hypervisor_version | ram        | resource | version  | parent | total_size | capabilities | guid                                                          | available | setup | dom0_memory | last_ping  | mgmt_server_id | disconnected        | created             | removed | update_count | resource_state | owner | lastUpdated | engine_state |
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up     | Routing            | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.3        | 255.255.255.128 | 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
    |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up     | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.98       | 255.255.255.128 | 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | Proxy.2-ConsoleProxyResource                                  |         1 |     0 |           0 | 1492390409 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
    |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up     | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.86       | 255.255.255.128 | 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | s-1-VM-NfsSecondaryStorageResource                            |         1 |     0 |           0 | 1492390407 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
    |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up     | Routing            | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.4        | 255.255.255.128 | 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource |         1 |     0 |           0 | 1492450882 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 | Enabled        | NULL  | NULL        | Disabled     |
    |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up     | Routing            | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.5        | 255.255.255.128 | 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 | Enabled        | NULL  | NULL        | Disabled     |
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    5 rows in set (0.00 sec)
    
    
    
    and you can see that it says the storage IP address is the same as the private IP address (the management network).
    
    
    I also ran the command you provided using the Cluster ID number from the table above -
    
    
    
    mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
    Empty set (0.00 sec)
    
    mysql>
    
    So assuming I am reading this correctly that seems to be the issue.
    
    
    I am at a loss as to why though.
    
    
    I have a separate NIC for storage as described. When I add the zone and get to the storage web page I exclude the IPs already used for the compute node NICs and the NFS server itself. I do this because initially I didn't and the SSVM started using the IP address of the NFS server.
    
    
    So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10 -> 172.30.5.14.
    
    
    And I used the label "cloudbr2" for storage.
    
    
    I must be doing this wrong somehow.
    
    
    Any pointers would be much appreciated.
    
    
    
    
    ________________________________
    From: Rafael Weingärtner <ra...@gmail.com>
    Sent: 05 June 2018 16:13
    To: users
    Subject: Re: advanced networking with public IPs direct to VMs
    
    That is interesting. Let's see the source of all truth...
    This is the code that is generating that odd message.
    
    >     List<StoragePoolVO> clusterPools =
    > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
    >         boolean hasNfs = false;
    >         for (StoragePoolVO pool : clusterPools) {
    >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
    >                 hasNfs = true;
    >                 break;
    >             }
    >         }
    >         if (!hasNfs) {
    >             s_logger.warn(
    >                     "Agent investigation was requested on host " + agent +
    > ", but host does not support investigation because it has no NFS storage.
    > Skipping investigation.");
    >             return Status.Disconnected;
    >         }
    >
    
    There are two possibilities here. You do not have any NFS storage? Is that
    the case? Or maybe, for some reason, the call
    "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
    any NFS storage pools. Looking at the "listPoolsByCluster" we will see
    that the following SQL is used:
    
    Select * from storage_pool where cluster_id = <host'sClusterId> and removed
    > is not null
    >
    
    Can you run that SQL to see what it returns when your hosts are marked as
    disconnected?
    
    
Dag.Sonstebo@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 

On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:
    
    > I reran the tests with the 3 NIC setup. When I configured the zone through
    > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
    > traffic and cloudbr2 for NFS as per my original response to you.
    >
    >
    > When I pull the power to the node (dcp-cscn2.local) after about 5 mins
    > the  host status goes to "Alert" but never to "Down"
    >
    >
    > I get this in the logs -
    >
    >
    > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
    > requested on host Host[-4-Routing], but host does not support investigation
    > because it has no NFS storage. Skipping investigation.
    > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
    > determine host 4 is in Disconnected
    > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
    > determined is Disconnected
    > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
    > the host is still up: 4-dcp-cscn2.local
    >
    > I don't understand why it thinks there is no NFS storage as each compute
    > node has a dedicated storage NIC.
    >
    >
    > I also don't understand why it thinks the host is still up ie. what test
    > is it doing to determine that ?
    >
    >
    > Am I just trying to get something working that is not supported ?
    >
    >
    > ________________________________
    > From: Rafael Weingärtner <ra...@gmail.com>
    > Sent: 04 June 2018 15:31
    > To: users
    > Subject: Re: advanced networking with public IPs direct to VMs
    >
    > What type of failover are you talking about?
    > What ACS version are you using?
    > What hypervisor are you using?
    > How are you configuring your NICs in the hypervisor?
    > How are you configuring the traffic labels in ACS?
    >
    > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
    > wrote:
    >
    > > Hi all
    > >
    > >
    > > I am close to giving up on basic networking as I just cannot get failover
    > > working with multiple NICs (I am not even sure it is supported).
    > >
    > >
    > > What I would like is to use 3 NICs for management, storage and guest
    > > traffic. I would like to assign public IPs direct to the VMs which is
    > why I
    > > originally chose basic.
    > >
    > >
    > > If I switch to advanced networking do I just configure a guest VM with
    > > public IPs on one NIC and not both with the public traffic -
    > >
    > >
    > > would this work ?
    > >
    >
    >
    >
    > --
    > Rafael Weingärtner
    >
    
    
    
    --
    Rafael Weingärtner
    


Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
I will disconnect the host this morning and test but before I do that I ran this command when all hosts are up -





 select * from cloud.host;
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
| id | name            | uuid                                 | status | type               | private_ip_address | private_netmask | private_mac_address | storage_ip_address | storage_netmask | storage_mac_address | storage_ip_address_2 | storage_mac_address_2 | storage_netmask_2 | cluster_id | public_ip_address | public_netmask  | public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets | cpus | speed | url                                 | fs_type | hypervisor_type | hypervisor_version | ram        | resource | version  | parent | total_size | capabilities | guid                                                          | available | setup | dom0_memory | last_ping  | mgmt_server_id | disconnected        | created             | removed | update_count | resource_state | owner | lastUpdated | engine_state |
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
|  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up     | Routing            | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.3        | 255.255.255.128 | 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
|  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up     | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.98       | 255.255.255.128 | 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | Proxy.2-ConsoleProxyResource                                  |         1 |     0 |           0 | 1492390409 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
|  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up     | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.86       | 255.255.255.128 | 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | s-1-VM-NfsSecondaryStorageResource                            |         1 |     0 |           0 | 1492390407 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
|  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up     | Routing            | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.4        | 255.255.255.128 | 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource |         1 |     0 |           0 | 1492450882 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 | Enabled        | NULL  | NULL        | Disabled     |
|  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up     | Routing            | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.5        | 255.255.255.128 | 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 | Enabled        | NULL  | NULL        | Disabled     |
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
5 rows in set (0.00 sec)



and you can see that it says the storage IP address is the same as the private IP address (the management network).
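
A quicker way to eyeball just those columns (a sketch against the same cloud.host table, with the column names exactly as in the output above):

    SELECT name, type, private_ip_address, storage_ip_address
    FROM cloud.host
    WHERE removed IS NULL;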


I also ran the command you provided using the Cluster ID number from the table above -



mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
Empty set (0.00 sec)

mysql>

So assuming I am reading this correctly that seems to be the issue.


I am at a loss as to why though.


I have a separate NIC for storage as described. When I add the zone and get to the storage web page I exclude the IPs already used for the compute node NICs and the NFS server itself. I do this because initially I didn't and the SSVM started using the IP address of the NFS server.


So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10 -> 172.30.5.14.


And I used the label "cloudbr2" for storage.


I must be doing this wrong somehow.


Any pointers would be much appreciated.




________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 05 June 2018 16:13
To: users
Subject: Re: advanced networking with public IPs direct to VMs

That is interesting. Let's see the source of all truth...
This is the code that is generating that odd message.

>     List<StoragePoolVO> clusterPools =
> _storagePoolDao.listPoolsByCluster(agent.getClusterId());
>         boolean hasNfs = false;
>         for (StoragePoolVO pool : clusterPools) {
>             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
>                 hasNfs = true;
>                 break;
>             }
>         }
>         if (!hasNfs) {
>             s_logger.warn(
>                     "Agent investigation was requested on host " + agent +
> ", but host does not support investigation because it has no NFS storage.
> Skipping investigation.");
>             return Status.Disconnected;
>         }
>

There are two possibilities here. You do not have any NFS storage? Is that
the case? Or maybe, for some reason, the call
"_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
any NFS storage pools. Looking at the "listPoolsByCluster" we will see
that the following SQL is used:

Select * from storage_pool where cluster_id = <host'sClusterId> and removed
> is not null
>

Can you run that SQL to see what it returns when your hosts are marked as
disconnected?

On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> I reran the tests with the 3 NIC setup. When I configured the zone through
> the UI I used the labels cloudbr0 for management, cloudbr1 for guest
> traffic and cloudbr2 for NFS as per my original response to you.
>
>
> When I pull the power to the node (dcp-cscn2.local) after about 5 mins
> the  host status goes to "Alert" but never to "Down"
>
>
> I get this in the logs -
>
>
> 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
> requested on host Host[-4-Routing], but host does not support investigation
> because it has no NFS storage. Skipping investigation.
> 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
> determine host 4 is in Disconnected
> 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
> determined is Disconnected
> 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
> the host is still up: 4-dcp-cscn2.local
>
> I don't understand why it thinks there is no NFS storage as each compute
> node has a dedicated storage NIC.
>
>
> I also don't understand why it thinks the host is still up ie. what test
> is it doing to determine that ?
>
>
> Am I just trying to get something working that is not supported ?
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 04 June 2018 15:31
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of failover are you talking about?
> What ACS version are you using?
> What hypervisor are you using?
> How are you configuring your NICs in the hypervisor?
> How are you configuring the traffic labels in ACS?
>
> On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Hi all
> >
> >
> > I am close to giving up on basic networking as I just cannot get failover
> > working with multiple NICs (I am not even sure it is supported).
> >
> >
> > What I would like is to use 3 NICs for management, storage and guest
> > traffic. I would like to assign public IPs direct to the VMs which is
> why I
> > originally chose basic.
> >
> >
> > If I switch to advanced networking do I just configure a guest VM with
> > public IPs on one NIC and not both with the public traffic -
> >
> >
> > would this work ?
> >
>
>
>
> --
> Rafael Weingärtner
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
That is something, but it does not explain why that same message does not
appear when he is using everything on a single NIC.

On Tue, Jun 5, 2018 at 7:08 PM, Andrija Panic <an...@gmail.com>
wrote:

> Zone-wide NFS storage? In this case the SQL returns no results (the cluster_id
> field in the table is NULL)
>
> On 5 June 2018 at 17:16, Jon Marshall <jm...@hotmail.co.uk> wrote:
>
> > No problem.
> >
> >
> > I am leaving work now but will test first thing tomorrow and get back to
> > you.
> >
> >
> > I definitely have NFS storage as far as I can tell !
> >
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 05 June 2018 16:13
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > That is interesting. Let's see the source of all truth...
> > This is the code that is generating that odd message.
> >
> > >     List<StoragePoolVO> clusterPools =
> > > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> > >         boolean hasNfs = false;
> > >         for (StoragePoolVO pool : clusterPools) {
> > >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> > >                 hasNfs = true;
> > >                 break;
> > >             }
> > >         }
> > >         if (!hasNfs) {
> > >             s_logger.warn(
> > >                     "Agent investigation was requested on host " + agent +
> > > ", but host does not support investigation because it has no NFS storage.
> > > Skipping investigation.");
> > >             return Status.Disconnected;
> > >         }
> > >
> >
> > There are two possibilities here. You do not have any NFS storage? Is that
> > the case? Or maybe, for some reason, the call
> > "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
> > any NFS storage pools. Looking at the "listPoolsByCluster" we will see
> > that the following SQL is used:
> >
> > Select * from storage_pool where cluster_id = <host'sClusterId> and removed
> > > is not null
> > >
> >
> > Can you run that SQL to see what it returns when your hosts are marked as
> > disconnected?
> >
> > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > I reran the tests with the 3 NIC setup. When I configured the zone
> > through
> > > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
> > > traffic and cloudbr2 for NFS as per my original response to you.
> > >
> > >
> > > When I pull the power to the node (dcp-cscn2.local) after about 5 mins
> > > the  host status goes to "Alert" but never to "Down"
> > >
> > >
> > > I get this in the logs -
> > >
> > >
> > > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> > > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
> > > requested on host Host[-4-Routing], but host does not support
> > investigation
> > > because it has no NFS storage. Skipping investigation.
> > > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> > > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was
> able
> > to
> > > determine host 4 is in Disconnected
> > > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
> > > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4
> > state
> > > determined is Disconnected
> > > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
> > > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected
> but
> > > the host is still up: 4-dcp-cscn2.local
> > >
> > > I don't understand why it thinks there is no NFS storage as each
> compute
> > > node has a dedicated storage NIC.
> > >
> > >
> > > I also don't understand why it thinks the host is still up ie. what
> test
> > > is it doing to determine that ?
> > >
> > >
> > > Am I just trying to get something working that is not supported ?
> > >
> > >
> > > ________________________________
> > > From: Rafael Weingärtner <ra...@gmail.com>
> > > Sent: 04 June 2018 15:31
> > > To: users
> > > Subject: Re: advanced networking with public IPs direct to VMs
> > >
> > > What type of failover are you talking about?
> > > What ACS version are you using?
> > > What hypervisor are you using?
> > > How are you configuring your NICs in the hypervisor?
> > > How are you configuring the traffic labels in ACS?
> > >
> > > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
> > > wrote:
> > >
> > > > Hi all
> > > >
> > > >
> > > > I am close to giving up on basic networking as I just cannot get
> > failover
> > > > working with multiple NICs (I am not even sure it is supported).
> > > >
> > > >
> > > > What I would like is to use 3 NICs for management, storage and guest
> > > > traffic. I would like to assign public IPs direct to the VMs which is
> > > why I
> > > > originally chose basic.
> > > >
> > > >
> > > > If I switch to advanced networking do I just configure a guest VM
> with
> > > > public IPs on one NIC and not both with the public traffic -
> > > >
> > > >
> > > > would this work ?
> > > >
> > >
> > >
> > >
> > > --
> > > Rafael Weingärtner
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
>
> Andrija Panić
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Andrija Panic <an...@gmail.com>.
Zone-wide NFS storage? In this case the SQL returns no results (the cluster_id
field in the table is NULL)
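
As an illustrative check (assuming the same cloud.storage_pool schema as the queries above), dropping the cluster_id filter makes zone-wide pools visible:

    -- CLUSTER-scoped pools carry a cluster_id; ZONE-scoped pools have
    -- cluster_id = NULL, so a "where cluster_id = 1" filter silently skips them
    SELECT id, name, pool_type, scope, cluster_id
    FROM cloud.storage_pool
    WHERE removed IS NULL;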

On 5 June 2018 at 17:16, Jon Marshall <jm...@hotmail.co.uk> wrote:

> No problem.
>
>
> I am leaving work now but will test first thing tomorrow and get back to
> you.
>
>
> I definitely have NFS storage as far as I can tell !
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 05 June 2018 16:13
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> That is interesting. Let's see the source of all truth...
> This is the code that is generating that odd message.
>
> >     List<StoragePoolVO> clusterPools =
> > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> >         boolean hasNfs = false;
> >         for (StoragePoolVO pool : clusterPools) {
> >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
> >                 hasNfs = true;
> >                 break;
> >             }
> >         }
> >         if (!hasNfs) {
> >             s_logger.warn(
> >                     "Agent investigation was requested on host " + agent +
> > ", but host does not support investigation because it has no NFS storage.
> > Skipping investigation.");
> >             return Status.Disconnected;
> >         }
> >
>
> There are two possibilities here. You do not have any NFS storage? Is that
> the case? Or maybe, for some reason, the call
> "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
> any NFS storage pools. Looking at the "listPoolsByCluster" we will see
> that the following SQL is used:
>
> Select * from storage_pool where cluster_id = <host'sClusterId> and removed
> > is not null
> >
>
> Can you run that SQL to see what it returns when your hosts are marked as
> disconnected?
>
> On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > I reran the tests with the 3 NIC setup. When I configured the zone
> through
> > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
> > traffic and cloudbr2 for NFS as per my original response to you.
> >
> >
> > When I pull the power to the node (dcp-cscn2.local) after about 5 mins
> > the  host status goes to "Alert" but never to "Down"
> >
> >
> > I get this in the logs -
> >
> >
> > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
> > requested on host Host[-4-Routing], but host does not support
> investigation
> > because it has no NFS storage. Skipping investigation.
> > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able
> to
> > determine host 4 is in Disconnected
> > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4
> state
> > determined is Disconnected
> > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
> > the host is still up: 4-dcp-cscn2.local
> >
> > I don't understand why it thinks there is no NFS storage as each compute
> > node has a dedicated storage NIC.
> >
> >
> > I also don't understand why it thinks the host is still up ie. what test
> > is it doing to determine that ?
> >
> >
> > Am I just trying to get something working that is not supported ?
> >
> >
> > ________________________________
> > From: Rafael Weingärtner <ra...@gmail.com>
> > Sent: 04 June 2018 15:31
> > To: users
> > Subject: Re: advanced networking with public IPs direct to VMs
> >
> > What type of failover are you talking about?
> > What ACS version are you using?
> > What hypervisor are you using?
> > How are you configuring your NICs in the hypervisor?
> > How are you configuring the traffic labels in ACS?
> >
> > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
> > wrote:
> >
> > > Hi all
> > >
> > >
> > > I am close to giving up on basic networking as I just cannot get failover
> > > working with multiple NICs (I am not even sure it is supported).
> > >
> > >
> > > What I would like is to use 3 NICs for management, storage and guest
> > > traffic. I would like to assign public IPs direct to the VMs which is why I
> > > originally chose basic.
> > >
> > >
> > > If I switch to advanced networking do I just configure a guest VM with
> > > public IPs on one NIC and not both with the public traffic -
> > >
> > >
> > > would this work ?
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> Rafael Weingärtner
>



-- 

Andrija Panić

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
No problem.


I am leaving work now but will test first thing tomorrow and get back to you.


I definitely have NFS storage as far as I can tell!


________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 05 June 2018 16:13
To: users
Subject: Re: advanced networking with public IPs direct to VMs

That is interesting. Let's see the source of all truth...
This is the code that is generating that odd message.

>     List<StoragePoolVO> clusterPools =
> _storagePoolDao.listPoolsByCluster(agent.getClusterId());
>         boolean hasNfs = false;
>         for (StoragePoolVO pool : clusterPools) {
>             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
>                 hasNfs = true;
>                 break;
>             }
>         }
>         if (!hasNfs) {
>             s_logger.warn(
>                     "Agent investigation was requested on host " + agent +
> ", but host does not support investigation because it has no NFS storage.
> Skipping investigation.");
>             return Status.Disconnected;
>         }
>

There are two possibilities here. You do not have any NFS storage? Is that
the case? Or maybe, for some reason, the call
"_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
any NFS storage pools. Looking at "listPoolsByCluster", we will see
that the following SQL is used:

Select * from storage_pool where cluster_id = <host'sClusterId> and removed
> is null
>

Can you run that SQL to see what it returns when your hosts are marked as
disconnected?

On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> I reran the tests with the 3 NIC setup. When I configured the zone through
> the UI I used the labels cloudbr0 for management, cloudbr1 for guest
> traffic and cloudbr2 for NFS as per my original response to you.
>
>
> When I pull the power to the node (dcp-cscn2.local) after about 5 mins
> the host status goes to "Alert" but never to "Down"
>
>
> I get this in the logs -
>
>
> 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
> requested on host Host[-4-Routing], but host does not support investigation
> because it has no NFS storage. Skipping investigation.
> 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
> determine host 4 is in Disconnected
> 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
> determined is Disconnected
> 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
> the host is still up: 4-dcp-cscn2.local
>
> I don't understand why it thinks there is no NFS storage as each compute
> node has a dedicated storage NIC.
>
>
> I also don't understand why it thinks the host is still up, i.e. what test
> is it doing to determine that ?
>
>
> Am I just trying to get something working that is not supported ?
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 04 June 2018 15:31
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of failover are you talking about?
> What ACS version are you using?
> What hypervisor are you using?
> How are you configuring your NICs in the hypervisor?
> How are you configuring the traffic labels in ACS?
>
> On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Hi all
> >
> >
> > I am close to giving up on basic networking as I just cannot get failover
> > working with multiple NICs (I am not even sure it is supported).
> >
> >
> > What I would like is to use 3 NICs for management, storage and guest
> > traffic. I would like to assign public IPs direct to the VMs which is why I
> > originally chose basic.
> >
> >
> > If I switch to advanced networking do I just configure a guest VM with
> > public IPs on one NIC and not both with the public traffic -
> >
> >
> > would this work ?
> >
>
>
>
> --
> Rafael Weingärtner
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
That is interesting. Let's see the source of all truth...
This is the code that is generating that odd message.

>     List<StoragePoolVO> clusterPools =
> _storagePoolDao.listPoolsByCluster(agent.getClusterId());
>         boolean hasNfs = false;
>         for (StoragePoolVO pool : clusterPools) {
>             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
>                 hasNfs = true;
>                 break;
>             }
>         }
>         if (!hasNfs) {
>             s_logger.warn(
>                     "Agent investigation was requested on host " + agent +
> ", but host does not support investigation because it has no NFS storage.
> Skipping investigation.");
>             return Status.Disconnected;
>         }
>

There are two possibilities here. You do not have any NFS storage? Is that
the case? Or maybe, for some reason, the call
"_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
any NFS storage pools. Looking at "listPoolsByCluster", we will see
that the following SQL is used:

Select * from storage_pool where cluster_id = <host'sClusterId> and removed
> is null
>

Can you run that SQL to see what it returns when your hosts are marked as
disconnected?
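
In case it is useful, a minimal sketch of how to run that check against the management server database; the database name ("cloud"), the credentials and the cluster id 1 are assumptions, so substitute your own:

# list the storage pools ACS currently sees for cluster 1
# (adjust user/password; find your cluster id with "select id, name from cloud.cluster;")
mysql -u cloud -p cloud -e "SELECT id, name, pool_type, host_address, path FROM storage_pool WHERE cluster_id = 1 AND removed IS NULL;"

If no row comes back with pool_type 'NetworkFilesystem', the investigator takes the "no NFS storage" branch in the code above.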

On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> I reran the tests with the 3 NIC setup. When I configured the zone through
> the UI I used the labels cloudbr0 for management, cloudbr1 for guest
> traffic and cloudbr2 for NFS as per my original response to you.
>
>
> When I pull the power to the node (dcp-cscn2.local) after about 5 mins
> the host status goes to "Alert" but never to "Down"
>
>
> I get this in the logs -
>
>
> 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
> requested on host Host[-4-Routing], but host does not support investigation
> because it has no NFS storage. Skipping investigation.
> 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
> determine host 4 is in Disconnected
> 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
> determined is Disconnected
> 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
> the host is still up: 4-dcp-cscn2.local
>
> I don't understand why it thinks there is no NFS storage as each compute
> node has a dedicated storage NIC.
>
>
> I also don't understand why it thinks the host is still up, i.e. what test
> is it doing to determine that ?
>
>
> Am I just trying to get something working that is not supported ?
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 04 June 2018 15:31
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of failover are you talking about?
> What ACS version are you using?
> What hypervisor are you using?
> How are you configuring your NICs in the hypervisor?
> How are you configuring the traffic labels in ACS?
>
> On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Hi all
> >
> >
> > I am close to giving up on basic networking as I just cannot get failover
> > working with multiple NICs (I am not even sure it is supported).
> >
> >
> > What I would like is to use 3 NICs for management, storage and guest
> > traffic. I would like to assign public IPs direct to the VMs which is why I
> > originally chose basic.
> >
> >
> > If I switch to advanced networking do I just configure a guest VM with
> > public IPs on one NIC and not both with the public traffic -
> >
> >
> > would this work ?
> >
>
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
I reran the tests with the 3 NIC setup. When I configured the zone through the UI I used the labels cloudbr0 for management, cloudbr1 for guest traffic and cloudbr2 for NFS as per my original response to you.


When I pull the power to the node (dcp-cscn2.local), after about 5 minutes the host status goes to "Alert" but never to "Down".


I get this in the logs -


2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was requested on host Host[-4-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to determine host 4 is in Disconnected
2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state determined is Disconnected
2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but the host is still up: 4-dcp-cscn2.local

I don't understand why it thinks there is no NFS storage, as each compute node has a dedicated storage NIC.
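
As a sanity check from one of the compute nodes, something like the following should show whether the primary storage really is mounted over NFS; the storage server address 172.30.5.1 is an assumption based on my storage subnet:

# list active NFS mounts on the host
mount -t nfs,nfs4

# confirm the storage server is exporting the share
showmount -e 172.30.5.1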


I also don't understand why it thinks the host is still up, i.e. what test is it doing to determine that ?


Am I just trying to get something working that is not supported ?


________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 04 June 2018 15:31
To: users
Subject: Re: advanced networking with public IPs direct to VMs

What type of failover are you talking about?
What ACS version are you using?
What hypervisor are you using?
How are you configuring your NICs in the hypervisor?
How are you configuring the traffic labels in ACS?

On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Hi all
>
>
> I am close to giving up on basic networking as I just cannot get failover
> working with multiple NICs (I am not even sure it is supported).
>
>
> What I would like is to use 3 NICs for management, storage and guest
> traffic. I would like to assign public IPs direct to the VMs which is why I
> originally chose basic.
>
>
> If I switch to advanced networking do I just configure a guest VM with
> public IPs on one NIC and not both with the public traffic -
>
>
> would this work ?
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
No, watching the management server logs when I pull the power on one of the compute nodes, it recognises the host is not responding to a ping and eventually marks the host status as "Alert" in the UI, but it never tries to migrate the VMs that were running on the node.


From memory, when I put everything on one NIC (management, storage and guest traffic) the host status is marked as "Down", not "Alert", which makes me think there is something about multiple NICs and failover that is not supported.


It is almost as though with multiple NICs the manager knows that there is a problem with the node but cannot definitely say it is down, and so it cannot migrate the VMs in case they are still running on that node.


I have been at this for well over a month now (off and on) and, apart from when I used a single NIC, VM HA has never worked. If the configuration I have posted looks okay then maybe it is just not supported, unless of course you know differently ?


I did think it may be the default gateway being set to the guest VM subnet, but if I don't set it that way the SSVM has communication issues.


I am going to do a side-by-side comparison of the management server logs for a single NIC vs dual NICs (management/storage on one NIC, the other NIC for guest VMs) and see if anything obvious stands out.
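
Something like this should keep the comparison manageable; the log path below is the default for a packaged install and may differ on other setups:

# pull out just the HA-related lines from the management server log
grep -E 'KVMInvestigator|HighAvailabilityManagerImpl|AgentManagerImpl' /var/log/cloudstack/management/management-server.log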


That aside, if I can't get this working, can I just assign a public IP subnet to the guest VMs when setting up advanced networking ? And if so, how does that in effect bypass the virtual router (in terms of NAT), or do I not need to worry about this ?
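
To make the question concrete, my understanding is that this would be a shared guest network. A rough CloudMonkey sketch of creating one, where the ids, the VLAN and the address range are placeholders for illustration only:

# create a shared guest network that hands public IPs straight to VMs
# (look up real ids with "list zones" and "list networkofferings")
cloudmonkey create network zoneid=<zone-uuid> networkofferingid=<shared-offering-uuid> name=public-direct displaytext=public-direct vlan=untagged gateway=203.0.113.1 netmask=255.255.255.0 startip=203.0.113.10 endip=203.0.113.100

As far as I understand it, on a shared network the virtual router only provides services like DHCP and DNS, not source NAT, so the NAT question may largely go away.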

Thanks
________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 04 June 2018 21:15
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Everything seems to be normal at first glance. Do you see some sort of
error in the log files?

On Mon, Jun 4, 2018 at 11:39 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> CS version 4.11
>
> VM HA at the moment (not Host HA as yet)
>
> KVM
>
>
> For the management node just one NIC - 172.30.3.2/26 assigned to physical
> NIC.
>
>
> For the compute nodes -
>
>
> 3 NICs so as an example from one compute node -
>
>
> ifcfg-eth0
>
> BRIDGE=cloudbr0
>
>
> ifcfg-eth1
>
> BRIDGE=cloudbr1
>
>
> ifcfg-eth2
>
> BRIDGE=cloudbr2
>
>
> then the 3 bridges -
>
>
> ifcfg-cloudbr0
>
> ip address 172.30.3.3/26    <--- management network
>
>
> ifcfg-cloudbr1
>
> ip address 172.30.4.3/25  <-- guest traffic
>
> gateway 172.30.4.1
>
>
>
> ifcfg-cloudbr2
>
> ip address 172.30.5.3/28 <-- storage traffic
>
>
> traffic labels would be cloudbr0, cloudbr1, cloudbr2
>
>
> Can only get failover working when I put all traffic on the same NIC.
>
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 04 June 2018 15:31
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of failover are you talking about?
> What version are you using?
> What hypervisor are you using?
> How are you configuring your NICs in the hypervisor?
> How are you configuring the traffic labels in ACS?
>
> On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Hi all
> >
> >
> > I am close to giving up on basic networking as I just cannot get failover
> > working with multiple NICs (I am not even sure it is supported).
> >
> >
> > What I would like is to use 3 NICs for management, storage and guest
> > traffic. I would like to assign public IPs direct to the VMs which is why I
> > originally chose basic.
> >
> >
> > If I switch to advanced networking do I just configure a guest VM with
> > public IPs on one NIC and not both with the public traffic -
> >
> >
> > would this work ?
> >
>
>
>
> --
> Rafael Weingärtner
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
Update to this.


I ran the all-on-one-NIC test again and the host does report as "Down" in the UI, as opposed to "Alert" when using multiple NICs.


Looking at the management server log this seems to be the key part -


1) from the single NIC logs -


2018-06-04 10:17:10,967 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Neighbouring host:5 returned status:Down for the investigated host:4
2018-06-04 10:17:10,967 DEBUG [c.c.h.KVMInvestigator] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) HA: HOST is ineligible legacy state Down for host 4
2018-06-04 10:17:10,967 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) KVMInvestigator was able to determine host 4 is in Down
2018-06-04 10:17:10,967 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) The agent from host 4 state determined is Down
2018-06-04 10:17:10,967 ERROR [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-8627b348) (logid:ef7b8230) Host is down: 4-dcp-cscn2.local. Starting HA on the VMs



2) from the setup with 2 NICs (management/storage on one NIC, guest traffic on the other) -



2018-06-04 12:53:45,853 WARN  [c.c.h.KVMInvestigator] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) Agent investigation was requested on host Host[-2-Routing], but host does not support investigation because it has no NFS storage. Skipping investigation.
2018-06-04 12:53:45,854 DEBUG [c.c.h.HighAvailabilityManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) KVMInvestigator was able to determine host 2 is in Disconnected
2018-06-04 12:53:45,854 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) The agent from host 2 state determined is Disconnected
2018-06-04 12:53:45,854 WARN  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) Agent is disconnected but the host is still up: 2-dcp-cscn2.local
2018-06-04 12:53:45,854 WARN  [o.a.c.alerts] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) AlertType:: 7 | dataCenterId:: 1 | podId:: 1 | clusterId:: null | message:: Host disconnected, name: dcp-cscn2.local (id:2), availability zone: dcp1, pod: dcpp1
2018-06-04 12:53:45,858 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) Host 2 is disconnecting with event AgentDisconnected
2018-06-04 12:53:45,858 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) The next status of agent 2is Alert, current status is Up
2018-06-04 12:53:45,858 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) Deregistering link for 2 with state Alert
2018-06-04 12:53:45,859 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-3:ctx-0aed2673) (logid:32aaef2a) Remove Agent : 2


I don't know what it means by the host having no NFS storage, but you can see it never marks the failed node as down.


Any ideas ?
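
In case it helps anyone reproduce this, the state the manager has settled on can also be read straight from the database; a minimal sketch, where the database name and credentials are assumptions:

# show how the management server currently classifies each hypervisor host
mysql -u cloud -p cloud -e "SELECT id, name, status, resource_state FROM host WHERE type = 'Routing';"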




________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 04 June 2018 21:15
To: users
Subject: Re: advanced networking with public IPs direct to VMs

Everything seems to be normal at first glance. Do you see some sort of
error in the log files?

On Mon, Jun 4, 2018 at 11:39 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> CS version 4.11
>
> VM HA at the moment (not Host HA as yet)
>
> KVM
>
>
> For the management node just one NIC - 172.30.3.2/26 assigned to physical
> NIC.
>
>
> For the compute nodes -
>
>
> 3 NICs so as an example from one compute node -
>
>
> ifcfg-eth0
>
> BRIDGE=cloudbr0
>
>
> ifcfg-eth1
>
> BRIDGE=cloudbr1
>
>
> ifcfg-eth2
>
> BRIDGE=cloudbr2
>
>
> then the 3 bridges -
>
>
> ifcfg-cloudbr0
>
> ip address 172.30.3.3/26    <--- management network
>
>
> ifcfg-cloudbr1
>
> ip address 172.30.4.3/25  <-- guest traffic
>
> gateway 172.30.4.1
>
>
>
> ifcfg-cloudbr2
>
> ip address 172.30.5.3/28 <-- storage traffic
>
>
> traffic labels would be cloudbr0, cloudbr1, cloudbr2
>
>
> Can only get failover working when I put all traffic on the same NIC.
>
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 04 June 2018 15:31
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of failover are you talking about?
> What version are you using?
> What hypervisor are you using?
> How are you configuring your NICs in the hypervisor?
> How are you configuring the traffic labels in ACS?
>
> On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Hi all
> >
> >
> > I am close to giving up on basic networking as I just cannot get failover
> > working with multiple NICs (I am not even sure it is supported).
> >
> >
> > What I would like is to use 3 NICs for management, storage and guest
> > traffic. I would like to assign public IPs direct to the VMs which is why I
> > originally chose basic.
> >
> >
> > If I switch to advanced networking do I just configure a guest VM with
> > public IPs on one NIC and not both with the public traffic -
> >
> >
> > would this work ?
> >
>
>
>
> --
> Rafael Weingärtner
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
Everything seems to be normal at first glance. Do you see some sort of
error in the log files?

On Mon, Jun 4, 2018 at 11:39 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> CS version 4.11
>
> VM HA at the moment (not Host HA as yet)
>
> KVM
>
>
> For the management node just one NIC - 172.30.3.2/26 assigned to physical
> NIC.
>
>
> For the compute nodes -
>
>
> 3 NICs so as an example from one compute node -
>
>
> ifcfg-eth0
>
> BRIDGE=cloudbr0
>
>
> ifcfg-eth1
>
> BRIDGE=cloudbr1
>
>
> ifcfg-eth2
>
> BRIDGE=cloudbr2
>
>
> then the 3 bridges -
>
>
> ifcfg-cloudbr0
>
> ip address 172.30.3.3/26    <--- management network
>
>
> ifcfg-cloudbr1
>
> ip address 172.30.4.3/25  <-- guest traffic
>
> gateway 172.30.4.1
>
>
>
> ifcfg-cloudbr2
>
> ip address 172.30.5.3/28 <-- storage traffic
>
>
> traffic labels would be cloudbr0, cloudbr1, cloudbr2
>
>
> Can only get failover working when I put all traffic on the same NIC.
>
>
>
> ________________________________
> From: Rafael Weingärtner <ra...@gmail.com>
> Sent: 04 June 2018 15:31
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of failover are you talking about?
> What version are you using?
> What hypervisor are you using?
> How are you configuring your NICs in the hypervisor?
> How are you configuring the traffic labels in ACS?
>
> On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk>
> wrote:
>
> > Hi all
> >
> >
> > I am close to giving up on basic networking as I just cannot get failover
> > working with multiple NICs (I am not even sure it is supported).
> >
> >
> > What I would like is to use 3 NICs for management, storage and guest
> > traffic. I would like to assign public IPs direct to the VMs which is why I
> > originally chose basic.
> >
> >
> > If I switch to advanced networking do I just configure a guest VM with
> > public IPs on one NIC and not both with the public traffic -
> >
> >
> > would this work ?
> >
>
>
>
> --
> Rafael Weingärtner
>



-- 
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
CS version 4.11

VM HA at the moment (not Host HA as yet)

KVM


For the management node, just one NIC - 172.30.3.2/26 assigned to the physical NIC.


For the compute nodes -


3 NICs so as an example from one compute node -


ifcfg-eth0

BRIDGE=cloudbr0


ifcfg-eth1

BRIDGE=cloudbr1


ifcfg-eth2

BRIDGE=cloudbr2


then the 3 bridges -


ifcfg-cloudbr0

ip address 172.30.3.3/26    <--- management network


ifcfg-cloudbr1

ip address 172.30.4.3/25  <-- guest traffic

gateway 172.30.4.1



ifcfg-cloudbr2

ip address 172.30.5.3/28 <-- storage traffic


traffic labels would be cloudbr0, cloudbr1, cloudbr2


Can only get failover working when I put all traffic on the same NIC.
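
For completeness, a quick sketch of how that bridge layout can be verified on a compute node; nothing here is CloudStack-specific, and the bridge names are the ones above:

# confirm each bridge exists and encloses the right physical NIC
brctl show

# confirm each bridge carries the expected address
for br in cloudbr0 cloudbr1 cloudbr2; do ip addr show dev "$br"; done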



________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 04 June 2018 15:31
To: users
Subject: Re: advanced networking with public IPs direct to VMs

What type of failover are you talking about?
What version are you using?
What hypervisor are you using?
How are you configuring your NICs in the hypervisor?
How are you configuring the traffic labels in ACS?

On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Hi all
>
>
> I am close to giving up on basic networking as I just cannot get failover
> working with multiple NICs (I am not even sure it is supported).
>
>
> What I would like is to use 3 NICs for management, storage and guest
> traffic. I would like to assign public IPs direct to the VMs which is why I
> originally chose basic.
>
>
> If I switch to advanced networking do I just configure a guest VM with
> public IPs on one NIC and not both with the public traffic -
>
>
> would this work ?
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Jon Marshall <jm...@hotmail.co.uk>.
I think I do know what it means.


Let me build it with 3 separate NICs again and rerun.


________________________________
From: Rafael Weingärtner <ra...@gmail.com>
Sent: 04 June 2018 15:31
To: users
Subject: Re: advanced networking with public IPs direct to VMs

What type of failover are you talking about?
What ACS version are you using?
What hypervisor are you using?
How are you configuring your NICs in the hypervisor?
How are you configuring the traffic labels in ACS?

On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Hi all
>
>
> I am close to giving up on basic networking as I just cannot get failover
> working with multiple NICs (I am not even sure it is supported).
>
>
> What I would like is to use 3 NICs for management, storage and guest
> traffic. I would like to assign public IPs direct to the VMs which is why I
> originally chose basic.
>
>
> If I switch to advanced networking do I just configure a guest VM with
> public IPs on one NIC and not both with the public traffic -
>
>
> would this work ?
>



--
Rafael Weingärtner

Re: advanced networking with public IPs direct to VMs

Posted by Rafael Weingärtner <ra...@gmail.com>.
What type of failover are you talking about?
What ACS version are you using?
What hypervisor are you using?
How are you configuring your NICs in the hypervisor?
How are you configuring the traffic labels in ACS?

On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jm...@hotmail.co.uk> wrote:

> Hi all
>
>
> I am close to giving up on basic networking as I just cannot get failover
> working with multiple NICs (I am not even sure it is supported).
>
>
> What I would like is to use 3 NICs for management, storage and guest
> traffic. I would like to assign public IPs direct to the VMs which is why I
> originally chose basic.
>
>
> If I switch to advanced networking do I just configure a guest VM with
> public IPs on one NIC and not both with the public traffic -
>
>
> would this work ?
>



-- 
Rafael Weingärtner