Posted to users@cloudstack.apache.org by Eric Lee Green <er...@gmail.com> on 2019/05/24 00:15:13 UTC

DNS quit working when I updated to 4.11.2

I had this working under 4.9. All I did was, on my main BIND9 servers, 
point a forward zone for 'cloud.<mydomain>.com' at the virtual router 
associated with all VMs that were publicly available. I could then 
resolve all foo.cloud.<mydomain>.com names on my global network.

Somehow, though, this quit working after I updated to 4.11. I'm not 
quite sure why.

The 'Guest Network' is defined with domain 'cloud.mydomain.com'.

Okay, so my router for the 'Guest Network' advanced networking is 
located at 10.102.199.148. In my master BIND9 DNS server at 10.31.1.2 I 
have this:
zone "cloud.mydomain.com" IN {
    type forward;
    forward only;
    forwarders {
         10.102.199.148;
     };
};

If I send a DNS query directly to the virtual router while logged 
into my main name server, it works:

[root@ypbind ~]# host eric-gui.cloud.mydomain.com 10.102.199.148
Using domain server:
Name: 10.102.199.148
Address: 10.102.199.148#53
Aliases:

eric-gui.cloud.mydomain.com has address 10.102.199.234

If I go through the main name server instead, however, it doesn't work:

[root@ypbind logs]# host eric-gui.cloud.mydomain.com
Host eric-gui.cloud.viakoo.com not found: 3(NXDOMAIN)
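
The two `host` runs above can be reproduced from a script, which makes it easier to compare the answer from the virtual router with the answer from the forwarding server side by side. The following is a minimal sketch using only the Python standard library; the function names (`build_query`, `rcode_of`, `ask`) are made up for this illustration, and it only inspects the response code rather than parsing the answer section.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD=1 set), QDCOUNT=1, then three zero counts.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def rcode_of(response: bytes) -> int:
    """Return the RCODE (low 4 bits of the flags word): 0=NOERROR, 3=NXDOMAIN."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def ask(server: str, name: str, timeout: float = 3.0) -> int:
    """Send one UDP query to `server` on port 53 and return the RCODE."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(512)
    return rcode_of(response)
```

Calling `ask("10.102.199.148", "eric-gui.cloud.mydomain.com")` and then `ask("10.31.1.2", "eric-gui.cloud.mydomain.com")` should mirror the two cases shown above: a response code of 0 from the virtual router and 3 (NXDOMAIN) from the forwarding server, assuming the behaviour described in this post.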

I'm baffled, because this *was* working.

So I disabled DNSSEC in the options block on bind9 and opened up all 
permissions to see if that was the problem (note that this is internal 
to my infrastructure, so DNS amplification isn't an issue):

         dnssec-enable no;
         dnssec-validation no;
         dnssec-lookaside auto;
         recursion yes;
         allow-recursion { any; };
         allow-query { any; };
         allow-query-cache { any; };

Still nope. Still baffled.

Anybody got any clues as to what I may be doing wrong? I'm thinking it 
has to be on the BIND9 side, because I can resolve the host name if I 
talk to the virtual router directly, but for some reason it's not 
allowing me to get any records from the router.

Right now I've temporarily worked around this with a script that 
directly queries the MySQL database every few minutes and generates a 
revised zone file on my master DNS server when the list of virtual 
machines queried out of the database changes, but that's clearly not the 
right way to do it. The question is, what *is* the right way to do it?

-Eric



Re: DNS quit working when I updated to 4.11.2

Posted by Eric Lee Green <er...@gmail.com>.
On 5/24/19 12:21 PM, Andrija Panic wrote:
> In other words - you are hitting an internal interface of a VR?

The VR has two NICs. I presume that the Guest NIC, as opposed to the 
Control NIC, is the "internal" NIC?

Type            Shared
Traffic Type    Guest
Network Name    Shared
Netmask         255.255.0.0
IP Address      10.102.199.148
ID              7f59d904-cdc0-43eb-b679-0721077f5bb1
Network ID      924eda5f-9a1f-4a8e-9423-18000dc92093
Isolation URI   vlan://102
Broadcast URI   vlan://102


Type
Traffic Type    Control
Network Name
Netmask         255.255.0.0
IP Address      169.254.2.203
ID              9c3676bc-23e6-48e3-baca-b8cce6511092
Network ID      6eff5bd9-4f4d-48fe-b6ed-f50fc115947b
Isolation URI
Broadcast URI
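
A quick way to sanity-check which of these two addresses a name server on another VLAN could reach at all is to look at the networks they fall in. The sketch below uses Python's standard `ipaddress` module; the `nics` dictionary and the `describe` helper are just illustrative names, and the values are copied from the NIC details above. Link-local addresses (169.254.0.0/16) are not forwarded by routers, so only the Guest NIC address is a candidate forwarder target for a BIND server on a different routed VLAN.

```python
import ipaddress

# Address and netmask of each VR NIC, as listed above.
nics = {
    "guest": ("10.102.199.148", "255.255.0.0"),
    "control": ("169.254.2.203", "255.255.0.0"),
}


def describe(ip: str, netmask: str) -> dict:
    """Return the containing network and whether the address is link-local."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return {
        "network": str(iface.network),
        "link_local": iface.ip.is_link_local,
    }


summary = {name: describe(ip, mask) for name, (ip, mask) in nics.items()}
for name, info in summary.items():
    print(name, info)
```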

>
> I would replace (for a test) bind9 with just the default setup of 
> DNSmasq, while specifying its upstream (root) DNS servers to be the VR IP - 
> i.e. client --> DNSmasq (internal server) --> DNSmasq (VR).
> See if that works, so you can possibly draw some conclusions.

That gives me room for some more experiments. I am fairly sure that I am 
running into recent changes to bind9 / dnsmasq intended to prevent DNS 
amplification and spoofing attacks, but which one changed, and how to 
work around it, is something I'm still trying to figure out.


>
> Andrija
>
> -- 
>
> Andrija Panić



Re: DNS quit working when I updated to 4.11.2

Posted by Andrija Panic <an...@gmail.com>.
In other words - you are hitting an internal interface of a VR?

I would replace (for a test) bind9 with just the default setup of DNSmasq,
while specifying its upstream (root) DNS servers to be the VR IP - i.e. client
--> DNSmasq (internal server) --> DNSmasq (VR).
See if that works, so you can possibly draw some conclusions.

Andrija


-- 

Andrija Panić

Re: DNS quit working when I updated to 4.11.2

Posted by Eric Lee Green <er...@gmail.com>.
On 5/24/19 10:16 AM, Andrija Panic wrote:
> Eric,
>
> is your BIND9 server on a "Public" network (trying to talk to the Public
> IP of the VR when forwarding DNS requests), or is it a VM inside an Isolated
> network behind the VR?

It's on *a* public network, but not *the* public network. I don't have 
any Isolated networks, though I have them enabled for VLANs 1000-2000. I 
am using "Advanced Networking" but for my own purposes -- I have one 
"Shared" guest network at VLAN 102, and then several isolated specialty 
physical "Shared" networks like "Security Cameras" (VLAN 103) and 
"Storage Network" (VLAN 200) that are attached to virtual machines that 
need access to those things.

The "Shared" guest network (VLAN 102) is routed by my layer 3 switch 
with the rest of my network's public VLANs. So if I am on e.g. 10.31.1.2 
(VLAN 31), which is similarly a routed public VLAN (but not one that 
Cloudstack is allowed to directly talk to or manage; it has to go 
through the layer 3 switch), or on 10.120.0.5 (VLAN 120), I can talk 
directly to 10.102.199.148, since all are routed into the common fabric 
via the layer 3 switch. I only care about the VMs on VLAN 102, which are 
supposed to be publicly available to my users, which is why my quick 
script hack to generate a zone file out of the database does

select v.name, n.ip4_address
  from vm_instance as v, nics as n
 where v.removed is null
   and n.instance_id = v.id
   and n.ip4_address like '10.102.%'
   and type = 'User'
 order by n.ip4_address;

in order to select the name and IP address of virtual machines with 
NICs on that VLAN. (If the list differs from the last one that was 
queried, the script massages it into a zone file for 
name.cloud.mydomain.com, scp's it to my master domain server, and does 
a reload so the server picks up the new version.)
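
For anyone who wants to copy the workaround, the "has the list changed, and if so regenerate the zone file" step can be sketched roughly as follows. This is not the actual script, just a minimal Python illustration under stated assumptions: the rows are the (name, IP) pairs returned by the query above, the function names (`render_zone`, `fingerprint`) are made up, and the SOA/NS values (`ns1.mydomain.com.`, `hostmaster.mydomain.com.`, the timers) are placeholders that would need to match the real primary server.

```python
import hashlib


def render_zone(rows, origin="cloud.mydomain.com.", serial=1, ttl=300,
                primary_ns="ns1.mydomain.com.",        # placeholder name
                contact="hostmaster.mydomain.com."):   # placeholder contact
    """Render (name, ip) rows as a minimal zone file with one A record each."""
    lines = [
        f"$ORIGIN {origin}",
        f"$TTL {ttl}",
        f"@ IN SOA {primary_ns} {contact} ({serial} 3600 600 86400 {ttl})",
        f"@ IN NS {primary_ns}",
    ]
    # Same ordering idea as the query: sort by the address string.
    for name, ip in sorted(rows, key=lambda row: row[1]):
        lines.append(f"{name} IN A {ip}")
    return "\n".join(lines) + "\n"


def fingerprint(rows):
    """Order-independent hash of the VM list, used to detect changes."""
    canonical = "\n".join(f"{name} {ip}" for name, ip in sorted(rows))
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()
```

The caller would store the last fingerprint, compare it with the fingerprint of the freshly queried rows, and only render, copy, and reload when the two differ. The serial passed to `render_zone` has to increase on every regeneration (a Unix timestamp is one common choice) so that any secondaries notice the change.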

Both of my BIND9 servers can talk directly to 10.102.199.148 (the IP of 
the virtual router for the 10.102.xxx.xxx network, VLAN 102) if I use 
'host' to directly query 10.102.199.148 for an API address like, say, 
'api-default1.cloud.mydomain.com', but when I try to point a forward 
zone there, nope. This was working, but now is not. I suspect it has to 
do with the recent changes in DNS software, both bind9 and dnsmasq, to 
deal with multiple attacks on the domain name system, but I'm having 
trouble figuring out why, or what my solution should be.

Note that it's quite feasible to put a DNS server actually inside the 
Cloudstack constellation and then do a two-stage hop, if that's 
necessary. I'm just trying to figure out the "right" way to do this 
right now so I can retire my hack script.



Re: DNS quit working when I updated to 4.11.2

Posted by Andrija Panic <an...@gmail.com>.
Eric,

is your BIND9 server on a "Public" network (trying to talk to the Public
IP of the VR when forwarding DNS requests), or is it a VM inside an Isolated
network behind the VR?

Andrija


-- 

Andrija Panić