ISC DHCP 4.1, Solaris 10, Sun Cluster 3.2 1/09 and... Solaris Containers


For a few days, I’ve been trying to compile ISC DHCP 4.1 and get it installed in a Solaris 10 u6 (10/08) zone.
Compiling ISC DHCP is simple, as long as you are using gcc and not the Solaris cc.
The issues arose when it came to running ISC DHCP in a Solaris 10 zone.
Read on to learn more about setting up a Solaris Container SunCluster resource to run ISC DHCP in a zone!
...

1/ Infrastructure
Let’s describe the environment the ISC DHCP server will run in.
·         Two Sun Blade servers are running Solaris 10 10/08 (u6):
node1.mc-thias.org
node2.mc-thias.org
·         SunCluster framework 3.2 1/09 (u2) is installed on top of these two blades. Therefore, the two blades can be considered a single SunCluster server known as sc-infra.mc-thias.org
·         A zone is created, sc-zns.mc-thias.org, that hosts ISC BIND and ISC DHCP services.
·         This zone is handled by SunCluster Data Service for Solaris Containers for failover
2/ Encountered issues
I’ve been facing two main issues while trying to get ISC DHCP running on top of the SunCluster powered zone.
2.1/ ISC DHCP not able to get interface information
My first attempt to run ISC DHCP was in a zone configured with a shared IP (the default when configuring a zone):
set ip-type=shared
When starting ISC DHCP, the following error occurs:
Error getting interface flags for 'lo0:1'; No such device or address
Error getting interface information.
Looking a little closer at the network layer of a “shared ip-type” zone, it appears that no network device node is available:
root@sc-zns:/# ifconfig -a
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,\
                                    VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
e1000g0:1: flags=1001040843<UP,BROADCAST,RUNNING,MULTICAST,\
                   DEPRECATED,IPv4,FIXEDMTU> mtu 1500 index 2
        inet 192.168.0.20 netmask ffff0000 broadcast 192.168.0.255
root@sc-zns:/# ls -la /dev/e1000g*
/dev/e1000g*: No such file or directory
/!\ One can also notice that there is no ether address (hardware, MAC address) for any of the network interfaces in such a zone.
As ISC DHCP calls a get_hw_addr function when starting, this leads to a non-operational (actually non-starting) service.
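A quick way to check whether a zone exposes a hardware address before trying to start the DHCP server is to filter the ifconfig output. The following is only a minimal sketch: ether_of is a hypothetical helper name, not something shipped with ISC DHCP or Solaris, and it assumes a POSIX awk (on Solaris 10, as far as I know, that means nawk or /usr/xpg4/bin/awk rather than the legacy /usr/bin/awk).

```shell
# Sketch: print the "ether" (MAC) address that `ifconfig -a` reports for one
# interface, or nothing at all when the interface has no ether line.
# Assumption: AWK points to a POSIX awk (e.g. AWK=nawk on Solaris 10).
AWK=${AWK:-awk}

ether_of() {
    # $1 = interface name as printed by ifconfig, e.g. "nxge2" or "e1000g0:1"
    "$AWK" -v ifname="$1" '
        # Header line of an interface block: remember the name without its trailing colon.
        /flags=/ { current = $1; sub(/:$/, "", current) }
        # An "ether" line belongs to the most recently seen header.
        $1 == "ether" && current == ifname { print $2 }
    '
}

# Usage inside the zone (run as root):
#   ifconfig -a | ether_of nxge2
```

An empty result for every interface corresponds to the shared-IP situation described above.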
2.2/ ISC DHCP not able to get DLPI working
DLPI stands for Data Link Provider Interface. If you want to know more about it, have a look at the specification published by The Open Group.
This time, the sc-zns zone has been defined as an exclusive ip-type:
set ip-type=exclusive
add net
set physical=nxge2
end
A physical public network interface from the blades is dedicated to the zone, and can’t be used elsewhere. With this configuration, the following problem is encountered when starting the ISC DHCP service:
Can't open DLPI device for nxge2: No such file or directory
It looks like we’re getting closer to a solution. Well, this is not working yet, but the error is no longer about accessing the network devices. Indeed, taking a closer look at the network configuration, the device and the ether address are now available:
root@sc-zns:/# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,\
                                 VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
nxge2: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,\
                              NOFAILOVER> mtu 1500 index 2
        inet 172.26.0.232 netmask ffff0000 broadcast 172.26.255.255
        groupname zns-group
        ether 0:14:4f:89:b7:6
root@sc-zns:/# ls -la /dev/nxge*
crw-------   1 root     sys      183,  3 Apr 10 10:22 /dev/nxge2
So… what is missing at this stage? Solaris 10 provides what is called a cloning character-special device, in our case /dev/nxge, to access all the devices of this kind installed in the system (in our case, Sun Neptune NIU). ISC DHCP requires this cloning character-special device when calling DLPI. Only /dev/nxge2 is visible in the zone, not /dev/nxge, hence the “No such file or directory” error.
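The check for both device nodes can be sketched as follows. This is an illustration under assumptions, not an official tool: clone_device_for and check_dlpi_nodes are hypothetical helper names, and the sketch assumes the usual driver-name-plus-instance-number naming (nxge2 gives /dev/nxge), which matches the case described here.

```shell
# Sketch: derive the cloning device path from an interface name by dropping
# the trailing instance number (nxge2 -> /dev/nxge).
# Assumption: the interface is named <driver><instance>.
clone_device_for() {
    printf '/dev/%s\n' "$(printf '%s\n' "$1" | sed 's/[0-9]*$//')"
}

# Report whether the per-instance node and the cloning node are both
# visible as character devices from where the script runs.
check_dlpi_nodes() {
    for node in "/dev/$1" "$(clone_device_for "$1")"; do
        if [ -c "$node" ]; then
            echo "ok      $node"
        else
            echo "missing $node"
        fi
    done
}

# Usage inside the zone:
#   check_dlpi_nodes nxge2
```

With the zone configuration shown in this section, /dev/nxge2 would be reported as present and /dev/nxge as missing.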
3/ How to configure a zone that can host ISC DHCP
Want to know more???
Well, as the SunCluster Framework in use is 3.2 1/09, it can handle Solaris Containers using exclusive-IP network interfaces.
Moreover, the cloning character-special device will be presented to the zone.
3.1/ Zone configuration
So, here is an export of the set of commands I’ve used to configure the sc-zns.mc-thias.org zone on both nodes:
root@node1:/# zonecfg -z zns export
create -b
set zonepath=/zns/rootfs
set autoboot=false
set ip-type=exclusive
add net
set physical=nxge2
end
add device
set match=/dev/nxge
end
add attr
set name=content
set type=string
set value="BIND / DHCP zone"
end
The two important parts are setting ip-type to exclusive and matching the device /dev/nxge.
The latter, set match=/dev/nxge, allows the cloning character-special device to be presented to, and available from, the zone.
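Since the same configuration has to be applied on both nodes, the commands can be generated once and fed to zonecfg. The sketch below is only one possible way to do it: dhcp_zone_cmds is a hypothetical helper name, /tmp/zns.cfg is an arbitrary file name, and the cloning device is again assumed to be the interface name without its instance number. It reproduces the network and device parts of the export above and leaves out the attr block.

```shell
# Sketch: print zonecfg commands for a new exclusive-IP zone that also gets
# the cloning device of its NIC.
# $1 = zonepath, $2 = physical interface dedicated to the zone
dhcp_zone_cmds() {
    zonepath=$1
    nic=$2
    # Assumption: cloning device = interface name minus trailing instance number.
    clone=/dev/$(printf '%s\n' "$nic" | sed 's/[0-9]*$//')
    cat <<EOF
create -b
set zonepath=$zonepath
set autoboot=false
set ip-type=exclusive
add net
set physical=$nic
end
add device
set match=$clone
end
EOF
}

# Usage on each node, for a zone that does not exist yet:
#   dhcp_zone_cmds /zns/rootfs nxge2 > /tmp/zns.cfg
#   zonecfg -z zns -f /tmp/zns.cfg
```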
Once the zone is installed and booted for the first time, the initial configuration is done (TZ, hostname, …):
root@nodeX:/# zoneadm -z zns install
root@nodeX:/# zoneadm -z zns boot
root@nodeX:/# zlogin -C zns
The zone is then halted:
root@nodeX:/# zoneadm -z zns halt
The network configuration of the zone needs to be completed with IPMP. The hostname.<interface> file (here hostname.nxge2) can be edited while the zone is halted:
root@node1:/# cat /zns/rootfs/root/etc/hostname.nxge2
sc-zns.mc-thias.org group zns-group up -failover
Of course, the hosts and netmasks files must also be updated for consistency.
3.2/ Configuring the cluster resources
This being done, a SunCluster Zone Boot resource (sczbt) is created for the zone as follows:
root@node1:/# cat /zns/pfiles/sczbt_config.zns
RS=sc-infra-znsSCZBT
RG=sc-infra-znsRG
PARAMETERDIR=/zns/pfiles
SC_NETWORK=false
SC_LH=
#SC_LH=sc-infra-znsLH
FAILOVER=true
HAS_RS=sc-infra-znsHAS

Zonename="zns"
Zonebrand="native"
Zonebootopt=""
Milestone="multi-user-server"
Mounts=""
Note that there is no LogicalHostname resource defined in the Zone Boot resource dependencies. As the resource runs in failover mode, the root filesystem is on a ZFS SunCluster HAStorage resource.
This being done, the Zone Boot resource can be registered:
root@node1:/# /opt/SUNWsczone/sczbt/sczbt_register \
-f /zns/pfiles/sczbt_config.zns
The sc-zns.mc-thias.org zone can then be booted using the Solaris Containers SunCluster resource.
root@node1:/# clresourcegroup online -M sc-infra-znsRG
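To see which node is actually hosting the zone after bringing the resource group online, or after a failover, the parsable output of zoneadm can be filtered. This is a minimal sketch: zone_state is a hypothetical helper name, and it relies only on the first colon-separated fields of `zoneadm list -cp` (zone ID, zone name, state, zone path).

```shell
# Sketch: read `zoneadm list -cp` output on stdin and print the state of
# one zone. Fields are colon-separated: zoneid:zonename:state:zonepath:...
# Assumption: AWK points to a POSIX awk (e.g. AWK=nawk on Solaris 10).
zone_state() {
    "${AWK:-awk}" -F: -v zone="$1" '$2 == zone { print $3 }'
}

# Usage on each node:
#   zoneadm list -cp | zone_state zns
#   clresourcegroup status sc-infra-znsRG
```

On the node hosting the zone the state should be running, while the other node should report the zone as installed.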
4/ Conclusion
Running some tests, such as an “immediate shutdown” of one of the two nodes, has shown that the failover works pretty well.
Note that this works because the two blades have the same hardware and, as a consequence, the exact same type of network adapters (Sun Neptune NIU devices).
Well, I might have missed some points, so… don’t hesitate to leave your comments and valuable input... :)
