Solaris NIS server crash resulted in MCSG node reboot
08-30-2004 09:50 PM
Environment is:
XP512 as Backend storage array
HP-UX 11i two-node cluster acting as NFS gateways, connected via Brocade switches
Solaris 8 as NIS master server
Background and Issue
--------------------
1. The Solaris NIS server went down at 5:00 AM.
2. This produced the following messages in /var/adm/syslog/syslog.log on the node where the packages were running:
Aug 31 07:39:10 cmgtpn1 syslog: svc_getreqset: No transport handle for fd 257
Aug 31 07:39:10 cmgtpn1 syslog: svc_getreqset: No transport handle for fd 258
-
-
-
Aug 31 09:08:41 cmgtpn1 EMS [1923]: ------ EMS Event Notification ------ Value: "error" for Resource: "/cluster/package/package_status/itacpkg1" (Threshold: != " 1")
3. At 9:11 AM, node cmgtpn1 rebooted, with the following messages in /var/adm/syslog/syslog.log:
Aug 31 09:11:15 cmgtpn1 cmsrvassistd[14799]: Unable to communicate with ServiceGuard main daemon (cmcld): Network is unreachable
Aug 31 09:12:17 cmgtpn1 cmclconfd[1805]: The ServiceGuard daemon, /usr/lbin/cmcld[1806], died upon receiving signal number 6.
QUESTION: Why should the node be rebooted when the NIS server has gone down?
4. All the packages failed over to the second node at 9:14 AM.
5. The same messages then started to appear on the second node (cmgtpn2), as the NIS server was still down:
Aug 31 09:21:29 cmgtpn2 syslog: svc_getreqset: No transport handle for fd 257
Aug 31 09:21:29 cmgtpn2 syslog: svc_getreqset: No transport handle for fd 258
Aug 31 09:21:29 cmgtpn2 syslog: svc_getreqset: No transport handle for fd 259
Aug 31 09:25:00 cmgtpn2 syslog: svc_getreqset: No transport handle for fd 260
-
-
-
Aug 31 09:37:17 cmgtpn2 syslog: svc_getreqset: No transport handle for fd 261
Aug 31 09:37:17 cmgtpn2 syslog: svc_getreqset: No transport handle for fd 262
Aug 31 09:37:17 cmgtpn2 syslog: svc_getreqset: No transport handle for fd 259
Aug 31 09:40:48 cmgtpn2 syslog: svc_getreqset: No transport handle for fd 257
Aug 31 09:51:43 cmgtpn2 cmsrvassistd[1842]: Lost connection with ServiceGuard cluster daemon (cmcld): Connection timed out
Aug 31 09:54:49 cmgtpn2 cmlvmd: Could not read messages from /usr/lbin/cmcld: Connection timed out
Aug 31 09:54:49 cmgtpn2 cmlvmd: CLVMD exiting
Aug 31 09:54:49 cmgtpn2 cmsrvassistd[149]: Unable to communicate with ServiceGuard main daemon (cmcld): Network is unreachable
Aug 31 09:55:51 cmgtpn2 cmclconfd[1800]: The ServiceGuard daemon, /usr/lbin/cmcld[1801], died upon receiving signal number 6.
6. At that point, the second node also rebooted, and the packages failed back to the first node.
QUESTION: Why should a cluster node reboot if the NIS server crashes?
Regards
Mahesh
08-30-2004 09:59 PM
Re: Solaris NIS server crash resulted in MCSG node reboot
Serviceguard (SG) needs access to specific services listed in the /etc/services file:
grep hacl /etc/services
If the nodes are configured to take services from NIS and NOT from their own local file, then they would have had communication problems when the NIS server died, which would explain the errors.
The package may also rely on the NIS server, and may have a FAIL_FAST variable set to YES, which would TOC the node if the package lost a service.
These are the clues:
cmsrvassistd[1842]: Lost connection with ServiceGuard cluster daemon (cmcld): Connection timed out
etc.
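To check the FAIL_FAST possibility, something like the following could list the relevant settings in the package configuration files. This is only a sketch: the function name show_fail_fast is made up for illustration, and it matches any uncommented line containing FAIL_FAST rather than assuming exact parameter names.

```shell
# Print every uncommented FAIL_FAST setting found in the given files.
# Usage: show_fail_fast FILE...
# (show_fail_fast is a hypothetical helper name, not a Serviceguard command.)
show_fail_fast() {
    for f in "$@"; do
        awk -v file="$f" '
            $1 ~ /^#/   { next }                        # skip comment lines
            /FAIL_FAST/ { print file ": " $1 " " $2 }   # parameter name and value
        ' "$f"
    done
}
```

It would be called with the package configuration files as arguments, for example the files for itacpkg1 under /etc/cmcluster (that location is an assumption about this cluster's layout).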
08-30-2004 10:04 PM
The cluster does not take any of the services defined in /etc/services from NIS.
It is configured as an NIS client that takes the user/group and host databases from the NIS server.
Regards
Mahesh
08-30-2004 10:07 PM
Can you post the contents of the file /etc/nsswitch.conf from both nodes?
HTH
Duncan
I am an HPE Employee

08-30-2004 10:08 PM
According to the syslog entries, you certainly lost some network connectivity.
08-30-2004 10:38 PM
The entries in /etc/nsswitch.conf on both nodes are as follows:
#
# /etc/nsswitch.nis:
#
# @(#)B.11.11_LR
#
# An example file that could be copied over to /etc/nsswitch.conf; it
# uses NIS (YP) in conjunction with files.
#
passwd: files nis
group: files nis
hosts: files [NOTFOUND=continue] nis [NOTFOUND=continue] dns
#hosts: files
#hosts: dns [NOTFOUND=continue] nis [NOTFOUND=continue] files
networks: nis [NOTFOUND=return] files
protocols: nis [NOTFOUND=return] files
rpc: nis [NOTFOUND=return] files
publickey: nis [NOTFOUND=return] files
netgroup: nis [NOTFOUND=return] files
automount: files nis
aliases: files nis
services: files nis
One critical piece of information: even after the NIS server came back, neither node was able to resolve host names. They started resolving host names again only after the NIS clients were stopped and restarted.
There is also a core file in the /var/adm/cmcluster directory. Would sending that help in any way?
Thanks and Regards,
Mahesh
08-30-2004 11:11 PM
You need to investigate that and sort it out.
The core file is of no use in this respect.
08-31-2004 12:29 AM
I suspect that one problem may be with the other lines in your nsswitch file that have nis before files, such as:
networks nis files
protocols nis files
Change these entries around to put files first, then put a secondary NIS server in place that isn't reliant on the primary NIS server.
Maybe you could create an NIS server package in MC/SG? I am not sure how feasible that is, but it's an idea.
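A quick way to spot the affected entries, as a rough sketch: the shell function below (nis_before_files is a hypothetical name) prints each database in an nsswitch file where nis is consulted before files, or where files is missing altogether. It only looks at the order of the source keywords and ignores criteria such as [NOTFOUND=return].

```shell
# List nsswitch databases whose lookup order consults NIS before local files.
# Usage: nis_before_files [FILE]   (FILE defaults to /etc/nsswitch.conf)
# (nis_before_files is a hypothetical helper name.)
nis_before_files() {
    awk '
        $1 ~ /^#/  { next }     # ignore comment lines
        $1 !~ /:$/ { next }     # ignore blank lines and anything not a "database:" entry
        {
            nis = 0; files = 0
            for (i = 2; i <= NF; i++) {
                if ($i == "nis"   && nis == 0)   nis = i
                if ($i == "files" && files == 0) files = i
            }
            # Flag the entry when nis is listed and files is absent or comes later.
            if (nis > 0 && (files == 0 || nis < files)) print $1
        }
    ' "${1:-/etc/nsswitch.conf}"
}
```

Run against the file posted earlier in this thread, it should flag networks, protocols, rpc, publickey and netgroup.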
08-31-2004 12:33 AM
Solution
I also notice that you have
rpc: nis files
in your nsswitch file. RPC is a fundamental requirement of NFS, yet the NIS server is a single point of failure (SPOF) for it.
Change this entry around as well.
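Before relying on files for rpc lookups, it may be worth confirming that the local rpc file actually contains the programs NFS depends on. A minimal sketch, assuming the usual rpc file layout of name, number, then aliases; missing_rpc_names is a hypothetical helper, and the program names passed to it are whatever your setup is expected to need.

```shell
# Report which RPC program names have no entry in a local rpc database file.
# Usage: missing_rpc_names FILE NAME...
# (missing_rpc_names is a hypothetical helper name.)
missing_rpc_names() {
    rpcfile=$1
    shift
    for name in "$@"; do
        # A name matches if it is the first field or an alias after the number.
        awk -v want="$name" '
            $1 ~ /^#/ { next }                       # skip comment lines
            {
                for (i = 1; i <= NF; i++) {
                    if ($i ~ /^#/) break             # stop at a trailing comment
                    if (i != 2 && $i == want) found = 1
                }
            }
            END { exit(found ? 0 : 1) }
        ' "$rpcfile" || echo "$name"
    done
}
```

For example, missing_rpc_names /etc/rpc nfs mountd nlockmgr status would print any of those names that lack a local entry; the names listed are typical NFS-related ones and are an assumption about this setup.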
08-31-2004 01:27 AM
For everything, it should be files first.
# cat nsswitch.conf
#
# /etc/nsswitch.files:
#
# @(#)B.11.11_LR
#
# An example file that could be copied over to /etc/nsswitch.conf; it
# does not use any name services.
#
passwd: files
group: files
hosts: files [NOTFOUND=CONTINUE] dns
services: files
networks: files
protocols: files
rpc: files
publickey: files
netgroup: files
automount: files
aliases: files
Rgds...Geoff
08-31-2004 04:15 AM
Most probably the problem lies in the nsswitch.conf file alone.
I have corrected the entries. However, to test the change I need to break the connection to the NIS server.
My production environment does not allow that at the moment, so I will add the test to my TODO list for the next shutdown.
Best Regards
Mahesh