05-27-2016 07:56 AM
HPUX Health Checks for script
Hi,
I know this topic has come up before, and that you can use HP SIM or HP's health check tools.
With this in mind, I'm working on a tool that checks the output of our script, which gathers information from HP-UX servers, for hardware/software configuration errors.
I already have these mainly hardware-related checks/tests:
#Put the title in Errors.html
#Devices in No hardware state in ioscan
#Bad label lvlnboot
#unavailable VG/PV, stale PEs
#check failed disks in sas raid controller
#Check smartarray raid controller
#check for failed components in partstatus(cpu/mem/fans)
#CSTM memory.
#EMS LOGS Check for Critical/Warning
#Read/write errors CSTM disk.
#shutdown.log check for recent panics/machine checks/hpmcs/etc
# presence of files in /var/tombstones
#LAN:netfmt lost link errors
#LAN:Network devices with failed ports in configuration
#fcmsutil output check for topology, link speed, driver state and probably some link statistics (loss of signal, etc.)
#"olrad -q" output check for slot anomalies
#HDW:cprop checking for failing component status
#check errors in check_patch
#check filesets not in configured state
#LOG:/var/adm/syslog/syslog.log checking for different errors in vmunix
Can you please help me with other hardware-related tests/checks?
Configuration health checks would come next, so any that you can think of are welcome too.
Thanks for your help.
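As an illustration of the first check in the list above (devices in NO_HW state), here is a minimal sketch that scans saved `ioscan` output. The function name `report_no_hw` is hypothetical, and the sketch assumes the collected text contains the literal state string `NO_HW` on each affected device line; it is not taken from the tool described in this post.

```shell
# Sketch: report devices that ioscan lists in the NO_HW software state.
# report_no_hw is a hypothetical helper; it reads ioscan-style text on
# stdin so it can be run against output saved by the collector script.
report_no_hw() {
    # Keep every line that mentions NO_HW; matching the string anywhere
    # on the line avoids depending on exact column positions.
    hits=$(grep NO_HW)
    if [ -n "$hits" ]; then
        echo "ERROR: devices in NO_HW state:"
        echo "$hits"
        return 1
    fi
    return 0
}
```

On a live HP-UX system this could be fed with `ioscan -fn | report_no_hw`; against collected data, redirect the saved file into the function instead.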
05-29-2016 06:37 PM - edited 05-29-2016 06:43 PM
Re: HPUX Health Checks for script
There are several sanity checks you should do for any server, especially to make sure it will reboot the next time:
- From setboot, verify that the primary and alternate paths are valid.
- Check the LIF area on boot disks
- Is /stand almost full (i.e., less than 20 MB left)?
- Does /stand have the current and previous vmunix kernels present and more than zero bytes?
- Are the vmunix files type s800 or ELF-64?
- Check for the ioconfig file in /stand and /etc
- Check that rootconf file is valid:
  ROOTCONF=/stand/rootconf
  MAGIC="$(xd $ROOTCONF | head -1 | awk '{print $2 $3}')"
  [[ $(echo "$MAGIC" | grep -c deadbeef) -ne 1 ]] && ErrMsg "$ROOTCONF magic number wrong, should be deadbeef (hex)" "rootconf = $MAGIC (hex)"
- Check that /stand/bootconf lists both the primary and alternate boot paths and that both are valid.
- Check that dead gateway detection is disabled:
  CHECKDEADGW="ndd -get /dev/tcp ip_ire_gw_probe"
  [[ $(eval "$CHECKDEADGW") -ne 0 ]] && echo "DEAD GATEWAY detection is enabled\n $CHECKDEADGW = 1"
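Two of the /stand items in the list above (free space and non-empty kernel files) could be sketched as follows. The function names are hypothetical, the 20 MB default comes from the list item, and the sketch assumes `df` accepts `-P` and `-k` to produce POSIX single-line output with the available space in column 4. Treat it as a starting point rather than a finished check.

```shell
# Sketch of two /stand checks. The directory, file and threshold are
# parameters so the functions can be tried outside /stand first.

# Print the free space in KB for the filesystem holding $1.
# "df -Pk" asks for the POSIX output format: one line per filesystem,
# with the available 1024-byte blocks in column 4.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Warn when fewer than $2 KB (default 20480 KB = 20 MB) are free.
check_free_space() {
    dir=$1
    min_kb=${2:-20480}
    avail=$(free_kb "$dir")
    # Refuse to compare if df gave nothing usable.
    case $avail in
        ''|*[!0-9]*)
            echo "WARNING: could not determine free space for $dir"
            return 1
            ;;
    esac
    if [ "$avail" -lt "$min_kb" ]; then
        echo "WARNING: $dir has only $avail KB free (minimum $min_kb KB)"
        return 1
    fi
    return 0
}

# Warn when a kernel file is missing or zero bytes long.
check_kernel_file() {
    if [ ! -s "$1" ]; then
        echo "WARNING: $1 is missing or empty"
        return 1
    fi
    return 0
}
```

Typical use would be `check_free_space /stand` followed by `check_kernel_file /stand/vmunix`, repeating the second call for the previous kernel wherever your HP-UX release keeps it.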
For a very complete acceptance test script, see Dusan Baljevic's excellent script at:
http://www.circlingcycle.com.au/Unix-sources/HP-UX-check-OAT.pl.txt
It's written in Perl, with good coding structure and a few comments.
Bill Hassell, sysadmin