SM_3
Super Advisor

daily/weekly checks for general maintenance?

Hello

I know this is an open-ended question...

What sort of daily/weekly checks do you do on the HP boxes for general maintenance?

Pete Randall
Outstanding Contributor


I do the following daily:

Check disk space
Review syslog, diagnostic logs, bad logins
Check for hung users and force them off
Visually check the status of all peripheral devices
Check previous night's backups

Throughout the day I keep an eye on Glance.


Pete

Paul_481
Respected Contributor


Hi SM,

We are required by management to produce a CPU, memory, and disk utilization report every week. We run the following script every minute, average the results, and submit a weekly report in an Excel sheet.


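# CPU utilization: one 1-second sample
sar -u 1 > /tmp/MONITOR/cpu/$(date +'%m%d%Y_%H%M')cpu.log

# Memory: a single top display, then exit
top -d 1 > /tmp/MONITOR/memory/$(date +'%m%d%Y_%H%M')mem.log

# Filesystem usage
bdf > /tmp/MONITOR/os_data/$(date +'%m%d%Y_%H%M')OSdat.log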

Regards,
Paul
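The averaging step Paul mentions could be sketched roughly as below. This is only an illustration, not his actual script: it assumes each log holds the usual `sar -u` data line (time stamp, then %usr, %sys, %wio, %idle), and `avg_cpu_busy` is a made-up helper name.

```shell
# Sketch: average CPU busy percentage across many one-sample "sar -u" logs.
# Assumes data lines look like "HH:MM:SS %usr %sys %wio %idle",
# so busy = 100 - last column. avg_cpu_busy is a hypothetical helper name.
avg_cpu_busy() {
    awk '
        # keep only data lines: time stamp first, numeric idle value last
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $NF ~ /^[0-9.]+$/ {
            busy += 100 - $NF
            n++
        }
        END { if (n) printf "%.1f\n", busy / n }
    '
}

# Example (not run here):
#   cat /tmp/MONITOR/cpu/*cpu.log | avg_cpu_busy
```

Header and banner lines are skipped because their last field is not numeric, so the logs can simply be concatenated.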


Mahesh Kumar Malik
Honored Contributor


Hi SM

Check the following on a regular basis:

/var/adm/btmp
/var/adm/sulog
/var/adm/wtmp
/var/adm/cron/log
/var/adm/lp/log
/var/adm/syslog/mail.log
/var/adm/syslog/syslog.log
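One way to keep an eye on the files listed above is to flag any that have grown past a size limit. The sketch below is an assumption-laden example: `report_large_logs` and the 10 MB limit are made up for illustration.

```shell
# Sketch: print "path size" for every existing file larger than a byte limit.
# report_large_logs is a hypothetical helper; the limit is the caller's choice.
report_large_logs() {
    limit=$1
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue            # skip logs that do not exist here
        size=$(( $(wc -c < "$f") ))        # arithmetic strips any padding from wc
        if [ "$size" -gt "$limit" ]; then
            echo "$f $size"
        fi
    done
}

# Example (not run here), flagging anything over roughly 10 MB:
#   report_large_logs 10485760 /var/adm/btmp /var/adm/sulog /var/adm/wtmp \
#       /var/adm/cron/log /var/adm/lp/log \
#       /var/adm/syslog/mail.log /var/adm/syslog/syslog.log
```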

It is also worth buying a tool such as GlancePlus or the GlancePlus Pak to help present system resource utilization in a proper format.

Regards
Mahesh
Rita C Workman
Honored Contributor


We automate everything we possibly can. For example, if a filesystem hits a certain % full, we get an email/page to address it. If a process is taking too much of the system's resources, we get an email/page, and so forth. It doesn't cover everything that can happen, but it sure helps catch a lot.
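A filesystem-full alert of the kind described here might look something like the sketch below. It is not Rita's script: `fs_over_threshold` is a hypothetical helper, the 90% limit is arbitrary, and it assumes the standard `bdf` column layout (with %used second to last and the mount point last).

```shell
# Sketch: list mount points whose %used is at or above a threshold.
# Reads bdf-style output on stdin; fs_over_threshold is a hypothetical name.
fs_over_threshold() {
    awk -v limit="$1" '
        NR == 1 { next }     # skip the header line
        NF == 1 { next }     # long device name alone: its numbers are on the next line
        {
            pct = $(NF - 1)
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $NF, pct "%"
        }
    '
}

# Example (not run here), mailing root only when something is over 90%:
#   full=$(bdf | fs_over_threshold 90)
#   [ -n "$full" ] && echo "$full" | mailx -s "Filesystems over 90%" root
```

Run from cron, this sends mail only when there is something to report, which keeps the alerts meaningful.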

We check the backups daily. We post a lot of things, like the previous night's backup results, on a webpage so we can check it all from one source. We also have the activity of the boxes on this webpage (thank you, Patrick, for the great tool!) so we can see whether there were any spikes in CPU, memory, or disk, then check what caused them and follow up.

We try to get the syslogs cleaned up weekly (and never go more than 2 weeks). The big things would already have been addressed when we got the alerts from those monitoring scripts/tools.
We make make_recovery tapes monthly for all systems, and whenever we make changes we make new tape(s) right away. We make 2 tapes for each box: one stays here, and one goes to the DR site for offsite storage. To me, Ignite tapes are both DR and maintenance.

Rgrds,
Rita
DCE
Honored Contributor


Things I check daily

did backups complete successfully? if not, why?
root e-mail (several HP monitoring programs e-mail root with alerts)

review syslog.log

review mail.log

check file system sizes

check glance

remove all files older than 7 days from /tmp and /var/tmp

remove all core files older than 7 days, assuming you no longer need them
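The two cleanup items at the end of the daily list can be scripted with `find`. The following is a minimal sketch rather than Dave's own method; `purge_old_files` is a hypothetical helper, and it deletes without asking, so it is worth listing matches with `-print` first.

```shell
# Sketch: delete regular files older than N days under a directory.
# purge_old_files is a hypothetical helper; -xdev keeps find on one filesystem.
purge_old_files() {
    dir=$1
    days=$2
    # -mtime +N matches files last modified more than N days ago
    find "$dir" -xdev -type f -mtime +"$days" -exec rm -f {} \;
}

# Example (not run here):
#   purge_old_files /tmp 7
#   purge_old_files /var/tmp 7
# Listing old core files before deciding whether to remove them:
#   find / -type f -name core -mtime +7 -print
```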

Things I check weekly
review system performance (I use MeasureWare/PerfView for long-term system analysis)

trim log files - several of the system log files will grow indefinitely if left alone

make an Ignite backup of vg00
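For the log trimming item in the weekly list, one common approach is to keep only the last N lines of a text log and write them back into the same file, so a daemon holding the file open keeps logging to it. This is a sketch under that assumption: `trim_log` is a hypothetical helper, it is meant for text logs only (not binary files such as wtmp or btmp), and lines written between the `tail` and the copy-back would be lost.

```shell
# Sketch: keep only the last N lines of a text log, reusing the same file.
# trim_log is a hypothetical helper; not suitable for binary logs.
trim_log() {
    log=$1
    keep=$2
    tmp="${log}.trim.$$"
    # copy the tail aside, then overwrite the original in place
    tail -n "$keep" "$log" > "$tmp" && cat "$tmp" > "$log"
    rm -f "$tmp"
}

# Example (not run here):
#   trim_log /var/adm/cron/log 5000
```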

Dave
A. Clay Stephenson
Acclaimed Contributor


The best approach to this is to monitor all of your systems continuously.

You should really look into a product like HP OpenView Operations (formerly known as VantagePoint/Operations, formerly known as IT/Operations, formerly known as ...). I would love to meet those people who spend their days renaming products in lieu of useful work. In any event, the idea is that all of your systems continuously monitor themselves for resources, network problems, hardware failures, etc. and then report to a central location. If a message goes unacknowledged for a given period, the event is escalated and emails/pages are sent to the appropriate parties. You can monitor many kinds of devices and platforms with OV/O, and they all report to one central server.

I warn you, a product like this takes a good deal of effort to set up initially, and you will be writing templates for months, but after that everything pretty much takes care of itself. For example, suppose a filesystem is approaching a threshold. OV/O could simply notify you, or it could fire off a cleanup script, or it could even expand the LVOL/filesystem for you. You are really only limited by your own abilities.
If it ain't broke, I can fix that.