Troubleshooting

 
SOLVED
fg_1
Trusted Contributor

Troubleshooting

Hello all

I am trying to put together a best practices document for troubleshooting so that I can pass this on to my jr admins.

I am thinking of including such things as:

Printer troubleshooting
Disk/lvm Troubleshooting

etc....

This is a broad topic, so if anyone has put together any type of documentation for this please let me know. All ideas are welcomed.

When I am finished with the document I will post it out here for all who want to use it.

Thanks and have a happy holiday all of you.
9 REPLIES
Rita C Workman
Honored Contributor
Solution

Re: Troubleshooting

I know this is probably NOT what you're looking for...but I am in the process of training someone myself. We have documentation on how to:
Create a volume group
Add disk to VG in a MC/SG environment
How to add/modify printer; how to cancel print job...
..and so on..and so on..
BUT for troubleshooting;...I do this:
1. Any new issue that was resolved gets typed up and added to the 'special file'..doesn't matter who figured out the fix..me..him..or HP Tech Support. It's sort of like a diary really..
2. If it's not in the Documentation Book and it's not already in the 'special file', then look it up on the ITRC Forums.
Check manpages; HP Manuals, etc.
3. If it's not resolved using step 1 or 2, then call HP Tech Support; and proceed with step 1 (document in special file)

Special file is just a file in my home directory..that I give all UNIX Admin folks access to for those FYI...tips & tricks and "say what"..
No real format...just info to search on. Like I said more like a diary of notes.
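
A minimal sketch of how such a notes file could be kept and searched from the shell. The file path and the function names add_note and find_note are assumptions for illustration, not the actual file described above:

```shell
#!/bin/sh
# Hypothetical shared notes file; the path and function names are assumptions.
NOTES_FILE="${NOTES_FILE:-$HOME/special_file.txt}"

# Append a dated entry: who fixed it, the symptom, and the fix.
add_note() {
    # $1 = who fixed it, $2 = symptom or error text, $3 = the fix
    {
        echo "=== $(date '+%Y-%m-%d') | fixed by: $1"
        echo "SYMPTOM: $2"
        echo "FIX: $3"
        echo
    } >>"$NOTES_FILE"
}

# Case-insensitive search; prints every line that mentions the keyword.
find_note() {
    grep -i "$1" "$NOTES_FILE"
}

# Example use:
#   add_note "HP Tech Support" "print queue stuck" "restarted the scheduler with lpshut and lpsched"
#   find_note "print queue"
```

Keeping the symptom text close to the real error message is what makes the "I remember seeing something like that" search work later.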

Rgrds,
Rita

..by the way..the special file gets read more than anything around some days for those "I remember seeing something like that error about 6 months ago..what was the fix...oh yeah...."

Steven Sim Kok Leong
Honored Contributor

Re: Troubleshooting

Hi,

Have you tried this link yet?

http://us-support3.external.hp.com/cki/bin/doc.pl/sid=5074748e13c79bb361/screen=ckiHome

This is the technical knowledge base which contains both software and hardware KBs of common system problems, issues and solutions.

Hope this helps. Regards.

Steven Sim Kok Leong
Brainbench MVP for Unix Admin
http://www.brainbench.com
Tom Dawson
Regular Advisor

Re: Troubleshooting

I do something similar to what Rita was explaining. But I carry it a step further. I write up a white-paper about any major "fix-it" incident we have. I then "html" the document and put it on our Intranet server.

I also run shell scripts that document how each server is configured and also place those on our Intranet server.

The real benefit lies in that our Intranet server is located at a different physical location than the HP-UX servers. So in the event of a major disaster, I have one more remote ( and survivable ) source of documentation.
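
As a rough illustration of the configuration scripts mentioned above (not the actual ones), a snapshot script might look like the sketch below. The function name snapshot_config and the command list are assumptions. On HP-UX one would likely add commands such as bdf, vgdisplay -v, ioscan -fn and swlist to the list.

```shell
#!/bin/sh
# Sketch: capture a server's configuration as a single HTML page.
# The command list and the output location are assumptions; adjust per site.

snapshot_config() {
    # $1 = output file
    out="$1"
    {
        echo "<html><body>"
        echo "<h1>Configuration of $(uname -n) on $(date)</h1>"
        for cmd in "uname -a" "df -k"; do
            echo "<h2>$cmd</h2><pre>"
            # Escape &, < and > so command output cannot break the page.
            $cmd 2>&1 | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
            echo "</pre>"
        done
        echo "</body></html>"
    } >"$out"
}

# Example use:
#   snapshot_config /tmp/config_$(uname -n).html
```

The resulting file can then be copied to the remote Intranet server by whatever transfer method the site already uses.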
Mark Greene_1
Honored Contributor

Re: Troubleshooting

Here is a script I run from cron daily at 8:00 am to look for core files and send e-mail if it finds any. The usual procedure for us is to then run "strings" on the core file to see if it was the result of an application or OS fubar, and then escalate to the appropriate phone support.

# pg find_core.ksh
#!/bin/ksh
# 09/05/2001 mjg
# finds core files and sends mail about it
####################################################
set -u

### remove old files ###
rm -f /tmp/findcore_*

### init variables for new files ###
REPORT=/tmp/findcore_report.$$
ERRORS=/tmp/findcore_errors.$$
FOUND=/tmp/findcore_found.$$

### redirect standard error for the duration of the script ###
exec 2>$ERRORS

### date stamp report ###
echo "Process started at `date`" >$REPORT
echo "host system: `hostname`" >>$REPORT

### find core files ###
cd /
# run find by itself so STATUS is find's exit code, not that of wc
find / -type f -name core -exec ls -l {} \; >$FOUND
STATUS=$?
cat $FOUND >>$REPORT
CNT=`wc -l <$FOUND`
# test the saved STATUS; $? here would only reflect the assignment above
if [ $STATUS -gt 0 ]; then
    echo "errors during the find: $STATUS" >>$REPORT
    echo "Count = $CNT" >>$REPORT
    exit 1
else
    echo "find exit status: $STATUS" >>$REPORT
    echo "Count = $CNT" >>$REPORT
fi
echo "\nProcess temp file listings:" >>$REPORT
ls -l /tmp/findcore_* >>$REPORT

### mail the results ###
if [ "$CNT" -gt 0 ]; then
    mailx -s "core files report" [e-mail addresses here] <$REPORT
fi
exit $?
### end of script ###

HTH
--
Mark
the future will be a lot like now, only later
harry d brown jr
Honored Contributor

Re: Troubleshooting

Frank,

I think you need to turn your jr admins on to this site, as the best place to find solutions.

live free or die
harry
Live Free or Die
Wodisch
Honored Contributor

Re: Troubleshooting

Hello Frank,

in my experience it is best to give them a rather solid foundation to build their own experiences on...
So they'll have to understand how:
- the kernel gets started
- the "init" process-tree is working
- the different flavours of login are working (serial, telnet, rlogin, ssh, XDMCP)
- devices are used in UN*X
- LVM is working
- the file-tree is organized
- to read man-pages
- to use your local documentation
- TCP/IP is working (e.g. the importance of always using the netmask-parameters/-options)
- to compare the 'is'-state with the 'should-be'-state (i.e. documented printouts/html-pages of ALL the config files)
- X-windows is working (i.e. X-resources, X-properties, and such)
- filesystems are working (CDFS, VxFS, HFS, NFS, PFS-RRIP)
- printing is done on UN*X systems (System V, BSD LPR, JetAdmin, HPNP)

and of course a lot more, depending on your local applications (Oracle-RDBMS, SAP R/3, PeopleSoft, Informix-RDBMS), and your local middleware (MC/ServiceGuard, DCE, CORBA).
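
The 'is'-state versus 'should-be'-state comparison from the list above could be sketched roughly as follows, assuming the documented copies of the config files are kept in a baseline directory. The function name check_drift and the directory layout are assumptions for illustration:

```shell
#!/bin/sh
# Sketch: compare live config files (the "is" state) with saved
# baseline copies (the "should-be" state). Paths are assumptions.

check_drift() {
    # $1 = baseline directory, remaining arguments = config files to check
    baseline="$1"; shift
    rc=0
    for f in "$@"; do
        saved="$baseline/$(basename "$f")"
        if [ ! -f "$saved" ]; then
            echo "NO BASELINE: $f"
            rc=1
        elif cmp -s "$saved" "$f"; then
            echo "OK: $f"
        else
            echo "CHANGED: $f"
            # diff exits 1 when the files differ; that is expected here.
            diff "$saved" "$f" || :
            rc=1
        fi
    done
    return $rc
}

# Example use:
#   check_drift /var/adm/baseline /etc/hosts /etc/inittab
```

A non-zero return value means at least one file has no baseline or differs from it, which makes the function easy to call from cron and mail only on drift.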

A history of the more or less recent events would be quite useful to know what used to happen on your site...

Even if this is NOT the HPADM mailing-list, I would rather appreciate a "summary" ;-)

Good luck and a happy new year,
Wodisch
someone_4
Honored Contributor

Re: Troubleshooting

Hey Rita .. Tom
how about sharing some of your docs with us?

Richard
fg_1
Trusted Contributor

Re: Troubleshooting

Hello all

Thank you for the input so far as it has been just as good as ever.

Rita/Tom, can you please share some of the documentation that you have put together with me? If you want to send it as attachments here, go ahead, or you may send it to my email: frank.grosberger@chsli.org

Thanks again all, I will summarize and post a master copy of the document when completed.

BTW, my junior admins already use this site but I want them to be independent thinkers as well.
Rita C Workman
Honored Contributor

Re: Troubleshooting

I will email some of these to you tomorrow, when I get back into the office.

One of my 'favs' is not so much a document, but a form with commands on it..
Called:
Adding disk to existing VG in a MC/SG
Another...adding disk to existing VG not in MC/SG

I find the new folks like it cause they can fill in the blanks and follow the commands...checking off as they go to avoid mistakes.
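
For readers who have not seen a form like this, here is a hypothetical sketch that prints a fill-in-the-blanks checklist for the non-MC/SG case. It is not the actual form. The function name print_vg_checklist and the choice of steps are assumptions, and the script only prints the commands without running them:

```shell
#!/bin/sh
# Sketch of a fill-in-the-blanks form: prints the commands, runs nothing.
# The step list is an assumption (standalone VG, not MC/SG).

print_vg_checklist() {
    # $1 = volume group name (e.g. vg01), $2 = disk device (e.g. c2t1d0)
    vg="$1"; disk="$2"
    cat <<EOF
Adding disk $disk to existing VG $vg (not in MC/SG)
[ ] 1. Confirm the disk is visible and unused:  ioscan -fnC disk
[ ] 2. Create the physical volume:              pvcreate /dev/rdsk/$disk
[ ] 3. Add it to the volume group:              vgextend /dev/$vg /dev/dsk/$disk
[ ] 4. Verify the new PV is listed:             vgdisplay -v /dev/$vg
EOF
}

# Example use:
#   print_vg_checklist vg01 c2t1d0 | lp
```

Printing the commands with the names already filled in keeps the check-off habit while removing the chance of a typo in the device path.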


Like I said, I'll send you a few things..if you have anything specific you're looking for...let me know.

Rita