Troubleshooting methodologies
08-23-2002 08:07 AM
While I am in my begging frame of mind, does anyone out there have any ideas on troubleshooting methodologies that any good sysadmin can follow? I am going to try to construct some troubleshooting flowcharts for my guys here, and I would be more than appreciative of any comments or suggestions the group may have.
Once I have completed these, I will share them with all of you.
Thank you for the assist.
fg
08-23-2002 08:15 AM
Solution
1. What's changed?
2. What's changed?
3. What's changed?
At least in my experience, 9 out of 10 times when something breaks, it's because you've changed something. You've applied patches, you've changed kernel parameters, you added a new device, you've changed something, something that may not seem to be even remotely related.
Pete
08-23-2002 08:15 AM
Re: Troubleshooting methodologies
1) If you have just added new hardware (RAM, SCSI / FC / network card, etc.) and the system doesn't boot or otherwise doesn't work well, remove the new hardware and see what happens.
2) If you are getting an error message, READ the error message and understand what it is telling you. Don't ASSUME anything.
3) If something happens for one person, but not another, find out what is different between the two.
4) If you are talking to a user, always suspect that you aren't getting the whole story. Get a screen shot so that you can get the EXACT error message they are seeing.
08-23-2002 08:15 AM
Re: Troubleshooting methodologies
Do you mean system problem troubleshooting? Or application? Or both?
General System Problems:
what's changed?
if startup issues - check rc.log
check syslog
if network issue, use ping, traceroute, netstat -rn, ifconfig to check network configurations.
bdf to check if any file systems are full
swapinfo to see if swap is being heavily used
Applications:
check application logs
check patch levels
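For anyone who wants to script the "are any file systems full" check from the list above, here is a rough Python sketch. It uses `shutil.disk_usage` instead of parsing `bdf` output. The names `percent_used` and `full_filesystems` and the 90% threshold are assumptions for illustration only, not anything that ships with HP-UX.

```python
import shutil
from typing import Callable, Iterable, List, Tuple


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total (0.0 when total is 0)."""
    if total <= 0:
        return 0.0
    return 100.0 * used / total


def full_filesystems(
    mount_points: Iterable[str],
    threshold: float = 90.0,  # assumed cut-off; pick what suits your site
    usage: Callable = shutil.disk_usage,
) -> List[Tuple[str, float]]:
    """List (mount point, percent used) for every mount at or above threshold."""
    flagged = []
    for mount in mount_points:
        stats = usage(mount)  # named tuple with total, used, free (in bytes)
        pct = percent_used(stats.total, stats.used)
        if pct >= threshold:
            flagged.append((mount, round(pct, 1)))
    return flagged
```

Calling `full_filesystems(["/", "/var", "/tmp"])` returns a list of (mount point, percent) pairs. The `usage` parameter is there only so the check can be exercised without touching real disks.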
Other than that, what'd'ya wanna know?
Cheers!
James
08-23-2002 08:21 AM
Re: Troubleshooting methodologies
2) Narrow the error down to a single area. When a problem comes up, first find out whether it is a hardware or a software error, then which application, which process, which resource, and concentrate on that area.
3) Listen to all log files and warnings.
4) Check what changed recently
5) Compare - compare the settings with another system or network
6) Check solutions in forums and TKB
7) Document whatever you tried before and the problem solutions.
8) Work as a team.
9) Never ignore even a small event.
10) Don't panic, and be confident in whatever you try.
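Point 5 (compare) can be sketched in Python along these lines. This is only an illustration: the function name `settings_diff` is made up, and it assumes each system's settings have already been dumped to a list of `name=value` lines by whatever means you prefer.

```python
import difflib
from typing import List


def settings_diff(working: List[str], broken: List[str]) -> List[str]:
    """Return only the lines that differ between two settings listings.

    Lines prefixed with "-" exist only on the working system,
    lines prefixed with "+" exist only on the broken one.
    """
    diff = list(
        difflib.unified_diff(
            sorted(working),
            sorted(broken),
            fromfile="working",
            tofile="broken",
            lineterm="",
            n=0,  # no context lines, differences only
        )
    )
    # The first two lines are the ---/+++ file headers; "@@" hunk
    # markers are filtered out by the prefix test.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

For example, comparing `["maxfiles=2048", "nproc=4096"]` from a working box with `["maxfiles=60", "nproc=4096"]` from a broken one (values invented) returns `["-maxfiles=2048", "+maxfiles=60"]`.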
08-23-2002 08:24 AM
Re: Troubleshooting methodologies
Personally, I've been trained in analytical troubleshooting through the Kepner-Tregoe course. I believe HP offers these courses to external customers too.
Their training manuals contain some good information on troubleshooting and how to break down problems into steps which could then be flowcharted.
regards,
Darren.
08-23-2002 08:26 AM
Re: Troubleshooting methodologies
Check all log files - syslog.log, EMS logs
If the problem comes from a user, can it be reproduced? Can you reproduce the problem by doing what the user has done?
If suspected hardware - use ioscan, stm to check for errors.
Has anything changed on the system - new apps, changes to existing apps.
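As a starting point for the log check above, a small Python sketch that pulls suspicious lines out of a log. The function name `suspicious_lines` and the keyword list are assumptions; tune the keywords to what your systems actually write.

```python
from typing import Iterable, List, Sequence


def suspicious_lines(
    lines: Iterable[str],
    keywords: Sequence[str] = ("error", "fail", "panic", "warning"),
) -> List[str]:
    """Return log lines that contain any keyword, case-insensitively."""
    lowered = [k.lower() for k in keywords]
    return [
        line.rstrip("\n")
        for line in lines
        if any(k in line.lower() for k in lowered)
    ]
```

It takes any iterable of lines, so an open file works: `suspicious_lines(open(path, errors="replace"))`, where `path` would typically be something like `/var/adm/syslog/syslog.log` on HP-UX.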
Hilary
08-23-2002 10:03 AM
Re: Troubleshooting methodologies
From a philosophical standpoint: ;)
K.I.S.S. - Keep It Simple Stupid
9 times out of 10 it's an easy, cheap (free) fix; the 10th will reveal itself soon enough. I can't count how many times someone has wanted to replace expensive "defective" hardware because a cable was knocked loose.
Don't get tunnel vision
I hate to admit it, but I recently spent hours working on a server terminal that wouldn't transmit data. I changed cables, verified software, etc., etc. I knew it couldn't be the brand-new power supply, replaced about a month ago. Guess what!
Be open to new ideas
I was helping to troubleshoot an L-Class that refused to recognize 2 of the installed DIMMs, which were listed as unconfigured in the firmware. After going over all the standard fixes to no avail, I had a thought about the power monitor board. My partner was adamant that it couldn't be the issue. After a few hours, I convinced him to try it. The DIMMs came back.
Those 3 concepts, among those listed by others, have helped me a great deal. (And hurt me when I forget and don't follow them.)
Good Luck,
Kel
08-23-2002 10:35 AM
Re: Troubleshooting methodologies
I have to admit it's been a while since I *learned* troubleshooting, so I can only tell you how I do it today (but it might not be something you can "draw" actually):
1) listen and watch - do not form an opinion, yet
2) write down what your intuition tells you (onto a sheet of paper or so)
3) check/verify those points
Even though I know that "2)" sounds like "and now something magic happens", that way it is much faster than by working through the *official* check-list(s)...
And if you replace "intuition" with "experience" then it does not even sound that weird ;-)
Take Kelli's example with the DIMMs - her experience helped her to *intuitively* find the weak point!
To get back to your flowchart: tell them that the flowchart will change with their personal experience and knowledge, so it's a starting point only (but a safe and sound one).
With (some) experience they can/will use more of their senses (I guess), not only their eyes and brain. Like ears (it's not as noisy as it should be - is a fan broken?), nose (some resistors/capacitors are getting too hot), fingers (that plug is not correctly attached) - all concurrently, hence they will be faster than those working sequentially through a list.
Why do I write this? Well, some of those being trained usually want to know how "we" do it, not how "they" should do it, so it has helped me in the past to tell them that they could do it *that* way eventually - but only a little later.
Good luck,
Wodisch (who maybe has a weird moment right now ;-)
08-23-2002 10:52 AM
Re: Troubleshooting methodologies
1. If it feels/looks like a permissions problem, it probably is.
2. Always troubleshoot in a straight line.
Usually you are troubleshooting a problem that has just manifested itself, so you know that it once worked. The first thing to look at is what changed right before it quit working. Usually a user has changed something, so look for recent file dates.
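Looking for recent file dates is easy to script. Here is a minimal Python sketch, assuming modification time is a good enough signal; the function name `recently_changed` and its parameters are made up for illustration.

```python
import os
import time
from typing import List, Optional


def recently_changed(
    root: str,
    max_age_seconds: float,
    now: Optional[float] = None,
) -> List[str]:
    """Return files under root modified within max_age_seconds, newest first."""
    now = time.time() if now is None else now
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if now - mtime <= max_age_seconds:
                hits.append((mtime, path))
    return [path for _mtime, path in sorted(hits, reverse=True)]
```

Something like `recently_changed("/etc", 24 * 3600)` lists what was touched under `/etc` in the last day. It only sees modification times, so a file that was changed and had its timestamp reset will not show up.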
Networking issues should be examined in a straight line starting from the origin and going to the destination. For example make sure the origin computer is on the LAN, then start pinging and tracerouting from there. Ping itself then something on the same Subnet/LAN, then the default router, and continue until you find where it is broken.
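The straight-line network check can be sketched in Python as below: probe each hop in order and stop at the first one that does not answer. The names `ping_once` and `first_broken_hop` are hypothetical, and the `ping` flags are an assumption (Linux/BSD style); other systems, HP-UX included, spell the count option differently.

```python
import subprocess
from typing import Callable, Optional, Sequence


def ping_once(host: str) -> bool:
    """Return True if a single ping gets a reply.

    Assumption: Linux/BSD-style "ping -c 1 host". Adjust the command
    for platforms whose ping takes the count differently.
    """
    result = subprocess.run(
        ["ping", "-c", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_broken_hop(
    hops: Sequence[str],
    probe: Callable[[str], bool] = ping_once,
) -> Optional[str]:
    """Probe hops from origin to destination; return the first that fails.

    Returns None when every hop answers.
    """
    for hop in hops:
        if not probe(hop):
            return hop
    return None
```

List the hops in the order described above: the machine itself, a neighbour on the same subnet, the default router, then onward to the destination, for example `first_broken_hop(["127.0.0.1", "192.0.2.10", "192.0.2.1", "198.51.100.5"])` (documentation addresses, not real ones). Whatever it returns is where to start digging.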
Straight-line thinking works for most problems. Whether it is a computer or the plumbing in your house, there is a straight line between the cause and the effect.
08-23-2002 05:31 PM
Re: Troubleshooting methodologies
My first step if I haven't seen this before and haven't changed anything recently is to go to www.google.com and put in the error message surrounded by quotes and see what pops up. Most of the time the problem will have been reported before and the solution may be right there. If it doesn't show up on the websearch, click on Groups and let it check the newsgroups.
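If you find yourself doing the quoted search often, it can be wrapped in a few lines of Python. This is just a convenience sketch; the function name `exact_phrase_search_url` is made up.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(error_message: str) -> str:
    """Build a Google search URL that looks for the message as an exact phrase."""
    # Surrounding the message with double quotes asks for an exact match.
    phrase = '"' + error_message.strip() + '"'
    return "https://www.google.com/search?q=" + quote_plus(phrase)
```

Paste the error text in exactly as the system printed it; stripping out host names or process IDs first usually gives more hits.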
Next step is to post a question on the Forum. Give some big hat a chance to earn a few more points! ;-)
Ron
08-23-2002 05:48 PM
Re: Troubleshooting methodologies
Frank,
You can teach some, and I stress "some", problem solving skills, but you can't teach common sense.
(1) Just because the manual/documentation says/shows something one way does not mean that it has to be exactly that way or the highway.
(2) shit happens
(3) man makes computers (see #2)
(4) some users, and even some people in IT, are "dumber than an empty box of rocks"
Flowchart? Gantt chart? Chartered buses, funky stupid boxes with funky colors. Hey, whose turn is it to draw rectangles and color them mauve today???
Seriously, I was once asked to help map out how to make a system a bastion host. They wanted to "chart" it. I looked at them and said "Are you retarded?"
These idiots that think they can process flow control everything are insane, and unfortunately for us in IT, they are the ones running the nut house.
You can have steps to follow, but one needs common sense, the desire to learn, and basic comprehension to be able to problem solve.
You could "process flow control it", but it would take more energy to track the process than to perform the work that the process flow control is supposed to be guiding.
live free or die
harry
08-24-2002 04:47 AM
Re: Troubleshooting methodologies
try to be S.M.A.R.T.
Start with drawing a list of common problems you encounter. From there, you can start the good work.
All the best.
Best Regards
Yogeeraj
08-24-2002 06:00 AM
Re: Troubleshooting methodologies
2) When you learn something, put it in your own "troubleshooting guide", in case it happens again... in a few months?
Then some guys don't like to put their stuff in a document...
I like going on holidays ;-)
Generally speaking, when a problem arises I try the (relevant) log first. The problem will never be reported the same way by two different people.
my 2 cents.
Jean-Luc