Operating System - Tru64 Unix

Tru64 5.1b error

 
Adam Strobel
Frequent Advisor

Tru64 5.1b error

Does anyone know what the message below means?

It is from dia -R | more

My system is running very, very slowly.

Thanks!!

**** V3.4 ********************* ENTRY 1 ********************************


Logging OS 2. Digital UNIX
System Architecture 2. Alpha
Event sequence number 135.
Timestamp of occurrence 26-OCT-2005 08:22:53
Host name AVENGER

System type register x00000022 Systype 34. (Regatta Family)
Number of CPUs (mpnum) x00000004
CPU logging event (mperr) x00000001

Event validity 1. O/S claims event is valid
Event severity 5. Low Priority
Entry type 310. Time Stamp
-1. - (minor class)
Michael Schulte zur Sur
Honored Contributor
Solution

Re: Tru64 5.1b error

Hi,

This is just a timestamp entry, which is logged roughly every 10 minutes.
Have a look at the system with top and monitor.

greetings,

Michael
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

Thanks. Here are the current top details. Does anything here suggest what would slow down the system?

load averages: 1.81, 1.75, 1.68 08:42:59
105 processes: 3 running, 8 waiting, 28 sleeping, 66 idle
CPU states: % user, % nice, % system, % idle
Memory: Real: 2600M/4470M act/tot Virtual: 3M/5201M use/tot Free: 511M

PID   USERNAME PRI NICE  SIZE   RES STATE  TIME    CPU COMMAND
26408 oracle    51    0 2508M 1255M run    9:32 92.70% oracle
26683 oracle    42    0 2533M  293M WAIT   0:50 58.60% oracle
26570 oracle    42    0 2509M  271M WAIT   0:31  3.10% oracle
26509 oracle    42    0 2556M  172M WAIT   0:23  2.20% oracle
1270  oracle    44    0 2514M  575M sleep  9:03  1.50% oracle
26651 oracle    42    0 2556M  204M WAIT   0:33  1.20% oracle
26569 oracle    42    0 2509M  232M WAIT   0:28  1.00% oracle
25704 oracle    42    0 2510M 1152M WAIT   1:42  0.90% oracle
26571 oracle    42    0 2510M  226M WAIT   0:22  0.60% oracle
1268  oracle    44    0 2511M  580M sleep  3:08  0.20% oracle
1266  oracle    44    0 2511M  582M sleep  3:07  0.20% oracle
1262  oracle    44    0 2514M  584M run    3:07  0.10% oracle
1264  oracle    44    0 2511M  580M sleep  3:06  0.10% oracle
860   root      44    0 6808K 2195K sleep  1:19  0.10% envmond
26749 root      44    0 4024K  491K run    0:00  0.00% t
Ivan Ferreira
Honored Contributor

Re: Tru64 5.1b error

Please post the output of vmstat 5 10 and swapon -s. The oracle process is taking most of the resources; I want to see the general status of the system from the vmstat output.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

Attached are the vmstat 5 10 and swapon -s

Thanks!!
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

I also noticed that when I plug a console cable into the first controller and run some basic commands like "show this" or "show devices", the details come across the screen pretty quickly, but when I do the same on the second controller the screen output is very slow.

--Adam
Ivan Ferreira
Honored Contributor

Re: Tru64 5.1b error

Serial console output can be slow; this is because of the method of data transmission, and it's normal unless you change the baud rate.

From the vmstat output, it seems that the system (at the moment of the output) was pretty idle. There are no page outs, cpu idle is more than half, and free memory is enough. You also are not using swap space.

When you say the system is slow, slow in what way? The output of commands? The time it takes to complete a database query? File transfers?

Maybe you should verify the network connectivity. You should ensure that the network is running at full duplex if possible. You can use:

hwmgr get attr -cat network

To verify the speed and duplex mode.

Also, post the results of netstat -ni, to see if there are packets with problems.

PS: Don't forget to assign points to ALL the people who answer your questions.
Michael Schulte zur Sur
Honored Contributor

Re: Tru64 5.1b error

Hi,

Many Oracle processes are in the WAIT state.
This suggests to me that there is a lot of I/O going on. Have a look with monitor to see whether the I/O load is excessive.

Michael
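
For anyone following along, the WAIT observation can be checked against the top listing Adam posted earlier in the thread. Below is a minimal sketch, not a Tru64 tool: it simply tallies the STATE column of the 15 rows that were posted and adds up the CPU share of the oracle processes. The function name parse_top and the assumption that every process row has exactly 10 whitespace-separated fields are mine. Note that top showed only the first 15 of 105 processes, so the tally is partial.

```python
from collections import Counter

# The 15 process rows copied from the top listing posted in this thread.
TOP_ROWS = """\
26408 oracle 51 0 2508M 1255M run 9:32 92.70% oracle
26683 oracle 42 0 2533M 293M WAIT 0:50 58.60% oracle
26570 oracle 42 0 2509M 271M WAIT 0:31 3.10% oracle
26509 oracle 42 0 2556M 172M WAIT 0:23 2.20% oracle
1270 oracle 44 0 2514M 575M sleep 9:03 1.50% oracle
26651 oracle 42 0 2556M 204M WAIT 0:33 1.20% oracle
26569 oracle 42 0 2509M 232M WAIT 0:28 1.00% oracle
25704 oracle 42 0 2510M 1152M WAIT 1:42 0.90% oracle
26571 oracle 42 0 2510M 226M WAIT 0:22 0.60% oracle
1268 oracle 44 0 2511M 580M sleep 3:08 0.20% oracle
1266 oracle 44 0 2511M 582M sleep 3:07 0.20% oracle
1262 oracle 44 0 2514M 584M run 3:07 0.10% oracle
1264 oracle 44 0 2511M 580M sleep 3:06 0.10% oracle
860 root 44 0 6808K 2195K sleep 1:19 0.10% envmond
26749 root 44 0 4024K 491K run 0:00 0.00% t
"""


def parse_top(text):
    """Return one dict per process row (hypothetical helper, assumes 10 fields per row)."""
    procs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10 or not fields[0].isdigit():
            continue  # skip headers, summary lines and blanks
        procs.append({
            "pid": int(fields[0]),
            "user": fields[1],
            "state": fields[6].lower(),          # run / wait / sleep
            "cpu": float(fields[8].rstrip("%")),  # per-process CPU share
        })
    return procs


procs = parse_top(TOP_ROWS)
states = Counter(p["state"] for p in procs)
oracle_cpu = sum(p["cpu"] for p in procs if p["user"] == "oracle")

print(dict(states))
# The dia entry reports 4 CPUs, so the ceiling is 400%.
print(f"oracle CPU total: {oracle_cpu:.1f}% of a possible 400%")
```

In the rows shown, the processes in WAIT are all oracle, which is consistent with the sessions waiting on something (most likely I/O) rather than competing for CPU.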
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

thanks everyone for continuing to help.

Attached are the hwmgr get attr -cat network and netstat -ni
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

Attached is a snapshot of "monitor".

Ivan Ferreira
Honored Contributor

Re: Tru64 5.1b error

The network is OK, so the problem seems to be disk I/O related. If the output of monitor is reliable, you have a lot of WAIT. You can check this with vmstat -w 5 10 and look at the iowait column.

Are you using a storage array? An HSG? (You mentioned show other.)

Verify that you don't have a failed disk: run show failed.
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

Hi Ivan-

The system is running very slowly for database queries. Users are telling me that their jobs usually run in one hour and now take five.

It does seem that everyone who is complaining is referring to DB actions.
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

Attached is the vmstat -w 5 10

I'm using a Compaq StorageWorks. I also verified that I have no failed disks.

This is a very strange issue.

Thanks for all your help.
Ivan Ferreira
Honored Contributor

Re: Tru64 5.1b error

Again, verify the iowait from vmstat. If you see high iowait, check this document:

http://www.dbis.informatik.uni-goettingen.de/Teaching/oracle-doc/admin-guide/appd_tru.htm


Sections:

Tuning Asynchronous I/O

Direct I/O Support and Concurrent Direct I/O Support

Are you using a RAID device? What RAID level (RAID1 RAID5 RAID0+1)? Is everything ok at the storage?
Michael Schulte zur Sur
Honored Contributor

Re: Tru64 5.1b error

It looks like there are many I/Os occurring. Has anything changed on the database? Are there extra jobs running? I would ask the DBA to check the performance of the Oracle server and the SQL statements.

Michael
David_854
Frequent Advisor

Re: Tru64 5.1b error

Adam,
What is the complete environment, including patches?
You may want to take a look at your messages file and tell us what the errors are. Also take a look at your console output. If you do not have a console, I'd suggest running xterm -C (-C for console output), saving the output, and looking at the console messages.

David
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

Hi

Attached is my /var/adm/messages

I checked the console and I have no errors or messages.

thanks,

Adam
David_854
Frequent Advisor

Re: Tru64 5.1b error

Adam,
There are no errors in your messages file. You might want to start troubleshooting with tcpdump to check the status of your network, check the errors in binary.errlog, and take a look at all your other logs.

David
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

thanks David.

Attached are my crash-data.3.txt and also a crash-data |grep panic

Adam
Ivan Ferreira
Honored Contributor

Re: Tru64 5.1b error

It seems that the database should be tuned. It could also be helpful if you post your /etc/sysconfigtab file and the results of ipcs -a.

You can also post the results of collect -sd -i 10 -R 40s

Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

Hi Ivan

Attached are the ipcs -a, sysconfigtab and collect -sd -i 10 -R 40s which said Ouch! at the end.


thanks,

Adam
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

How can I find out which disks these are on my storage device?


# DISK Statistics
#DSK NAME   B/T/L  R/S RKB/S W/S WKB/S    AVS  AVW ACTQ  WTQ  %BSY
0    dsk0   0/0/0    0     0   0     1  18.07 0.00 0.00 0.00  0.30
1    dsk1   0/1/0    0     0   0     0   0.00 0.00 0.00 0.00  0.00
2    dsk15  1/0/3   23  2530   0     0 178.42 0.00 4.12 0.00 99.98
3    dsk16  1/0/1    0     0   0     0   0.00 0.00 0.00 0.00  0.00
4    dsk20  1/0/2    0     0   0     0   0.00 0.00 0.00 0.00  0.00
5    dsk12  1/0/11   4  1638   0     0 722.51 0.00 3.40 0.00 99.38
6    dsk22  1/0/4    0     0   0     0   0.00 0.00 0.00 0.00  0.00
7    dsk17  1/1/2    0     0   0    12   2.93 0.00 0.00 0.00  0.10
8    dsk18  1/1/3    0     0   0    12   3.91 0.00 0.00 0.00  0.10
9    dsk19  1/1/4    0     0  56  2815  15.97 0.00 0.91 0.00  6.60
10   dsk21  1/1/1    0     6   0    12   5.05 0.00 0.00 0.00  0.30
11   dsk13  1/1/10   0    25   0     0   7.93 0.00 0.01 0.00  0.60
12   cdrom0 4/0/0    0     0   0     0   0.00 0.00 0.00 0.00  0.00

#### RECORD 2 (1130437845:0) (Thu Oct 27 13:30:45 2005) ####
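
To pick the saturated disks out of a collect record like the one above, a small script can do the filtering. This is only a sketch: the data is the record pasted above, while the helper names (parse_collect, busy_disks) and the 90% busy threshold are my own assumptions, not anything collect defines. The KB-per-read figure is approximate because collect prints rounded rates.

```python
# Disk statistics record copied from the collect output posted above.
COLLECT_DISK = """\
#DSK NAME B/T/L R/S RKB/S W/S WKB/S AVS AVW ACTQ WTQ %BSY
0 dsk0 0/0/0 0 0 0 1 18.07 0.00 0.00 0.00 0.30
1 dsk1 0/1/0 0 0 0 0 0.00 0.00 0.00 0.00 0.00
2 dsk15 1/0/3 23 2530 0 0 178.42 0.00 4.12 0.00 99.98
3 dsk16 1/0/1 0 0 0 0 0.00 0.00 0.00 0.00 0.00
4 dsk20 1/0/2 0 0 0 0 0.00 0.00 0.00 0.00 0.00
5 dsk12 1/0/11 4 1638 0 0 722.51 0.00 3.40 0.00 99.38
6 dsk22 1/0/4 0 0 0 0 0.00 0.00 0.00 0.00 0.00
7 dsk17 1/1/2 0 0 0 12 2.93 0.00 0.00 0.00 0.10
8 dsk18 1/1/3 0 0 0 12 3.91 0.00 0.00 0.00 0.10
9 dsk19 1/1/4 0 0 56 2815 15.97 0.00 0.91 0.00 6.60
10 dsk21 1/1/1 0 6 0 12 5.05 0.00 0.00 0.00 0.30
11 dsk13 1/1/10 0 25 0 0 7.93 0.00 0.01 0.00 0.60
12 cdrom0 4/0/0 0 0 0 0 0.00 0.00 0.00 0.00 0.00
"""


def parse_collect(text):
    """Turn the whitespace-separated record into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].lstrip("#").split()  # DSK, NAME, B/T/L, R/S, ...
    rows = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        for key in header[3:]:             # everything after B/T/L is numeric
            row[key] = float(row[key])
        rows.append(row)
    return rows


def busy_disks(rows, threshold=90.0):
    """Disks whose %BSY is at or above the threshold (90% is an arbitrary cut-off)."""
    return [row for row in rows if row["%BSY"] >= threshold]


for disk in busy_disks(parse_collect(COLLECT_DISK)):
    # Rough average request size; guard against a zero read rate.
    kb_per_read = disk["RKB/S"] / disk["R/S"] if disk["R/S"] else 0.0
    print(f'{disk["NAME"]} (B/T/L {disk["B/T/L"]}): {disk["%BSY"]:.2f}% busy, '
          f'{disk["R/S"]:.0f} reads/s, about {kb_per_read:.0f} KB per read, '
          f'AVS {disk["AVS"]}, ACTQ {disk["ACTQ"]}')
```

On this record it singles out dsk15 and dsk12. Both show reads only (W/S is 0), large requests, and a multi-request active queue, which would fit big sequential scans by the database, but that reading is a guess from one sample.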
Ivan Ferreira
Honored Contributor

Re: Tru64 5.1b error

Normally, if you run

show storage

on the storage controller, you will see the storagesets with an ID.

When you run

hwmgr v d

in the OS, you will see the same ID. If you don't, you can match the size shown on the storage against the size shown on the server.
Adam Strobel
Frequent Advisor

Re: Tru64 5.1b error

thanks,

What I'm trying to figure out is whether I can match the "collect" data up with the drive configs shown by "df".

Attached is the output from my "collect" and "df". I would like to know more about the two disks that show 99.80 %BSY and what a0? it belongs to.