
Re: Oracle + HP Agents = Crash..

 
SOLVED
Don_89
Trusted Contributor

Oracle + HP Agents = Crash..

We have about 15 Oracle boxes running AS/ES 2.1 on Oracle 8.1.7 & 9.2.0.5 with HP-Agents version 6.40 with no problems..

About 2 months ago we installed 4 systems with AS/ES 3.0 (HP DL-380 G3's, 2GB RAM). Oracle 9.2.0.5 was then installed and patched. Everything works fine (50+ days) until I start the HP-Agents (latest ver. 7.1-xx); about 24-48 hours later, the server freezes with no messages in any of the log files or in Insight Manager. Red Hat won't even attempt to troubleshoot this. I've had a ticket open with HP for over 2 months with no resolution. These are production systems so I can't do any testing to figure out WTF is going on. Does anyone have suggestions, or has anyone experienced this?

Any help is appreciated..
3 REPLIES
Patrick Lampert_1
Occasional Advisor
Solution

Re: Oracle + HP Agents = Crash..

Can you tell us the ticket number of your call?

Two things I can think of offhand.

1. BIOS - Make sure you are at the release from 9/15/2004. See:

http://h18007.www1.hp.com/support/files/server/us/download/22089.html

Fixes

Updated to integrate the latest Intel processor support code into the System ROM. This works around an issue with the Intel Xeon processor that could cause unexpected behavior or system hang.

2. Make sure you are at RH AS 3.0 U3 or preferably U4. Earlier releases have problems with swapping too early and with associated hangs.

Beyond that we would need a sysrq forced netdump of the server while hung.
Celso Medina Kern
Trusted Contributor

Re: Oracle + HP Agents = Crash..

Hello Don,

The first thing to do is to configure a serial console and capture all information up to the moment of the hang via terminal emulation logging.

You also need to enable the magic sysrq key so that you can capture dumps and information straight from the kernel at the moment of the hang via the serial console. Do:

echo 1 > /proc/sys/kernel/sysrq
vi /etc/sysctl.conf
kernel.sysrq = 1
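
If you would rather not hand-edit the file on every box, the sysctl.conf part can be scripted. This is only a rough sketch: `enable_sysrq_in_conf` is a name I made up, and it assumes the file uses the usual `key = value` layout. Try it on a copy of the file first.

```shell
#!/bin/sh
# Sketch: make sure a sysctl-style config file contains "kernel.sysrq = 1".
# enable_sysrq_in_conf is a hypothetical helper name, not a standard tool.
enable_sysrq_in_conf() {
    conf="$1"    # path to the file to edit, e.g. a copy of /etc/sysctl.conf
    if grep -q '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf" 2>/dev/null; then
        # An entry already exists: rewrite its value to 1.
        tmp="${conf}.tmp.$$"
        sed 's/^[[:space:]]*kernel\.sysrq[[:space:]]*=.*/kernel.sysrq = 1/' "$conf" > "$tmp" &&
            cat "$tmp" > "$conf"    # cat keeps the original file's permissions
        rm -f "$tmp"
    else
        # No entry yet (or no file yet): append one.
        echo 'kernel.sysrq = 1' >> "$conf"
    fi
}
```

On the real system you would call `enable_sysrq_in_conf /etc/sysctl.conf`, load it with `sysctl -p`, and check the result with `cat /proc/sys/kernel/sysrq`.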

Test your serial console sysrq capture online with care: some commands reboot or halt the system. Here is how it works:

There is a magic sysrq key that you can enable in your kernel, so that when the
hang occurs you can try something other than simply rebooting:

* What are the 'command' keys?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'r' - Turns off keyboard raw mode and sets it to XLATE.

'k' - Secure Access Key (SAK) Kills all programs on the current virtual
console. NOTE: See important comments below in SAK section.

'b' - Will immediately reboot the system without syncing or unmounting
your disks.

'o' - Will shut your system off (if configured and supported).

's' - Will attempt to sync all mounted filesystems.

'u' - Will attempt to remount all mounted filesystems read-only.

'p' - Will dump the current registers and flags to your console.

't' - Will dump a list of current tasks and their information to your
console.

'm' - Will dump current memory info to your console.

'0'-'9' - Sets the console log level, controlling which kernel messages
will be printed to your console. ('0', for example would make
it so that only emergency messages like PANICs or OOPSes would
make it to your console.)

'e' - Send a SIGTERM to all processes, except for init.

'i' - Send a SIGKILL to all processes, except for init.

'l' - Send a SIGKILL to all processes, INCLUDING init. (Your system
will be non-functional after this.)

'h' - Will display help (actually any other key than those listed above will
display help, but 'h' is easy to remember :-)
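
Since several of the keys above reboot or kill things, a small guard helps when testing on a live box. The sketch below only lets p, t and m through. `safe_sysrq` is a made-up name, and the default of /proc/sysrq-trigger is an assumption: check that your kernel provides that file. It also only works while the system still responds; on a hung box you need the serial console.

```shell
#!/bin/sh
# Sketch: send only non-destructive SysRq commands through a trigger file.
# safe_sysrq is a hypothetical helper. The default trigger path
# /proc/sysrq-trigger is an assumption; not every kernel provides it.
safe_sysrq() {
    key="$1"
    trigger="${2:-/proc/sysrq-trigger}"
    case "$key" in
        p|t|m)
            # Dump registers (p), tasks (t) or memory info (m).
            printf '%s' "$key" > "$trigger"
            ;;
        *)
            echo "refusing SysRq '$key': only p, t and m are allowed" >&2
            return 1
            ;;
    esac
}
```

For example, `safe_sysrq t` asks for a task dump, while `safe_sysrq b` is refused and returns a non-zero status. The dump itself goes to the console and kernel log, not to the terminal you ran the command from.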

To configure your magic key, look at
http://www.linux.com/howtos/Remote-Serial-Console-HOWTO/security-sysrq.shtml
Basically, enable it in `make menuconfig`:
Kernel hacking --->
[ x ] Magic SysRq key
and, once the kernel has been compiled and booted:
sysctl -w kernel.sysrq=1
Edit /etc/sysctl.conf:
# Enables the magic SysRq key
kernel.sysrq = 1

To see more information on RedHat sysrq documentation, look at
/usr/src/linux/Documentation/sysrq.txt in your system.

The most useful options for me were p, t and m. In the cases where I troubleshot Red Hat hangs on ProLiant servers, the cause was crond looping and spawning child processes.
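
While the box is still up, a quick way to watch for that kind of runaway is to count processes per command name. A minimal sketch: `count_by_name` is a made-up helper that reads one command name per line on stdin.

```shell
#!/bin/sh
# Sketch: count processes per command name, busiest first.
# count_by_name is a hypothetical helper; it expects one name per line on stdin.
count_by_name() {
    sort | uniq -c | sort -rn
}
```

Feed it from ps, for example `ps -eo comm= | count_by_name | head`. A crond count that keeps climbing between runs would point at the looping described above.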

Celso
God bless pessimists, they did the backup!
Celso Medina Kern
Trusted Contributor

Re: Oracle + HP Agents = Crash..

To enable your serial console:

Connect a serial cable between Serial port 0 of your DL380 and a serial port on a PC/notebook. Use terminal emulation software to connect to the PC/notebook serial port, and enable activity logging.

1 - In Proliant BIOS, enable serial console;

2 - edit /boot/grub/grub.conf (on Red Hat, /etc/grub.conf is a symlink to it)
and append to the end of the kernel line: console=ttyS0

3 - enable a getty for ttyS0 from init, editing /etc/inittab;

4 - enable root login from ttyS0:
vi /etc/securetty
ttyS0
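
Steps 2 and 4 can also be scripted if you have several servers to set up. This is just a sketch under my own assumptions: both function names are made up, and each takes the path of the file to change, so you can run it on a copy before touching the real one. Step 3 is left out on purpose because the getty line depends on which getty program you use.

```shell
#!/bin/sh
# Sketch: apply serial console steps 2 and 4 to files given as arguments.
# Both function names are hypothetical; try them on copies first.

# Step 2: append console=ttyS0 to every "kernel" line that lacks it.
add_serial_console() {
    grubconf="$1"   # e.g. a copy of grub.conf
    tmp="${grubconf}.tmp.$$"
    sed '/^[[:space:]]*kernel[[:space:]]/{
/console=ttyS0/!s/[[:space:]]*$/ console=ttyS0/
}' "$grubconf" > "$tmp" && cat "$tmp" > "$grubconf"
    rm -f "$tmp"
}

# Step 4: allow root logins on ttyS0 by listing it in a securetty-style file.
allow_root_on_ttyS0() {
    securetty="$1"  # e.g. a copy of /etc/securetty
    grep -qx 'ttyS0' "$securetty" 2>/dev/null || echo 'ttyS0' >> "$securetty"
}
```

Both functions are safe to run twice: a kernel line that already has console=ttyS0 is left alone, and ttyS0 is only added to the securetty file once.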

To use the magic sysrq key:
in your serial terminal emulator (e.g. minicom on Linux):
a) Configure 'minicom' as follows:

# cat /etc/minirc.dfl
# Machine-generated file - use "minicom -s" to change parameters.
pr port /dev/ttyS0
pu baudrate 19200
pu backspace DEL

b) Start 'minicom', log in to your system, and
press 'SysRq' + 'p' to dump the current registers and flags to your console.

On a serial console the SysRq key is sent as a 'BREAK' signal, which in 'minicom' is CTRL-A F, so I used:
CTRL-A F and 'p'

Celso
God bless pessimists, they did the backup!