
Re: Oracle10gR2

 
Pavel Scheberov
Occasional Contributor

Oracle10gR2

Hi, all!
Please HELP!!!
We have a problem with Oracle 10gR2 RAC.

2 x DL580G2 (System ROM Version P27-09/15/2004);
2 x FCA2101 2GB Fibre Channel HBA (Firmware Version 3.93A0, Option ROM Version 1.70A3);
1 x MSA1000 (Firmware Version 4.48A) + 1 x MSA30;
Windows Server 2003 EE + SP1;
Emulex LightPulse LP952 SCSIport Miniport Driver (Version 5.5.20.9).

After installing Oracle 10gR2 Clusterware and restarting the servers, whichever node starts second hangs with a blue screen, STOP 0x0000FFFF. The blue screen occurs at the "Applying Computer Settings" stage. Only after stopping the OracleCRService, OracleCSService and OracleEVMService services on the first server does the second server boot properly.

Best regards,
Pavel.
4 REPLIES 4
Hans van Veen_1
Occasional Advisor

Re: Oracle10gR2

We suffer from the same problem in a 5-blade (BL20p) Ora10g RAC setup attached to an MSA1500 SAN using ASM. Same Windows version. For multipath access we use the out-of-the-box MPIO for MSA driver.

Regards,
Hans
Kari Tolvanen
Occasional Visitor

Re: Oracle10gR2

Hi Pavel,
Did you ever find the cause of this blue screen?
Regards,
Kaapo
EGBERT VELDMAN
New Member

Re: Oracle10gR2

Found this:

Oracle MetaLink Note 337784.1
https://metalink.oracle.com

Symptoms:
While running RAC, a blue screen is shown and a reboot takes place. The crash dump created by Windows shows that orafencedrv.sys is involved.

Cause:
When running Oracle 10g RAC/CRS on Windows, the OracleCSService is SUPPOSED to reboot the OS if it detects a problem in the clusterware. When the CSS daemon reboots the node, a blue screen occurs.

The failure is by design. Any time the OracleCSService process fails, it is designed to make the machine reboot. It does this by sending an IOCTL to the IOFENCE driver, a kernel driver, which then faults. For Windows this is an unhandled exception, which causes the blue screen.

Solution:
So the question is not why the blue screen occurs, but why the OracleCSService process is failing (node eviction).

That question can only be answered by investigating the CSS logfiles from all the nodes in the cluster.

Which logfiles you need to investigate, to see whether there was a problem with the interconnect or with the I/O to the voting disk, depends on the CRS release used:


10.1.0: %CRS_HOME%\css\log\ocssd.log

10.2.0: %CRS_HOME%\log\\cssd\ocssd.log
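
Since the note says to go through the CSS logfiles from every node, a small script can help with the first pass. This is only a rough sketch: it assumes you have already copied each node's ocssd.log somewhere readable, and the search terms in KEYWORDS are my own guesses, not actual Oracle message strings. The function names are made up for this example.

```python
# Hypothetical search terms; replace them with strings that actually
# appear in your ocssd.log files.
KEYWORDS = ("evict", "voting", "timeout", "heartbeat")


def find_matching_lines(log_path, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is case-insensitive. Undecodable bytes are replaced so a
    damaged log does not abort the scan.
    """
    lowered = [k.lower() for k in keywords]
    matches = []
    with open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\n")
            if any(k in text.lower() for k in lowered):
                matches.append((number, text))
    return matches


def scan_nodes(log_paths):
    """Scan one log per node; log_paths maps a node label to a file path."""
    return {label: find_matching_lines(path) for label, path in log_paths.items()}
```

Call scan_nodes with a dict such as {"node1": path_to_copy_1, "node2": path_to_copy_2} and compare what each node logged around the time of the reboot. It only filters lines; reading them and deciding whether the interconnect or the voting disk I/O was the problem is still up to you.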
Russ Starksen
New Member

Re: Oracle10gR2

This same issue turned up after applying security patches and HP driver updates.

After turning off the Oracle services and setting them to manual, the second node would start up fine. We would then start Oracle manually, and it comes up fine.

It still hangs when the Oracle services are set to start automatically. Metalink note 337784.1 looks promising.
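
For anyone who wants to script the manual-start workaround, here is a minimal sketch. It assumes the three service names given in the original post and uses the standard Windows `sc config <service> start= demand` and `sc start <service>` commands. The helper names are hypothetical, and the thread does not say in which order the services should be started, so the list below simply follows the order in the original post.

```python
import subprocess

# Service names as given in the original post.
ORACLE_SERVICES = ["OracleCRService", "OracleCSService", "OracleEVMService"]


def manual_start_commands(services=ORACLE_SERVICES):
    """Build `sc config` argument lists that set each service to manual start."""
    # sc expects the value ("demand") as a separate token after "start=".
    return [["sc", "config", name, "start=", "demand"] for name in services]


def start_commands(services=ORACLE_SERVICES):
    """Build `sc start` argument lists for starting the services by hand."""
    return [["sc", "start", name] for name in services]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

run_all(manual_start_commands()) only prints the commands. Pass dry_run=False from an administrator prompt to apply them, then use run_all(start_commands(), dry_run=False) once the node is up. This works around the hang at boot but does not explain why OracleCSService fails.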