
DL385, DL585, and BL460 crashing during EMC VNX update

 
don_m

We have 25 Oracle DB servers running on HP hardware, everything from DL385G1 to BL460G7, all currently running SLES11 SP1. We use PowerPath for multipathing (version 5.5.0.00.00-275). For the second time in 6 months, all of these boxes crashed during a FLARE update. Storage is a VNX5500. All Windows, VMware, and Solaris servers survived the SP failovers. Here are the steps we took prior to this latest disaster:

  Verified BIOS settings for the QLogic HBAs per EMC recommendation

  Installed the 'latest' set of OS patches. This is restricted by the fact that PowerPath is not certified with the current set of SUSE patches. We're up to kernel version 2.6.32.54-0.3.

  Configured hosts to capture crash dumps

I've made a stab at looking at the crash dumps, but I'm not a kernel guru. In 2 cases the backtrace showed that the crash occurred in a VFS function in the ASM module. Another case showed the reiserfs module running at the time of the crash.
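To compare backtraces across several dumps, something like the following sketch could tally which loadable module sits closest to the crash point in each one. It assumes each backtrace has been saved as text from the `crash` utility's `bt` command, where frames from a module end with the module name in square brackets. The function names, the sample frames, and the `oracleasm` module name are illustrative assumptions, not output from our dumps.

```python
import re
from collections import Counter

# Matches a frame line as printed by the crash utility's `bt` command, e.g.
#   #4 [ffff88011c5d3c80] some_function at ffffffffa01b2c3d [module_name]
# The trailing [module_name] is only present for symbols in loadable modules.
FRAME_RE = re.compile(
    r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def modules_in_backtrace(text):
    """Return module names in frame order (innermost first), skipping core-kernel frames."""
    modules = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group(3):
            modules.append(m.group(3))
    return modules


def tally_first_module(backtraces):
    """Count, per dump, the first loadable module seen nearest the crash point."""
    counts = Counter()
    for text in backtraces:
        mods = modules_in_backtrace(text)
        counts[mods[0] if mods else "(core kernel only)"] += 1
    return counts


# Made-up sample frames for illustration only (hypothetical symbols and addresses).
SAMPLE_A = """\
 #0 [ffff88011c5d3a10] machine_kexec at ffffffff810265ce
 #1 [ffff88011c5d3a60] crash_kexec at ffffffff810a3b28
 #2 [ffff88011c5d3b30] example_vfs_op at ffffffffa01b2c3d [oracleasm]
"""
SAMPLE_B = """\
 #0 [ffff880119a4ba10] machine_kexec at ffffffff810265ce
 #1 [ffff880119a4bb30] example_fs_op at ffffffffa0456789 [reiserfs]
"""

if __name__ == "__main__":
    tally = tally_first_module([SAMPLE_A, SAMPLE_B, SAMPLE_A])
    for module, n in tally.most_common():
        print(module, n)
```

A tally like this only shows which module was on the stack when the kernel went down. It does not by itself show which layer (HBA driver, multipathing, filesystem, or ASM) caused the failure.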

It appears that EMC wants to pass this problem off to Oracle since they own ASM. SUSE doesn't want to look at it since we're not at the current patch level.

We never had this problem running with a CX700 and CX3-80 under SLES9. All servers survived numerous NDU maintenance events and a couple of FC switch failures. Now we're running our production DB environment without effective SAN redundancy.

 

Has anyone else seen problems like this with SLES11 and EMC SAN arrays?

 

thanks,

Don M.