
ProCurve 4100gl and Oracle 10g RAC

 
Ulf Zimmermann
Frequent Advisor

ProCurve 4100gl and Oracle 10g RAC

We are having problems with our Oracle 10g RAC cluster, which uses a pair of 4108gl switches as the interconnect. The two 4108gl are connected to each other via 4 GigE links (ports A1/A2/B1/B2) trunked together via LACP. Firmware is G.07.103; G.07.104 is available, but I looked at the release notes and our problem isn't listed among the fixes.
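
For reference, the inter-switch trunk is configured roughly like this on both switches, from the configuration context (a sketch from memory; the port list and trunk group name are just what we use, adjust to your layout):

trunk A1-A2,B1-B2 trk1 lacp
show trunks
show lacp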

We created a separate VLAN for each of the three interconnects. Each node has the two onboard Broadcom NICs (NetXtreme II on these DL380 G5s) and two HP-branded Intel e1000 ports. One of the Broadcom ports and both e1000 ports are used for the interconnects. Three of the nodes have their three interconnect ports connected to the first switch, and the other three nodes have theirs connected to the second switch.
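
The per-interconnect VLANs are set up along these lines, from the configuration context (VLAN ID, name and port assignments here are only examples, not our exact numbering):

vlan 201
   name "rac-ic-1"
   untagged A5,A9,A13
   tagged Trk1
   exit

and the same again for the second and third interconnect VLANs, each tagged onto the inter-switch trunk so both switches carry all three.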

Oracle uses large UDP packets to communicate its global cache updates, and the 4100gl isn't jumbo-frame capable, so we are seeing fragmented packets. Unfortunately we are also seeing missing fragments when we look with tcpdump, along with ICMP "reassembly time exceeded" messages caused by those missing fragments.
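
A capture filter along these lines shows both the fragments and the timeouts (the interface name is just an example):

tcpdump -ni eth2 '((ip[6:2] & 0x3fff) != 0) or icmp[icmptype] == 11'

The first part matches any IP fragment (non-zero fragment offset or the more-fragments bit set), and ICMP type 11 is time exceeded, which covers the reassembly timeouts.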

I can run something like "ping -i 0.1 -q -c 1000 -s 8000 192.168.201.x" from one node to another and never seem to lose any packets, but too many of the Oracle UDP packets are getting lost.
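
To get closer to what Oracle is doing, a fragmented UDP test could be run with iperf, assuming it is installed on the nodes (addresses, rate and duration are just examples):

on the receiving node:   iperf -s -u -l 8000
on the sending node:     iperf -c 192.168.201.x -u -l 8000 -b 10M -t 60 -i 5

The server side reports lost/total datagrams, which should show whether plain fragmented UDP is already being dropped without Oracle in the picture.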

Has anyone seen anything similar?
2 REPLIES
Matt Hobbs
Honored Contributor

Re: ProCurve 4100gl and Oracle 10g RAC

What type of modules are you using in the 4100?

If you are using the 20-port gigabit module you should try the 'qos-passthrough-mode' command - actually, give it a try anyway.

One other thing that may help in this instance is enabling flow-control.

Otherwise, keep in mind that in the 4100 architecture each module has only a 2-Gbit link to the backplane. I'm not sure how much traffic you have, but maybe you're hitting that limitation?
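
Roughly, the commands would look like this from the configuration context (from memory, so check the CLI reference for your G.07.x firmware; the passthrough-mode change may need a write memory and a reboot to take effect, and adjust the port list to your modules):

qos-passthrough-mode one-queue
write memory
interface A1-A20 flow-control
show interfaces A1

'show interfaces <port>' also gives you the per-port drop counters, which should tell you whether the 2-Gbit module-to-backplane link is being overrun.
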
Ulf Zimmermann
Frequent Advisor

Re: ProCurve 4100gl and Oracle 10g RAC

Mostly we are doing less than 2 Mbit/s per port per node. Flow control is enabled on all switch ports, but not (yet) on the e1000 cards themselves.
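
For the e1000 side the plan is to check and enable pause frames with ethtool, something like this (interface names are just examples for our boxes):

ethtool -a eth2
ethtool -A eth2 autoneg off rx on tx on
ethtool -A eth3 autoneg off rx on tx on

Flow control only helps if both ends of a link agree on it, so the NIC settings need to match what is enabled on the switch ports.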