06-28-2010 10:54 AM
FCIP issues with SAN replication
Greetings all,
My company is attempting to replicate from one HP EVA SAN array to another across the WAN. The two sites are linked by a metro Ethernet connection with 1 Gbps of shared bandwidth. We share it with our other business units and have no QoS in place, but we have been told that the pipe has never been completely saturated and that we are not rate limited. The SAN arrays are attached to 4 Gbps Fibre Channel Brocade switches. Two devices called MPX110s carry the data from Fibre Channel to Ethernet. Each MPX110 handles replication for its own redundancy groups, and although each has two Ethernet and two Fibre Channel ports, we use only one of each. Each MPX110 has a path over which it replicates to its counterpart on the other side. It is my understanding that they negotiate a Fibre Channel over IP (FCIP) tunnel between them. Each is on its own 6509, which has uplinks to a 3750; that goes across the metro Ethernet to a 3560 on the other side, then up to a 3560 acting as the core, and out to two 3560s with an MPX110 on each.
Now the problem: although we have 1 Gbps of bandwidth, the MPX110s only use about 13 Mbps of it each. We've verified this with iperf: each connection gets only 13 Mbps, and parallel tests show every connection getting 13 Mbps. The HP engineer told us that at >5Mbps we get approximately 1.3Mbps of actual data, which would mean that FCIP has 80% overhead. Can that be right? The bigger problem is that after running for several hours the MPX110s eventually just die and have to be rebooted before they start replicating again. They're already on the latest firmware (2.4.4.1). The only error shown on the MPX110 statistics screen is TCP timeouts.
I've taken captures at the MPX110s on both sides, and the errors I see in a 60-second sample are FCP malformed packets (~4300), duplicate ACKs (~41), previous segment lost (~3), and fast retransmission (~3). When HP was asked about the FCP malformed packets, they stated that they use a proprietary protocol and that Wireshark wouldn't be able to decode it. I've since searched for this protocol but can find no reference to it anywhere. The other errors seem so minor and so few that it is hard to believe they're impacting the data stream much, if at all.
I’ll include a small sample of the captures, if it lets me.
Thanks in advance for your assistance.
Chandler Bing
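On the "80% overhead" question in the post above, a back-of-the-envelope estimate of FCIP header overhead may help. The sketch below is only an approximation under stated assumptions: full-size Fibre Channel frames, a 28-byte encapsulation header plus encoded SOF/EOF words, IPv4 and TCP without options, and a standard 1500-byte MTU. None of these values are confirmed for the MPX110 in this thread, and the function name is made up for illustration.

```python
# Rough per-frame overhead estimate for FCIP over Ethernet.
# All sizes are in bytes and are ASSUMPTIONS for a full-size FC frame;
# they are not taken from MPX110 documentation.

FC_PAYLOAD = 2112          # maximum Fibre Channel data field
FC_HEADER = 24             # FC frame header
FC_CRC = 4                 # FC frame CRC
FCIP_ENCAP_HEADER = 28     # FC frame encapsulation header
FCIP_SOF_EOF = 8           # encoded SOF and EOF words, 4 bytes each
TCP_IP_HEADERS = 40        # IPv4 + TCP headers, no options
ETHERNET_PER_PACKET = 38   # header 14 + FCS 4 + preamble 8 + inter-frame gap 12
MTU = 1500                 # assumed; jumbo frames would lower the overhead


def fcip_overhead_fraction(mtu: int = MTU) -> float:
    """Return the fraction of wire bytes that is not FC payload."""
    mss = mtu - TCP_IP_HEADERS
    # Bytes handed to TCP for one encapsulated FC frame.
    stream_bytes = (FCIP_ENCAP_HEADER + FCIP_SOF_EOF
                    + FC_HEADER + FC_PAYLOAD + FC_CRC)
    # TCP is a byte stream, so on average each MSS of stream data
    # costs one set of TCP/IP and Ethernet headers.
    wire_bytes = stream_bytes * (mss + TCP_IP_HEADERS + ETHERNET_PER_PACKET) / mss
    return 1 - FC_PAYLOAD / wire_bytes


if __name__ == "__main__":
    print(f"Estimated header overhead: {fcip_overhead_fraction():.1%}")
    # Figures quoted in the post: 1.3 Mbps of data at 5 Mbps on the wire.
    print(f"Overhead implied by the quoted figures: {1 - 1.3 / 5:.0%}")
```

Under these assumptions the header overhead comes out below 10 percent, so encapsulation alone would not account for a loss on the scale described; retransmissions, small frames, or windowing limits would have to make up the difference.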
2 REPLIES
06-28-2010 11:51 AM
Re: FCIP issues with SAN replication
Hello Chandler,
Have you set up your switches correctly for FCIP and the IP Distance Gateway?
If you have an HP B-Series/Brocade FC switch, do this:
switchdisable
iodset
aptpolicy 1
portCfgISLMode (set for all FCIP switch ports)
switchenable
Regards,
Patrick
07-14-2010 11:25 AM
Re: FCIP issues with SAN replication
Thanks for your help. I passed that on to our SAN engineer to implement. Can you explain a little about what it does and what problem it specifically addresses?
Now the update:
I discovered that if I disable the FCP decode, Wireshark decodes the traffic correctly as FCIP.
We applied a QoS config to mark SAN replication traffic as DSCP EF and have seen consistent ping times of ~36 ms between sites, with bandwidth climbing as high as 45 Mbps on the 1 Gbps link. The MPX110s still fail after replicating for a few hours. The last time, we watched them replicate for 12 hours and then fail. The TCP timer exceed counter seems to indicate that is the problem, but I have nothing significant in the Wireshark captures to support this.
HP has decided that the MPX110 on the far side needs to be replaced. I'll post an update after that's done.
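One way to read the throughput numbers in this thread is through the bandwidth-delay product: a single TCP connection cannot move more than one window of data per round trip. The sketch below works out how many bytes would have to be in flight to sustain the rates mentioned, assuming the ~36 ms ping time applies to the replication path. The function name is illustrative, the 13 Mbps figure was measured before the QoS change, and nothing in the thread states the MPX110's actual TCP window settings, so treat this as a hypothesis to check rather than a diagnosis.

```python
def window_bytes(throughput_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to sustain a throughput at a given RTT."""
    return throughput_bps * rtt_seconds / 8


RTT = 0.036  # ~36 ms ping time reported in the thread (assumed to apply here)

if __name__ == "__main__":
    for label, rate in [("13 Mbps", 13e6), ("45 Mbps", 45e6), ("1 Gbps", 1e9)]:
        needed = window_bytes(rate, RTT)
        print(f"{label}: about {needed / 1024:.0f} KiB in flight")
```

At 36 ms, 13 Mbps corresponds to roughly 58,500 bytes in flight, which is close to the 65,535-byte ceiling of a TCP window without window scaling. Filling the 1 Gbps link at that latency would take about 4.5 MB in flight, so window or buffer settings on the gateways may be worth checking alongside the hardware replacement.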
The opinions expressed above are the personal opinions of the authors, not of Hewlett Packard Enterprise. By using this site, you accept the Terms of Use and Rules of Participation.
© Copyright 2025 Hewlett Packard Enterprise Development LP