Re: MSA1500 redundancy

 
James Mercer
Advisor

Re: MSA1500 redundancy

Torsten - the fail-over test failed. When we remove the active controller, the standby controller tries to become active and then fails with the error:

"43 Redundancy failed hardware failure"
From our point of view, the array is now 'dead'. Software and disk access no longer communicate with the MSA controller.

Shouldn't the standby controller become active and indicate that it is ready for use?

Stephen Kebbell
Honored Contributor

Re: MSA1500 redundancy

Hi,

Hang on, do you only have one SAN switch? If so, your configuration is not a supported one. The rule for the MSA is either single everything or dual everything (HBAs, switches, controllers). Nothing in between is supported (even if it does seem to work). Also, you mention you have a mixed environment - I assume this means different OSes. This usually means you NEED zoning on the switch(es). You may run into problems if you do not keep different operating systems in separate zones.
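The "single everything or dual everything" rule can be written down as a quick sanity check. This is only a minimal sketch of the rule as stated above, not an HP tool; the function name and its parameters are made up for illustration, and the counts are assumed to be per host for HBAs and per array for controllers.

```python
def is_supported_topology(hbas_per_host: int, switches: int, controllers: int) -> bool:
    """Return True if the counts follow the 'all single or all dual' rule.

    Hypothetical helper: it only encodes the rule of thumb from this thread
    and knows nothing about firmware, zoning or cabling.
    """
    counts = (hbas_per_host, switches, controllers)
    all_single = all(c == 1 for c in counts)
    all_dual = all(c == 2 for c in counts)
    return all_single or all_dual


# Dual controllers behind a single switch is the "in between" case.
print(is_supported_topology(1, 1, 2))
```

A configuration such as two controllers with one switch and one HBA fails the check even though it may appear to work day to day.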

To test controller failover, you could connect the serial cable to the front of the active controller and use the CLI to disable the controller. The command is:

disable this_controller

From the user guide: "In a dual-controller system, this command disables one of the controllers to prepare it for removal. When a controller is disabled, all resources being processed by that controller are automatically failed over to the remaining controller. After a controller has been successfully disabled, the LCD panel displays a message stating that it is safe for that controller to be removed."
The CLI User Guide can be found here:
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c00683579/c00683579.pdf

Note: I hope this command is supported in your firmware version. I can't say for sure.

Regards,
Stephen
Torsten.
Acclaimed Contributor

Re: MSA1500 redundancy

The message "43 Redundancy failed hardware failure" is telling you that there is no longer any redundancy, because one controller has failed. Again, if your host needs the failover software and you don't have it installed, the host will still try to access the failed controller. Use the front panel to get the current status. BTW, I don't know your whole configuration, that's why I can only answer in general.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
James Mercer
Advisor

Re: MSA1500 redundancy

Right Torsten, I understand your point - you're saying the message is merely a warning and not an indication that the secondary controller has failed as well. Since we lose all communication when the primary is lost, we assumed the message was reflecting our problem and not stating something else.

Our firmware (5.02 and 5.10) only allows the standby controller to be disabled.

Fair point on the redundancy supported by HP: all single path or all redundant. In the past with our HP virtual arrays, we've used a mixed-redundancy setup - some servers are single path and some are dual path, but the controllers are always dual path. I think we expected the same of the MSA even though all of our servers are likely to be single pathed - we still expected the MSA 'backplane' to handle communication between the FC I/O cards and each controller.

Is it true then that: MSA FC I/O cards only communicate with one controller and any use of the standby controller must use multi-path software?

I've tried installing MPIO on one server, but it fails presumably because it has only one FC HBA.

Any other thoughts welcome, but I think it is beginning to be very clear.

Thanks
Torsten.
Acclaimed Contributor

Re: MSA1500 redundancy

Virtual arrays - VA or EVA? The VA is totally different.

You should now read the documentation carefully and make sure you meet all requirements regarding number and model of HBAs, drivers, SAN connection/cabling, software and so on. In case of doubt ask here or open a call with HP.

Hope this helps!
Regards
Torsten.

Uwe Zessin
Honored Contributor
Solution

Re: MSA1500 redundancy

> Is it true then that: MSA FC I/O cards only communicate with one controller and any use of the standby controller must use multi-path software?

Absolutely. A host's FC port is only allowed to have access to one of the MSA's controllers and vice versa: an MSA controller must not be accessed by more than one FC host port.

It is not written that way in the documentation, but if you scroll back some time in ITRC's history, you will see quite a number of failures due to breaking these rules.
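To make the rule concrete, here is a minimal sketch that checks the paths of one host against it. The function name, the port labels and the controller labels are all invented for the example, and the list of paths is assumed to come from whatever your zoning and cabling actually allow.

```python
from collections import defaultdict


def find_path_violations(paths):
    """Check one host's (host_port, controller) pairs against the rule above.

    Hypothetical helper: 'paths' lists every MSA controller each FC port of
    a single host can reach. Returns a list of human-readable violations.
    """
    controllers_by_port = defaultdict(set)
    ports_by_controller = defaultdict(set)
    for host_port, controller in paths:
        controllers_by_port[host_port].add(controller)
        ports_by_controller[controller].add(host_port)

    violations = []
    # A host FC port may only have access to one controller.
    for port, ctrls in sorted(controllers_by_port.items()):
        if len(ctrls) > 1:
            violations.append(f"host port {port} reaches {len(ctrls)} controllers")
    # A controller must not be accessed by more than one port of this host.
    for ctrl, ports in sorted(ports_by_controller.items()):
        if len(ports) > 1:
            violations.append(f"controller {ctrl} is reached by {len(ports)} host ports")
    return violations


# One HBA zoned to both controllers breaks the first half of the rule.
print(find_path_violations([("hba0", "A"), ("hba0", "B")]))
```

A clean dual-path host would have exactly two entries, one port per controller, and the function would return an empty list.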
James Mercer
Advisor

Re: MSA1500 redundancy

Torsten - yes, we've been re-reading the documentation very carefully over the last few days. There are some known issues like the firmware version and Netware, but the only thing that wasn't working as expected was the 'failover'. In that regard the documentation was clear about what was supported, but not about what worked. I think you and the others have answered that - the MSA controllers only establish one connection and there is no back plane connection between controllers.

Your comment on VA vs EVA being very different - can you explain? In general I know the differences regarding continuous access (replication), sizes, speeds, and product lifetime (EVA is the future), but I take it you were talking about the controller redundancy behavior. Can you tell me more about how the VA and EVA differ in redundancy methods? Thanks in advance.

Uwe - I think you 'hit the nail on the head'. Our assumption, which wasn't directly contradicted in the documentation, has led to the whole problem. In hindsight, I would rather have spent my time in the forums than trying to 'fix' something that wasn't broken...
Uwe Zessin
Honored Contributor

Re: MSA1500 redundancy

The VA uses two disk pools (called Redundancy Groups). By default, all disk drives in odd slot numbers are allocated to one RG and the even ones to the other. Each RG is assigned to, and maintained by, a single controller. The storage resource (LUN) is allocated from such an RG.

The EVA uses one or more (up to 16) disk groups. It usually selects the disks itself, but the user can explicitly group/ungroup individual disk drives. The storage resource (virtual disk) allocates space from a single disk group, but the managing controller can change. Both controllers can manage different virtual disks in the same group.


As far as I can tell, the VA has provided 'Active/Active' support from the very beginning and the EVA used the old DEC HSG mode: a single controller allows read/write access to a resource while the second controller maps SCSI LUNs as well, but I/Os cannot be satisfied through these paths.

This changed some time ago with the Active/Active firmware for the EVA. However, both arrays work the same way: a single controller does the actual disk I/O (on the VA it is called the 'performance path') for a storage resource [*], and if you try to access the resource through the other controller, it has to redirect the I/O to the owning controller.

[*] Unlike the MSA controllers with Active/Passive firmware, the other EVA controller can do disk I/O for a different virtual disk, even if it's in the same disk group.
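The ownership and redirect behaviour can be pictured with a toy model. This is just an illustration of the idea described above, not how either firmware is implemented; the function, the disk names and the controller names are invented.

```python
def route_io(owner_by_disk, disk, entry_controller):
    """Return (controller doing the disk I/O, how the request got there).

    Toy model: each virtual disk / LUN has exactly one owning controller.
    A request arriving at the other controller is handed over to the owner.
    """
    owner = owner_by_disk[disk]
    if entry_controller == owner:
        return (owner, "direct")
    return (owner, "redirected")


# Two virtual disks in the same disk group, owned by different controllers.
owners = {"vdisk1": "A", "vdisk2": "B"}
print(route_io(owners, "vdisk1", "A"))  # served on the owning path
print(route_io(owners, "vdisk1", "B"))  # enters via B, I/O still done by A
```

The point of the model is that both controllers accept requests, but the work for a given disk always ends up on its owner, so the non-owning path costs an extra hop.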
Torsten.
Acclaimed Contributor

Re: MSA1500 redundancy

Regarding the connections:

The 2 controllers are connected to each other to allow communication, of course.

Every controller is connected to all the SCSI buses, so it can access every disk.

The fibre channel connections are dedicated to a single controller, first FC connection only for the first controller, second FC connection for the second controller only.
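A small sketch shows why a single-path host goes dark when its controller is pulled. The mapping and names below are invented labels for the two FC connections and controllers, and the model ignores the host-side failover software that a dual-path host would also need.

```python
# Each FC connection is wired to exactly one controller (illustrative labels).
FC_PORT_TO_CONTROLLER = {"fc1": "controller1", "fc2": "controller2"}


def host_keeps_access(host_fc_connections, failed_controller):
    """Return True if the host still has a physical path after a controller fails.

    Toy model: a path survives only if it ends at a controller other than
    the failed one. Multipath software on the host is assumed, not modelled.
    """
    reachable = {FC_PORT_TO_CONTROLLER[port] for port in host_fc_connections}
    return bool(reachable - {failed_controller})


# A host cabled only to fc1 loses the array when controller1 is removed.
print(host_keeps_access(["fc1"], "controller1"))
print(host_keeps_access(["fc1", "fc2"], "controller1"))
```

With only one FC connection in use, there is simply no route to the surviving controller, whatever state that controller is in.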

The main difference between active/passive (e.g. MSA1500 with "old" fw) and active/active (new EVA, VA) is recognizable by the name, I guess.

Hope this helps!
Regards
Torsten.

James Mercer
Advisor

Re: MSA1500 redundancy

Thanks for the explanations - it was the dual controller, dual FC I/O yet only single paths that we hadn't understood. The information regarding the EVA and VA matches what we thought. Our mistake was to think this also applied to the MSA.

I'll leave the thread open in case anyone wants to comment further, but you've cleared up the issue for me - there never was a problem with hardware - it was only with our expectations.

As for the firmware, since we're running Netware, I'll load the firmware that HP recommends; well, it recommends 4.98, but supplies 5.02...

Thanks again for all the help.