MSA1000 questions

Tom Swigg_1
Occasional Advisor


We have an MSA1000 running FW 4.32 build 300, used as storage for an MS Exchange server running under MSCS. (I am not a Windows person, so apologies for any lack of clarity about the application config.) We use Secure Path for multipath access and a product called Double Take to replicate selected files to a backup site (which currently resides two feet away). As far as I understand it, this replication uses a dedicated LAN and not the SAN, unlike CA with EVA.

We have been having some performance problems, and I have been looking into the SAN side of things. Starting from the bottom up, I used the CLI to find out how the MSA1000 was configured. It has a single shelf with 14 disks, 2 of them in the spare set.

My questions (which may have nothing to do with our performance problem but I am trying to understand how this was put together) are as follows:

1) show profile indicates that we are running with
"Mode 12 = Ignore Force Unit Access on Write"
whereas all other OS profiles use
"Mode 12 = Enforce Force Unit Access on Write"
There are two Windows profiles:

Profile name = Windows: with ignore FUA on write set
and
Profile name = Windows_SP2_and_below: with enforce FUA on write set

All connections use the Default profile, and the Default profile is Windows, i.e. ignore FUA on write.

It seems to me that FUA may have something to do with an OS requesting a synchronous write, e.g. for metadata.

1.1 How does the MSA1000 interpret FUA on write? Does it treat the cache as a write-through cache for that request?
1.2 How come Windows can get away with ignoring this, whereas OpenVMS, Tru64, Linux, Solaris, NetWare and HP (?UX) enforce it?

1.3 We are also running Win2003 SP2. Does this also mean we have the wrong setting?

1.4 Are we in danger of meta-data corruption?

2) There are 12 RAID 1 units each with a stripe size of 128K, each spread across all 12 available physical disks. We also have 9 RAID 5 units each with a stripe size of 16K, each spread across all 12 available physical disks.

2.1 My gut feeling is that this is not an ideal configuration: is 16K not rather a small I/O size for RAID 5?

2.2 Is having 16K stripes and 128K stripes coexisting on the same physical disks not a recipe for fragmentation and poor I/O?

3) When I connected to the MSA1000 for the first time, show this_controller reported the active controller's battery as off. When I looked a little later, it was on.

If the battery is off, does this not mean that the cache becomes write-through and performance declines?

Any comments appreciated. Full show tech_support attached
The road of excess leads to the Palace of Wisdom
6 REPLIES
John Kufrovich
Honored Contributor

Re: MSA1000 questions

Tom,

Normally I would recommend upgrading to 4.48, but we will be introducing new MSA1000 FW very soon, so you could wait until then.

You're right, your configuration isn't ideal. How many mailbox users are you supporting?

To your questions.
1.1, 1.2, 1.3, 1.4. Windows is notorious for setting FUA. FUA is Force Unit Access; it basically forces data to be written through the cache to the drives. The problem was that when the MSA cache was full, Windows would keep sending commands with the FUA bit set and wouldn't stop. If there were many commands with FUA set, we couldn't flush the controller's cache. So now we ignore FUA. The MSA cache is BBWC (battery-backed write cache), and all data is written into BBWC, so there is no concern about corruption. The other OSes use FUA sparingly.

You really should set SSP (Selective Storage Presentation). Windows servers have a habit of claiming all LUNs, and you wouldn't want to install a new server and inadvertently delete a LUN. Set the host profile to Windows.

2, 2.1, 2.2. It is not an ideal configuration, but that really depends on your I/O pattern. The problem with sharing so many spindles is drive contention, meaning requests waiting for access to the drive.

3) If you make configuration changes, we momentarily disable and re-enable the cache.

Tom Swigg_1
Occasional Advisor

Re: MSA1000 questions

Thanks John,

Our active user base is about 20,000 with around another 40,000 mailboxes remaining from previous academic years. We get about 100,000 emails a day. The mailbox data stores are living on the RAID 5 units. I would have thought that 16K was a small stripe-size for RAID5.
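
To put rough numbers on the stripe-size question, here is a minimal Python sketch of the full-stripe arithmetic for the two layouts described earlier in the thread (12 disks, 16K RAID 5 and 128K RAID 1). It assumes the CLI "stripe size" is the per-disk strip size and that RAID 1 across 12 disks behaves as six mirrored pairs; both are assumptions, and the function names are made up for illustration.

```python
import math

def full_stripe_kib(strip_kib, disks, raid_level):
    """User data (KiB) held in one full stripe across the array."""
    if raid_level == 5:
        data_disks = disks - 1      # one strip per stripe holds parity
    elif raid_level == 1:
        data_disks = disks // 2     # assumed: mirrored pairs, striped
    else:
        raise ValueError("only RAID 1 and RAID 5 are modelled here")
    return strip_kib * data_disks

def strips_touched(io_kib, strip_kib):
    """Number of strips an aligned I/O of io_kib spans."""
    return math.ceil(io_kib / strip_kib)

raid5_stripe = full_stripe_kib(16, 12, 5)    # 11 data strips of 16 KiB
raid1_stripe = full_stripe_kib(128, 12, 1)   # 6 data strips of 128 KiB

print("RAID 5 full stripe:", raid5_stripe, "KiB")
print("RAID 1 full stripe:", raid1_stripe, "KiB")
print("Strips touched by an aligned 4 KiB I/O on 16K strips:", strips_touched(4, 16))
print("Strips touched by an aligned 64 KiB I/O on 16K strips:", strips_touched(64, 16))
```

Under these assumptions a small, aligned random I/O stays on a single disk even with 16K strips, so strip size would matter mostly for larger sequential transfers. Whether that matches the real I/O pattern here is something only measurement can confirm.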

Are there any known problems with FW 4.32 as it is a bit old?

Also, there are only 4 connections seen on the active controller. Should this not be 8 if we have multipath access to the MSA1000 from 2 nodes?

The road of excess leads to the Palace of Wisdom
John Kufrovich
Honored Contributor
Solution

Re: MSA1000 questions

Whenever an equipment manufacturer releases new firmware or drivers, there are reasons. Usually it is due to a certain situation or condition.

Depending on the condition, 4.48 could potentially be faster. I have never seen the problem associated with Notes/Exchange. Those applications are transactional, IOPS-oriented workloads.

Based on the above information, your mail server's storage is not properly sized for the number of users. If your users are experiencing slow response, you really should consider adding one or maybe two extra MSA30 shelves with 15k drives.
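
As a rough illustration of what "properly sized" means here, the following Python sketch estimates how many spindles a given front-end load needs once the RAID write penalty is applied. The write penalties (4 for RAID 5, 2 for RAID 1) are the usual rule-of-thumb values; the per-user IOPS, read fraction and per-disk IOPS are placeholder assumptions, not measurements from this system, and should be replaced with real figures.

```python
import math

def backend_iops(frontend_iops, read_fraction, write_penalty):
    """Disk-level IOPS after applying the RAID write penalty."""
    reads = frontend_iops * read_fraction
    writes = frontend_iops * (1 - read_fraction)
    return reads + writes * write_penalty

def spindles_needed(frontend_iops, read_fraction, write_penalty, iops_per_disk):
    """Minimum number of disks to sustain the back-end load."""
    load = backend_iops(frontend_iops, read_fraction, write_penalty)
    return math.ceil(load / iops_per_disk)

# Placeholder assumptions, NOT measured values:
ACTIVE_USERS = 20000      # from the thread
IOPS_PER_USER = 0.25      # hypothetical per-mailbox load
READ_FRACTION = 0.5       # hypothetical read/write mix
IOPS_PER_DISK = 150       # hypothetical figure for one drive

frontend = ACTIVE_USERS * IOPS_PER_USER
raid5_disks = spindles_needed(frontend, READ_FRACTION, 4, IOPS_PER_DISK)
raid1_disks = spindles_needed(frontend, READ_FRACTION, 2, IOPS_PER_DISK)

print("Front-end IOPS:", frontend)
print("Disks needed on RAID 5:", raid5_disks)
print("Disks needed on RAID 1:", raid1_disks)
```

Even with different placeholder values, the shape of the result is the point: the RAID 5 write penalty roughly doubles the back-end write load compared with RAID 1, and 12 shared spindles are quickly exhausted.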

You can attach the MSA CLI cable to monitor some other performance counters on the MSA.
>show cacheinfo - Displays a snapshot of your cache usage. Cycle through the command a few times.
>show taskstats - Displays a snapshot of the commands the MSA controller is working on.

>start perf - Let this run for a while to average out the counters below.

>show perf
>show perf physical
>show perf logical

>stop perf
>clear perf

If you are running Windows, you can also use some of the perfmon counters. Try the perfmon counter PhysicalDisk -> Current Disk Queue Length, and pick your busiest LUN.

I only see 4 HBAs. What is your SAN setup?
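
One way to capture that counter over time is the Windows `typeperf` tool, for example `typeperf "\PhysicalDisk(*)\Current Disk Queue Length" -si 5 -sc 60 -f CSV -o queue.csv`. The Python sketch below averages each counter column of such a CSV log. The sample text, host name and disk instance are invented for illustration, and the parser assumes the usual layout of a header row followed by one timestamped row per sample.

```python
import csv
import io

def average_counters(csv_text):
    """Average every counter column in typeperf-style CSV text."""
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    header, samples = rows[0], rows[1:]
    names = header[1:]                  # column 0 is the timestamp
    values = {name: [] for name in names}
    for row in samples:
        for name, cell in zip(names, row[1:]):
            try:
                values[name].append(float(cell))
            except ValueError:
                continue                # skip blank or non-numeric samples
    return {n: sum(v) / len(v) for n, v in values.items() if v}

# Invented sample, NOT real output from this system.
SAMPLE = r'''"(PDH-CSV 4.0)","\\NODE1\PhysicalDisk(2)\Current Disk Queue Length"
"01/01/2000 10:00:00.000","2.000000"
"01/01/2000 10:00:05.000","6.000000"
'''

for counter, mean in average_counters(SAMPLE).items():
    print(counter, "average:", mean)
```

A queue length that stays well above the number of spindles behind the LUN is usually taken as a sign of contention, though with a shared 12-disk array the per-LUN figure needs to be read alongside the MSA's own perf counters.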

Basil Vizgin
Honored Contributor

Re: MSA1000 questions

John, is the new FW the long-awaited A/A (active/active) for the MSA1000?
John Kufrovich
Honored Contributor

Re: MSA1000 questions

Maybe ;)
Basil Vizgin
Honored Contributor

Re: MSA1000 questions

John, when-when-when?:)