StoreVirtual Storage

Cache error and Node Repair

 
SOLVED
JMan1
Advisor

Re: Cache error and Node Repair

If I look at the site configuration, the node has already been added to the site even though I didn't add it.  So now I have 6 nodes in that site (should be 5).  I have the ghost one in there and the repaired node.

What happens if I remove the ghost node from the management group and just add the repaired node in without doing an exchange?

oikjn
Honored Contributor

Re: Cache error and Node Repair

I don't think you can remove the ghost node even if you try...  can't hurt to try.

 

You are 100% sure the "new" node is on the site?  If you didn't add it to it, it shouldn't be there.

Can you attach a screenshot of the node list for the cluster and the site list? 

 

Too much to read, but if the dead node is still in the cluster, you cannot/should not add the new node unless you do an exchange, because the stripe wouldn't be correct: it would assume the dead node will come back at some point.  Also, assuming the dead node IS still showing in the cluster, right-click on it and select "Repair Storage System".  If that wasn't done, it might not have triggered the condition where it views that system as truly dead and not returning, instead of just missing.  Once that is done, it should show the node as RIP:mac_address, at which point the procedure you are trying should work without the warning.

JMan1
Advisor

Re: Cache error and Node Repair

Ok, so here is what I did.  I removed that node from the management group with no problem, checked the site list for the second site and it wasn’t there.  I added it back in to the management group, checked the site list (Image 1), and it still wasn’t there.  What is still showing up in the site is the ghosted machine.  So far so good.

I right click on the cluster, select Edit cluster, and then Exchange Storage Systems.  I select the ghosted system from the list, then select the “Exchange Storage System…” button.  I then select the repaired unit as it is the only one in the list of available units and click OK.  When I click OK, I get the message shown in Image 2.

After I click OK, I get the message shown in Image 3.

Does this mean that the site will be deleted or the servers listed will be moved from that site?  I don’t want to do anything to change my sites so I click Cancel, then Cancel 2 more times to get out of this and get back to the CMC.

However, now, when I go into sites, SAN2 is listed in the secondary site, even though I cancelled out.  See Image 4.

And SAN2 is still sitting outside of the cluster with the ghosted machine still in there.  See Image 5.

And yes, if I click on the ghosted node, it shows the MAC as RIP0-xx:xx:xx, where the MAC is the same as SAN2.

oikjn
Honored Contributor
Solution

Re: Cache error and Node Repair

Before you attempt the exchange node action, you need to see SAN2 in the same site as the RIP node in the cluster.  You did not mention doing that step after joining the management group, and I think img2 is a warning about that.  The node you add must be in the same site as the missing node, or it will mess up the stripe pattern for redundancy and cause the system to do a total restripe, which is not ideal (understatement of the day).

 

That said, the warning in img4 references a single user...  that is strange, and I don't know if/what user preferences are site-specific.  I don't see my users with any site-specific settings.  I think as long as you are 100% sure of the site definitions for your systems in the management group, you should be fine to proceed with the node exchange.  The 100% critical point is that you must see the new node and the old node listed in the same site BEFORE you initiate the cluster exchange node action.

Once you start that action, the RIP node will show outside the cluster but inside the management group, and the new node will show in the cluster with a caution until it completes its restripe of each LUN.  Once that is done, the RIP node will disappear from the management group.
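
For anyone who wants the pre-checks from this thread written down as a checklist, here is a minimal sketch in Python. It does not talk to the CMC or any StoreVirtual API; the `Node` class, its fields, the `exchange_blockers` function, and the site name are all hypothetical, made up purely to model the conditions described above (RIP placeholder present in the cluster, replacement joined to the management group, both in the same site).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical model of a storage node as seen in the management console."""
    name: str
    site: Optional[str]          # site the node is assigned to, or None if unassigned
    in_management_group: bool
    in_cluster: bool
    is_rip: bool = False         # True for the ghost (RIP) placeholder


def exchange_blockers(ghost: Node, replacement: Node) -> List[str]:
    """Return reasons NOT to start the exchange yet; an empty list means go ahead."""
    problems = []
    if not (ghost.in_cluster and ghost.is_rip):
        problems.append("ghost node is not shown as a RIP placeholder in the cluster")
    if not replacement.in_management_group:
        problems.append("replacement node has not joined the management group")
    if replacement.in_cluster:
        problems.append("replacement node is already in the cluster")
    if replacement.site is None or replacement.site != ghost.site:
        problems.append("replacement node is not in the same site as the ghost node")
    return problems


# Example: the situation before the fix (site name is an assumption).
ghost = Node("RIP placeholder", "secondary", True, True, is_rip=True)
san2 = Node("SAN2", None, True, False)
print(exchange_blockers(ghost, san2))

# After assigning SAN2 to the same site, nothing blocks the exchange.
san2.site = "secondary"
print(exchange_blockers(ghost, san2))
```

The function returns a list of reasons instead of a single yes/no so that every unmet condition is reported at once, not just the first one.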

 

Side note:  the "SAN" is the entire network (i.e. the management group(s)); the systems aren't SANs, they are nodes inside the SAN. ;)

 

 

you do NOT want to add the node and if CMC

JMan1
Advisor

Re: Cache error and Node Repair

You are correct.  I didn't add it to the site after joining the management group.  The users in that image are servers but Lefthand is calling them users.

Yes, I know that they are nodes.  That is just the naming scheme that was chosen (for whatever reason), so we stuck with it.  If I had it to do over again, they would be named differently.

Sorry, didn't understand your last sentence.

JMan1
Advisor

Re: Cache error and Node Repair

Yes!!!  You rule!!!

The correct solution was to add it to the site before exchanging it.  It is restriping now.

My last question is now I have the ghost node sitting outside the cluster but still within the management group.  Will that go away on its own or do I need to remove that from the Management Group?

oikjn
Honored Contributor

Re: Cache error and Node Repair

Glad to hear it's working :)

 

Yes, it will go away on its own when the restripe is complete.