M and MSM Series

Re: Losing access to network

 
Aarón
Frequent Advisor

Hello Craig,

this is how we have it configured:

 - IGMP is disabled

 - We need RADIUS for authentication

 - We need RRM enabled as well so the APs are assigned to channels in a way that makes sense. Before we used it, we had issues due to RF interference between our APs.

 - The severe interference setting is disabled. We had a case prior to this one that showed the APs were hopping between channels constantly, which prevented RRM from running because those APs weren't in a "stable condition".

I still think it's just a graphing problem. Let's see what support says about it.

 

Regards,

Aarón

CraigS1971
Valued Contributor

Re: Losing access to network

Hi Aaron,

RADIUS authentication does not need accounting unless you use it for something like bandwidth limits.
I disabled RADIUS accounting and users are still happily authenticating. ;-D

I also enabled RRM for auto channel and auto power, but then switched it off because the environment should not change all the time. You can perhaps disable it and run it manually once a week/month.

I would like to know what HP support says.

Hope you win.

Regards,

Craig

Aarón
Frequent Advisor

Re: Losing access to network

Hi Craig,

my bad, I read your prior post incorrectly. We only do authentication with a remote RADIUS server, no accounting whatsoever.

We have RRM enabled so that it runs automatically every night at 5:00 AM. Maybe it would be better to run it just once a week instead of every single day. I'll wait to see what HP says.

As soon as I get any news on the case I'll post back.

 

Thanks again!

Aarón

PS: I hope I win too :D

Aarón
Frequent Advisor

Re: Losing access to network

Hi guys,

so I have some news on the case. The rrdsampler is a process that just manages the dashboard: it retrieves some info and then feeds the web dashboard.

This process was running at a higher priority than it should, so when many users were connected it was gathering information from all the APs plus the client statistics. When that happened, the controller had no CPU left to do authentication, and even if a client could connect properly, the bandwidth was very poor.

The temporary fix is to kill that process. It only affects the dashboard, so no harm there.

After killing it, the wireless complaints seem to have decreased, but I'm still troubleshooting a couple of them.

Just out of curiosity, how are your RRM results? I've been checking ours and some decisions do not make sense, like having three nearby APs on the same channel even though the neighboring channels have a low noise floor.
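One way to sanity-check RRM decisions like the one described above is to compare the channel assignments against a list of which APs can hear each other. The following is only a rough sketch: the AP names, channel numbers and neighbor lists are made-up sample data, and the function name is hypothetical. On a real deployment the inputs would have to come from whatever channel and neighbor report the controller provides.

```python
def co_channel_neighbors(channels, neighbors):
    """Map each AP to the sorted list of its nearby APs that share its channel."""
    result = {}
    for ap, channel in channels.items():
        same = sorted(n for n in neighbors.get(ap, ()) if channels.get(n) == channel)
        if same:
            result[ap] = same
    return result


# Hypothetical sample data: channel per AP and which APs are near each other.
channels = {"ap-1": 1, "ap-2": 1, "ap-3": 1, "ap-4": 6, "ap-5": 11}
neighbors = {
    "ap-1": ["ap-2", "ap-3", "ap-4"],
    "ap-2": ["ap-1", "ap-3"],
    "ap-3": ["ap-1", "ap-2", "ap-5"],
    "ap-4": ["ap-1"],
    "ap-5": ["ap-3"],
}

conflicts = co_channel_neighbors(channels, neighbors)
for ap, same in conflicts.items():
    print(f"{ap} (channel {channels[ap]}) shares its channel with nearby {', '.join(same)}")
```

With this sample data, ap-1, ap-2 and ap-3 are each flagged as sharing channel 1 with two nearby APs, which is the kind of result worth raising with support or checking against an RF survey.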

Evan_ISS
Frequent Advisor

Re: Losing access to network

I tend to trust my RF surveys more than RRM. I only use it at small sites (fewer than 20 APs).

NCGnet
Advisor

Re: Losing access to network

Interesting post, and exactly the same issue we are having with 4 MSM760s and 480 APs. In top I can see the rrdsampler process peaking at 90-100% CPU usage. One question, however: how do you kill the rrdsampler process? I tried from the CLI, but I think I need shell access, and there seems to be a challenge that I don't know how to answer.
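For anyone who wants to track this over time rather than watch top by hand, captured top output can be scanned for processes above a CPU threshold. This is a minimal sketch under assumptions: the sample text below is invented, and the column layout on the controller may differ, so the function (a hypothetical helper) locates the %CPU and COMMAND columns from the header row instead of hard-coding positions.

```python
def busy_processes(top_output, threshold=80.0):
    """Return (command, cpu_percent) pairs at or above threshold, busiest first."""
    rows = [line.split() for line in top_output.splitlines() if line.strip()]
    # Find the header row so column positions are not hard-coded.
    header_idx = next(
        (i for i, cols in enumerate(rows) if "%CPU" in cols and "COMMAND" in cols),
        None,
    )
    if header_idx is None:
        return []
    header = rows[header_idx]
    cpu_col = header.index("%CPU")
    cmd_col = header.index("COMMAND")
    found = []
    for cols in rows[header_idx + 1:]:
        if len(cols) <= max(cpu_col, cmd_col):
            continue  # short or malformed row
        try:
            cpu = float(cols[cpu_col].rstrip("%"))
        except ValueError:
            continue  # not a numeric CPU value
        if cpu >= threshold:
            found.append((cols[cmd_col], cpu))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Invented sample output; real values and columns will differ.
SAMPLE = """\
  PID USER      PR  NI  %CPU %MEM COMMAND
  812 root      20   0  93.5  4.1 rrdsampler
  640 root      20   0  12.0  2.3 snmpd
  101 root      20   0   1.5  0.8 otherproc
"""

for command, cpu in busy_processes(SAMPLE):
    print(f"{command}: {cpu:.1f}% CPU")
```

Lowering the threshold, for example `busy_processes(SAMPLE, threshold=10.0)`, would also list snmpd in this sample. A log of such readings taken during busy hours can be useful evidence to attach to a support case.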

CraigS1971
Valued Contributor

Re: Losing access to network

Hi NCGnet,

Did you try to disable RRM, LLDP and IGMPproxy?

I did not need to kill rrdsampler.

 

Aarón
Frequent Advisor

Re: Losing access to network

Hi NCGnet,

this issue is related to firmware version 6.6.2.0 in environments with a high client volume (in our case the problem showed up at approximately 1500 users), and it seems to be because the rrdsampler process has a higher priority than it should. If you see that the load on your manager is very high (you can check this with the top command in the CLI), you should open a ticket with support to see if they can help, as you can't kill that process yourself.

 

Cheers,

Aarón

NCGnet
Advisor

Re: Losing access to network

Hi and thanks for the replies,

Every weekday we have between 2000 and 2800 concurrent guest connections, and this is when we see the same issues. However, our controllers are at V6.6.3.0-22868, and I still see the same issues as with V6.6.2.0, which we used to run.

I have opened a call with HPE regarding this. We run HP iMC with the WSM module and used to run SolarWinds Orion NPM. We recently switched off Orion, and the CPU usage of the snmpd process, which had been hogging a lot of CPU time, dropped drastically. I thought we had nailed it then, but rrdsampler seems to be another culprit. We rarely look at the dashboard on the controllers, so losing it would be no great loss.

I'll try turning off LLDP and CDP first, and if that doesn't help I will log another call with HPE and see where we get.

Many thanks for the help!

Rob

NCGnet
Advisor

Re: Losing access to network

Hi,

We don't run RRM, as it caused more problems than it resolved when we tried it. IGMPproxy is already off too. I have just disabled LLDP and CDP to see how that affects things.

Thanks for your reply

Rob