M and MSM Series

Re: Losing access to network

 
sjordet
Advisor

Re: Losing access to network

Well, I have found that the drivers of the WiFi card make a big difference. The MSM infrastructure seems to be more sensitive to bad WiFi drivers than others. I can't explain why, though.

 

I've had several versions of Intel Wifi drivers giving me a lot of headaches, even quite recent ones...

 

-Stian

Arimo
Respected Contributor

Re: Losing access to network

There are a lot of Intel 7240 WiFi chipsets going around. This one is known to have problems with any wireless network. Intel has published a fix. MSM also has a fix in firmware 6.4.2, along with other client connectivity fixes, including DHCP connectivity.

 

So if you're entitled, I'd suggest updating.


HTH,

Arimo
HPE Networking Engineer
Dmitry2012
New Member

Re: Losing access to network

I have the same issue, but I'm using one MSM720 with nine MSM410 APs. Does anyone know what to do about this problem?

ISHR
New Member

Re: Losing access to network

Hello,

We're facing very similar problems here at the International School in Hanover, Germany. We run MSM765 controllers with MSM422 and MSM466 APs, 74 APs in total. We have approximately 250 clients using the WiFi, of which 90% are MacBooks.

 

Our case ID is 4762319356

 

With kind regards,

ISHR

Break_dontfix
Occasional Visitor

Re: Losing access to network

I am also having this issue. I have read through the 6 pages of this thread and find that I am not alone. I have an MSM410 with NO controller, a standalone unit. I was running firmware 5.5 for about 3 years with NO problems (same clients, same hardware connecting to and using the AP, about 20 devices). After 3 years, I decided to upgrade the firmware so that I could connect with a browser (SSL v3 is not supported on newer OSes) and manage the device. Bad move! I upgraded about 10 days ago, and every 24 - 48 hours I get a call that our WiFi is dead. When I log into the AP, it seems to be up and running, but the people using it have lost connectivity. A restart (from the web panel or by disconnecting the PoE cable) cures the problem and life goes on.

There is only 1 VSC and wireless security is NOT checked.

Has anyone been able to make this work? The firmware is current at 6.6.2.0-22792.

 

Aarón
Frequent Advisor

Re: Losing access to network

Hello,

I'm having some big throughput issues. We recently added a 5th controller to our team with 100 extra APs, for a total of ~760 APs and peaks of 4000 concurrent users. On weekends it seems to be fine; we have an OpenWrt client taking measurements all the time to see how throughput, latency and so on are doing. The problem starts as soon as the number of users rises.

When about 1.5-2K simultaneous users are connected, the manager controller's CPU stays continuously at 90-100% usage until the number of users drops; then the CPU usage falls (it still has some spikes, though).

From what I gathered, the culprit seems to be a process called rrdsampler: it hogs the CPU and it is affecting the service. It is affecting the authentication process as well. I noticed that we have far more 802.1X timeouts than before, the throughput drops drastically, and ping loss and latency increase. This happens even on an AP without many users, where the total throughput on the AP's Ethernet port is very low.

No significant interference was detected. I went on site with a spectrum analyzer to check whether it could be an RF issue, but I didn't find any problems, just a nearby AP on a different channel, 5 channels away, so there is no channel overlap.
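As a rough check of the "5 channels of difference" reasoning, here is a minimal sketch. It assumes the APs are on the 2.4 GHz band with 20 MHz wide channels, which the post does not state; the channel numbers 1 and 6 and the function names are only illustrative.

```python
def center_mhz(channel):
    """Center frequency in MHz of a 2.4 GHz Wi-Fi channel (1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("expected a 2.4 GHz channel between 1 and 13")
    return 2407 + 5 * channel


def channels_overlap(a, b, width_mhz=20.0):
    """True if two channels of the given width share spectrum.

    Assumption: both radios use the same channel width.
    """
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


if __name__ == "__main__":
    # Five channels apart means 25 MHz between centers.
    print(center_mhz(6) - center_mhz(1))
    print(channels_overlap(1, 6))
    print(channels_overlap(1, 4))
```

Adjacent 2.4 GHz channels are 5 MHz apart, so a gap of 5 channels puts the centers 25 MHz apart, which is wider than a 20 MHz channel. On 5 GHz the channel numbering differs and this arithmetic does not apply.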

I know that RRDtool is used for graphing and storing statistics, so maybe the issue here is that it tries to collect too many statistics from each and every one of the users. When there are few users it's OK, but when that number spikes it just doesn't work.

We are running 6.6.2.0 and we have 3 VSCs. Two of them are tunneled through the controllers, but the third one is not tunneled (it sends the traffic directly from the AP to a VLAN tagged on its port). We are not using the team for access control, just for authentication through an external RADIUS server.

Our configuration is like this:

 - We have the lower data rates disabled (only 11 Mbps or higher is allowed) to ensure a good connection for each user.

 - RRM enabled with auto-channel, auto-power and AP load balancing.

 - Tx protection -> RTS/CTS with an RTS threshold of 1024 to mitigate the hidden node problem (we took measurements to see whether this affected the overall throughput, and it didn't seem to affect it much).

I have already opened a case with support, but I would like to know whether anyone else is experiencing the same issues, mostly the rrdsampler process issue. If you want to check whether the process is hogging the CPU, SSH into the controller/AP and type top.

 

Thanks!

Aarón

CraigStrydom71
Occasional Advisor

Re: Losing access to network

Hi Aaron,

Do you have LLDP enabled?

Disabling LLDP dropped our CPU usage from 90-100% to 27-60%.

I have 4x MSM760 teamed with 400x MSM460 APs on software version 6.6.2.0.

Regards,

Craig.

Aarón
Frequent Advisor

Re: Losing access to network

No, I haven't. I will try that and let you know how it goes, thanks!

Aarón
Frequent Advisor

Re: Losing access to network

I just disabled LLDP, but the CPU usage is still very high. Here is the output of the top command:

Mem: 2244736K used, 862380K free, 0K shrd, 315404K buff, 722896K cached
Load average: 3.51, 3.73, 3.70    (State: S=sleeping R=running, W=waiting)

  PID USER     STATUS   RSS  PPID %CPU %MEM COMMAND
25481 root     R        15M   449 91.5  0.5 rrdsampler ---> That is the process that hogs the CPU
 5853 root     R       141M   449 19.1  4.6 rfmgr_sc
  478 root     S       736M   449  9.5 24.2 regng
  815 root     S       3088   449  4.0  0.0 openvpn_master
  728 root     S        36M   449  2.7  1.2 openvpn
  452 root     S       6228   449  2.7  0.2 store-devices
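For anyone who wants to compare their own controller, here is a minimal sketch that picks the CPU hogs out of top output in the column layout shown above. The function names and the 50% threshold are my own choices for illustration, and it assumes you have saved the top output to text yourself (for example by copying it from the SSH session).

```python
def parse_top(text):
    """Parse process rows laid out as: PID USER STATUS RSS PPID %CPU %MEM COMMAND."""
    rows = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["PID", "USER"]:
            in_table = True  # process rows start after this header
            continue
        if not in_table or len(fields) < 8 or not fields[0].isdigit():
            continue
        rows.append({
            "pid": int(fields[0]),
            "user": fields[1],
            "cpu": float(fields[5]),
            "mem": float(fields[6]),
            "command": fields[7],
        })
    return rows


def cpu_hogs(text, threshold=50.0):
    """Return (command, %CPU) for processes at or above the threshold."""
    return [(r["command"], r["cpu"]) for r in parse_top(text) if r["cpu"] >= threshold]


# Sample rows copied from the output above.
SAMPLE = """\
  PID USER     STATUS   RSS  PPID %CPU %MEM COMMAND
25481 root     R        15M   449 91.5  0.5 rrdsampler
 5853 root     R       141M   449 19.1  4.6 rfmgr_sc
  478 root     S       736M   449  9.5 24.2 regng
"""

if __name__ == "__main__":
    print(cpu_hogs(SAMPLE))  # [('rrdsampler', 91.5)]
```

Taking several samples over the day and comparing them against the user count would show whether rrdsampler tracks the load in the same way on other installations.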

Also, here is a screenshot where you can see how the number of users increases (top graph), the manager controller's CPU usage increases as well (middle graph) and the bandwidth report decreases (last graph). It happens every day except on weekends, when the bandwidth is far better and more consistent.

I added a second graph with a one-week timespan where you can see the behaviour I mean.

 

Any ideas? I'll keep posting as I get more info, along with any news from support.

 

Regards,

Aaron

CraigStrydom71
Occasional Advisor

Re: Losing access to network

Another setting to check may be IGMP proxy under Home -> Network -> IGMP.
I have seen it mentioned together with high CPU.

Also, RADIUS accounting really adds a lot of processing. Disable it on the VSC if you do not use it.

Perhaps disable RRM as a test.

Check whether you have severe interference checks enabled on the radios; that also made my CPU usage higher.

Will post again if I think of anything else.

Regards,

Craig.