
Carsten Reinhard
Frequent Advisor

Problem with AP 420wl

Hi all,

In the last two weeks, 13 of our 65 APs lost their configuration without warning. It happened on different dates and at different times; all were reset to factory defaults. All APs are connected to ProCurve 2626PWR switches and use PoE. I'm running firmware version 2.1.2. I did not receive any syslog messages from the APs (level set to error).

I do not believe that someone hacked the APs, nor that someone pushed the reset button on them. Any idea what could cause this problem?

16 REPLIES
Mohieddin Kharnoub
Honored Contributor

Re: Problem with AP 420wl

Hi

Would you please update to firmware 2.1.5?
If you have a proper site installation and channel assignment, please disable auto channel and assign static channels (use channels 1, 6 and 11 to prevent interference).
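
A quick sketch of why channels 1, 6 and 11 are the usual picks: the 2.4 GHz channel centers sit 5 MHz apart, while an 802.11b channel is nominally 22 MHz wide, so only channels at least five numbers apart stay clear of each other. The function names below are illustrative and not part of any AP tool.

```python
# Nominal 802.11b (DSSS) channel width; an assumption for this sketch.
CHANNEL_WIDTH_MHZ = 22


def center_mhz(channel: int) -> int:
    """Return the center frequency in MHz for 2.4 GHz channels 1-13."""
    if not 1 <= channel <= 13:
        raise ValueError("channel must be between 1 and 13")
    return 2407 + 5 * channel


def channels_overlap(a: int, b: int) -> bool:
    """True if two 22 MHz-wide channels share spectrum."""
    return abs(center_mhz(a) - center_mhz(b)) < CHANNEL_WIDTH_MHZ
```

With this, channels_overlap(1, 6) and channels_overlap(6, 11) are False, while a pair such as 1 and 4 overlaps.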

Use NetStumbler to check neighboring channels.

Use proper security such as WPA or WPA2.

Don't forget to assign points; it's the way to say thanks :)

Good Luck !!
Science for Everyone
Carsten Reinhard
Frequent Advisor

Re: Problem with AP 420wl

Hi Mohieddin,

Since the firmware has been running perfectly for 8 months, I do not see why I should update, even if this is the first advice you get from HP Support. The exception would be if the problem were mentioned in the release notes (but this problem is not).

As for security and channel setup, I do not see any correlation between that setup and my problem. And since the setup was done by HP, it should be correct.
Matt Hobbs
Honored Contributor

Re: Problem with AP 420wl

Hi Carsten,

It truly sounds like the APs simply lost their configuration for no real reason.

Was there anything in the 2626PWR event log? I'm expecting some PD entries, probably not much else.

You can disable the reset button ('no reset button') just to rule out the chance that someone has gone around and held in the reset button. I'm wondering whether you'd see an entry on your syslog server anyway if someone did press and hold the reset button; it might be worthwhile trying. You could also try a 'reset configuration' to see if that sends an event. For this test you'd probably want to set the logging level to debug just to make sure you'd get the event.

If both send an event, then you'd know that in this case the APs lost their configuration without any user intervention, as you did not see any entries on your syslog.
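
If no syslog server is handy for that test, a throwaway listener is enough to see whether anything arrives at all. This is only a minimal sketch, assuming the AP sends plain BSD-style syslog over UDP; port 5514 is an arbitrary unprivileged choice made here (the standard syslog port is 514 and usually needs admin rights), and the function names are made up for illustration.

```python
import re
import socket

# Severity names for syslog codes 0-7.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]
_PRI = re.compile(r"^<(\d{1,3})>")


def parse_pri(message: str):
    """Split a BSD-syslog line into (facility, severity name, rest).

    Returns None if the line has no valid <PRI> prefix.
    """
    match = _PRI.match(message)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid value: facility 23, severity 7
        return None
    return pri // 8, SEVERITIES[pri % 8], message[match.end():]


def listen(host: str = "0.0.0.0", port: int = 5514) -> None:
    """Print sender address and parsed content of every datagram received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, (sender, _) = sock.recvfrom(4096)
            print(sender, parse_pri(data.decode("utf-8", errors="replace")))
```

Calling listen() and then pressing the reset button or issuing 'reset configuration' on a test AP would show whether either action produces a message, and at which severity.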

Matt
Carsten Reinhard
Frequent Advisor

Re: Problem with AP 420wl

Hi Matt,

Thanks for your reply. I do not think that someone reset the APs to default, because they are mounted at a height of 6-8 meters in our production facility, but I will disable the reset button anyway.

To get better monitoring, I would like to send all possible SNMP messages from all APs to my Systems Insight Manager. But this is not working for an unknown reason (see the SIM forum if you are interested). The PoE switch just sent some notifications that the port went down and was blocked by LACP. After a few seconds it came up again (looks like a normal restart).

But even if there was a short power fault (although the PoE switch is connected to a UPS), why should this reset the AP? It has flash memory, and normally they come up even after being powered down for hours.
Matt Hobbs
Honored Contributor

Re: Problem with AP 420wl

Hi Carsten,

I just found out today that you can reset these APs through SNMP, so I'd recommend you change the SNMP community names on these APs and possibly stop your SIM server from polling them, as it may be what's causing this to happen.

Matt

Don't forget to assign points to any replies you receive.

Jason C Carter
Occasional Advisor

Re: Problem with AP 420wl

We have been having this same problem for the past 7 months now. It repeatedly happens to several hundred of our access points and we are extremely frustrated. The resets occur in waves lasting about a week, roughly every month.
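
For anyone wanting to check whether their own resets cluster the same way, the grouping can be sketched in a few lines: sort the times at which APs were found at factory defaults and start a new wave whenever the gap to the previous reset exceeds a threshold. The three-day threshold and the function name are assumptions made for this sketch, not something measured on these networks.

```python
from datetime import datetime, timedelta


def group_into_waves(timestamps, max_gap=timedelta(days=3)):
    """Group reset times into waves; a gap above max_gap starts a new wave."""
    waves = []
    for ts in sorted(timestamps):
        if waves and ts - waves[-1][-1] <= max_gap:
            waves[-1].append(ts)  # close enough to the previous reset
        else:
            waves.append([ts])  # first reset, or gap too large: new wave
    return waves
```

Feeding in the reset times and looking at the first and last entry of each wave gives the wave length and the spacing between waves, which can then be lined up against scheduled jobs, polling cycles or firmware changes.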

It has happened on 420's in every building we have, connected to switches from several different manufacturers including HP, both on PoE and directly plugged in. It happens on 2.1.2, 2.1.4, and 2.1.5 firmware. We thought it might be a hacker at one point, so we switched to SNMPv3 and changed all credentials, but the problems continued. It happens on 420's we purchased 2 years ago and on ones that are brand new. We've set up test beds flooding 420's with traffic trying to get them to do this, but they stay solid. We've had this problem regardless of whether or not they were on our HP700wl system.

We're at a total loss, and HP claims that we are the only ones with this problem, which I find hard to believe. Has anyone else dealt with this?

One question I do have for you: are you running ProCurve Manager? We are, and would like to know if that is a factor.
Jason C Carter
Occasional Advisor

Re: Problem with AP 420wl

I almost forgot: the 420's are almost all mounted above the ceiling, and with dozens sometimes dropping at the same time it would be nearly impossible for this to be someone manually resetting them. We've also set the logging to debug, configured a syslog server for all of them, and never seen a problem logged. One minute they're reporting, the next minute they're gone. None of the switches have ever reported an error on the corresponding ports.
Mohieddin Kharnoub
Honored Contributor

Re: Problem with AP 420wl

Hi Jason

You are troubleshooting at a very critical stage.

I would suggest a silly thing.

Remove one of your APs, let it sit next to you, isolate it from PoE, and power it with a normal power adapter. Then delete it from ProCurve Manager, and if you assign its IP through DHCP, give it a static IP instead.

Do whatever you feel will narrow down the issue. For example, if this AP is connected to an old unmanaged switch, connect it to another one. You know your network better than we do, so do whatever it takes to isolate the problem.

Then keep an eye on your AP and try to find out what's up.

Please let us know how it goes; it concerns us as much as it does you.

Good Luck !!!
Science for Everyone
Mohieddin Kharnoub
Honored Contributor

Re: Problem with AP 420wl

One more thing.

Would you try disabling auto channel and selecting one fixed channel?

Science for Everyone
Carsten Reinhard
Frequent Advisor

Re: Problem with AP 420wl

Jason,

Thank you for your reply. It feels so good not being alone with such a problem :-)

After I reported the problem here, no other AP has "died". So yes, the problem occurred in one single wave.

Like you, we are using PCM. I think version 2.0 is installed for management. As we have some problems with PCM (automatically downloaded text configurations are assigned to the wrong APs), this is perhaps another problem with this software. Which version do you use?

As you said, you seem to have a lot of APs. So perhaps you can check (with HP) the PCM log files to see whether it has killed your APs?!

I will report the general problem in August after my holidays. After that you will not be the only one at HP with this issue.

Greetings from Djerba.
Carsten Reinhard
Frequent Advisor

Re: Problem with AP 420wl

Mohieddin,

I still do not understand your strategy for solving this problem. If you have, as in our case, 65 APs and some (15) of them died without warning, how should I know which AP will die next, so that I can place it on my desk and run all the tests you suggest? And Jason has hundreds of them.

And again, why should it be a solution to disable auto-channel? In fact, I have never used it and I have/had the problem.
Jason C Carter
Occasional Advisor

Re: Problem with AP 420wl

With the exception of our mobile labs, all of our devices have hard set channels and have been checked for overlaps. All but a few (2-3 used for testing and training between subnets) are on static IPs.

Our 420's are on a mix of PoE and direct power adapters; the problem does not discriminate. They are also on a wide range of switches: some HP2626's (PoE and non-powered), HP4800's, Cisco 3500's, Allied Telesyn managed and unmanaged, some even on hubs I believe. Once again, the problem doesn't discriminate.

We have created a simulated 'building' on our network just for testing this problem, with a server and 2 420's on their own subnet. We've done everything we can think of to duplicate this problem, including creating loopbacks without spanning tree and targeting the 420's specifically with traffic generators, all with no success.
Jason C Carter
Occasional Advisor

Re: Problem with AP 420wl

Sorry Carsten, forgot to reply to your question. We have analyzed the PCM logs and we are running the latest version with all updates. We could not find any indication that PCM is causing this. We sent the files to HP and they did not report finding anything either. As a precaution we took our PCM server (the whole machine) offline for several weeks to see if the problem continued without it running, and indeed it did.
Carsten Reinhard
Frequent Advisor

Re: Problem with AP 420wl

Jason,

As we have a very homogeneous network with only HP switches, and only use HP ProCurve 2626PWR to power the APs, this seems to be an HP problem to me.

But talking about (management) software: as I'm in a migration phase from our old servers to new HP blades, I installed HP SIM some weeks before the problem occurred the first time. Do you use SIM as well? And if yes, which version do you use?
Jason C Carter
Occasional Advisor

Re: Problem with AP 420wl

I'm afraid we don't use SIM so that's most likely a dead end. I think switches can be eliminated as well due to the broad array of different switches we've had failures on.

One theory here is that this problem comes from a cocktail of a 420 problem mixed with a thus far undetected problem on our network.

The other theory is that there is a problem with either the 3.0.6 bootrom or the 2.1.x firmware itself. We never had any trouble (running almost 2 years) until we upgraded. For this to be the case we'd have to assume that most of HP's customers either haven't upgraded yet or have few enough 420's that they can fix it quickly without concerning themselves with contacting HP.

We're into our 7th month of dealing with this problem almost weekly, I'll post if any progress is made.
Carsten Reinhard
Frequent Advisor

Re: Problem with AP 420wl

Jason,

I received some new beta firmware from HP concerning this problem. As it occurs very rarely on our side, I would like to ask whether you have already received and tried that firmware.

Greetings Carsten