09-25-2004 12:38 AM
16-port 10/100/1000 Module (J4907A) failure
I recently received a bunch of 5308xl switches with this gig module, and two of them were failing on my test bench. I was testing the power supply redundancy by unplugging and replugging one of the two power cords, and sometimes a module would fail.
At no time was any trap sent, nor was there anything in the syslog. A soft boot would not bring back the module but would log the failure locally. An issue with the OS prevents boot-time errors from being sent as traps. A cold boot would resurrect the module.
HP replaced both modules and both chassis and I have not been able to reproduce the problem since.
The switches are still on the test bed and will not go in production until some software notification issues are resolved. I still have a nagging concern about the lack of notification when the module failed as that points to an OS issue that hardware replacement does not solve.
09-25-2004 12:50 AM
Re: 16-port 10/100/1000 Module (J4907A) failure
I have put off deploying these seven 5308xl switches until we get to the bottom of the module failures and I see some progress on the lack of notification issue. Meanwhile, I am testing them further in the lab and have encountered a couple more issues for which I have separate threads.
09-26-2004 07:43 PM
Re: 16-port 10/100/1000 Module (J4907A) failure
Out of curiosity, what firmware are you running and do the J4907A modules fail in any slot or specific slots?
Regards,
SCOOTER
09-27-2004 03:54 AM
Re: 16-port 10/100/1000 Module (J4907A) failure
I am running the latest (E.08.42) software.
As I had mentioned, they fail only sometimes. On one switch it was in slot E and on the other it was in slot C. I was instructed by HP support to move the modules to a different slot and never saw a recurrence of the failures under the same conditions. Not satisfied that simply moving them was the solution to the problem, I pressed HP support and they suggested I put them back in their original slots for more testing. When I did, they failed immediately on boot. HP had me ship both modules and both chassis back to them, so I did no further testing.
Now a third module on yet another switch failed in slot E as well, but this one failed on boot and not during RPS testing (I did manage to ferret out a faulty RPS, though, but that is yet another topic). I am still working with HP on the issue and will do more tests to see if it will also fail under RPS cycling.
It is this random intermittent failing that really concerns me because there is no trap sent when it happens. I have yet another incident (2 actually) open with HP on the no trap notification issue.
09-27-2004 07:03 AM
Re: 16-port 10/100/1000 Module (J4907A) failure
Just an idea I had while reading the thread!
Maybe one good thing you can do with this switch is put a UPS on it. If your power failures are short, you can buy a small UPS that does not cost a lot, and then you avoid the failure.
It's only a patch on the problem, but while you wait for HP, you can put the switches into production sooner.
09-28-2004 12:55 AM
Re: 16-port 10/100/1000 Module (J4907A) failure
I grabbed a 5308 running E.08.42.
2 power supplies
Slot A J4907A module
Slot B J4820A module
Slot E J4907A module
Configured only a Static IP to check eventlog.
Booted the switch, all fine.
Power-cycled PS 1, all fine.
Power-cycled PS 2, all fine.
Removed PS 1 and reinserted it, all fine.
Removed PS 2 and reinserted it, all fine.
Removed all power and booted, all fine.
Moved the J4907A modules around in the empty slots; every slot recognized the module, and no faults occurred on the slots or modules.
Performed this procedure twice without any problems.
FYI:
S/N Switch SG419JZ030
S/N J4907A SG421PM0IU
S/N J4907A SG421PM040
Sorry, I could not reproduce your problem.
Kind regards,
SCOOTER
09-28-2004 01:40 AM
Re: 16-port 10/100/1000 Module (J4907A) failure
I did not expect anyone except perhaps HP to go through that level of testing. I am not surprised that it would not fail for you. Most times it would not fail for me either!
This is a very obscure failure that is almost impossible to deliberately reproduce. It only manifests itself rarely but I have my eyes as a witness and show techs (after a warm boot) to prove it.
show running-config
; J4819A Configuration Editor; Created on release #E.08.42
hostname "MO PBX HP 5308xl"
snmp-server contact "Les Ligetfalvy"
snmp-server location "Main Office PBX Room"
time timezone -300
module 1 type J4907A
module 2 type J4907A
module 3 type J4907A
module 4 type J4907A
module 5 type J4907A
module 6 type J4907A
module 8 type J4907A
module 7 type J4907A
...
show modules
Status and Counters - Module Information
Slot Module Description Serial Number
----- ------------------------------------ --------------
A HP J4907A XL Gig-T/GBIC module SG425PM0X6
B HP J4907A XL Gig-T/GBIC module SG425PM0ZM
D HP J4907A XL Gig-T/GBIC module SG425PM0PS
E HP J4907A XL Gig-T/GBIC module SG425PM0M0
F HP J4907A XL Gig-T/GBIC module SG425PM0LG
...
PowerDsine Show:
Slot 3
CRASHLogfileshow
slot 3:
-------
ERROR: slot 3 not ready
CRASHData
slot 3:
-------
ERROR: slot 3 not ready
poe_status_port all
slot 3:
-------
ERROR: slot 3 not ready
pdshow
slot 3:
-------
ERROR: slot 3 not ready
...
W 08/24/04 09:26:53 snmp: SNMP Security access violation from 10.198.10.12
I 08/24/04 09:26:54 tftp: Transfer completed
M 08/24/04 09:26:57 sys: 'Config updated via network tftp'
I 08/24/04 09:26:57 system: --------------------------------------------------
I 08/24/04 09:26:57 system: System went down: 08/24/04 09:26:57
I 08/24/04 09:26:57 system: Config updated via network tftp
I 08/24/04 09:27:02 lacp: Passive Dynamic LACP enabled on all ports
I 08/24/04 09:27:07 chassis: Slot A Inserted
I 08/24/04 09:27:07 chassis: Slot B Inserted
I 08/24/04 09:27:07 chassis: Slot C Inserted
I 08/24/04 09:27:07 chassis: Slot D Inserted
I 08/24/04 09:27:08 chassis: Slot E Inserted
I 08/24/04 09:27:08 dhcpr: DHCP relay agent feature enabled
I 08/24/04 09:27:08 chassis: Slot F Inserted
W 08/24/04 09:27:08 chassis: Power Supply failure: Supply: 2, Failures: 1
I 08/24/04 09:27:08 chassis: Slot A Downloading
I 08/24/04 09:27:08 tftp: Enable succeeded
I 08/24/04 09:27:08 system: System Booted.
I 08/24/04 09:27:08 cdp: CDP enabled
I 08/24/04 09:27:08 chassis: Slot B Downloading
I 08/24/04 09:27:09 chassis: Slot D Downloading
I 08/24/04 09:27:09 chassis: Slot E Downloading
I 08/24/04 09:27:09 chassis: Slot F Downloading
I 08/24/04 09:27:10 chassis: Slot A Download Complete
I 08/24/04 09:27:10 chassis: Slot B Download Complete
I 08/24/04 09:27:10 chassis: Slot D Download Complete
I 08/24/04 09:27:11 chassis: Slot E Download Complete
I 08/24/04 09:27:11 chassis: Slot F Download Complete
I 08/24/04 09:27:25 chassis: Slot A Ready
I 08/24/04 09:27:26 chassis: Slot F Ready
I 08/24/04 09:27:26 chassis: Slot E Ready
I 08/24/04 09:27:26 chassis: Slot D Ready
I 08/24/04 09:27:26 chassis: Slot B Ready
W 08/24/04 09:27:27 chassis: Module in Slot C not Supported or may be Faulty
I obviously left out a lot of the show tech. I am starting to wonder if my "SNMP Security access violation" issue (another thread) may be overflowing a buffer (stack) and the change of RPS status is what triggered it. This one may be a tough nut to crack.
As for the UPS suggestion, I would never consider putting cheapy UPSes on the switches but I do have expensive redundant UPSes in most of my rack rooms. Cheap UPSes are generally unmanaged and yet another point of failure.
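Since the switch logs the fault locally but sends no trap, the event log excerpt in the post above can at least be checked mechanically. The following is only a minimal Python sketch, not an HP tool: it assumes the line layout seen in the excerpt (severity, date, time, subsystem, message) and slot letters A to H, and the function names `slot_states` and `slots_not_ready` are made up for illustration. It reports any slot that logged chassis events but never reached "Ready".

```python
import re

# Line layout as seen in the excerpt:
#   <severity> <MM/DD/YY> <HH:MM:SS> <subsystem>: <message>
# The excerpt shows severities I, W and M; any capital letter is accepted here.
LINE_RE = re.compile(r"^([A-Z]) (\d\d/\d\d/\d\d) (\d\d:\d\d:\d\d) (\w+): (.*)$")
# Assumption: slots are lettered A to H.
SLOT_RE = re.compile(r"\bSlot ([A-H])\b(.*)")


def slot_states(log_text):
    """Return {slot: [state, ...]} in log order for chassis slot events."""
    states = {}
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m or m.group(4) != "chassis":
            continue  # not a log line, or not a chassis event
        s = SLOT_RE.search(m.group(5))
        if s:
            states.setdefault(s.group(1), []).append(s.group(2).strip())
    return states


def slots_not_ready(log_text):
    """Slots that logged at least one chassis event but never logged 'Ready'."""
    return sorted(slot for slot, seen in slot_states(log_text).items()
                  if "Ready" not in seen)


# A few lines copied from the excerpt above (abridged).
SAMPLE = """
I 08/24/04 09:27:07 chassis: Slot A Inserted
I 08/24/04 09:27:07 chassis: Slot C Inserted
W 08/24/04 09:27:08 chassis: Power Supply failure: Supply: 2, Failures: 1
I 08/24/04 09:27:08 chassis: Slot A Downloading
I 08/24/04 09:27:10 chassis: Slot A Download Complete
I 08/24/04 09:27:25 chassis: Slot A Ready
W 08/24/04 09:27:27 chassis: Module in Slot C not Supported or may be Faulty
"""

print(slots_not_ready(SAMPLE))  # prints ['C']
```

Run against the full excerpt, the same check would single out slot C, the only slot that was inserted but never downloaded or became ready.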
09-28-2004 05:57 PM
Re: 16-port 10/100/1000 Module (J4907A) failure
I don't know why, but my feeling is that SCOOTER _IS_ connected to HP somehow.
Right, SCOOTER? C'mon, you can tell us!!!
09-29-2004 11:38 AM
Re: 16-port 10/100/1000 Module (J4907A) failure
I made a typo when I said "Now a third module on yet another switch failed in slot E". It was slot C, and I have not been able to reproduce the error.
I did get word today from HP on the "trap notification failure on boot" issue and it sounds like I may be getting a fix for Christmas.
10-21-2004 12:00 PM
Re: 16-port 10/100/1000 Module (J4907A) failure
Yesterday I took all my switches up to 8.50 and decided I would take them from my test lab and press them into service since I have not been able to repro the module faulting mentioned above.
Well, today I mounted three of them into racks and meshed them together. I did not connect any servers or clients yet but did connect them to my Cisco. They were not up for an hour when a module faulted.
One of the three locations (L3) has only a single UPS so one of the two RPSs was connected to raw town power. There was a scheduled town power outage today and the one (and only) UPS dumped. When that happened, the switch at L3 rebooted and faulted module C. While there is no knowing where in this chain of events the module faulted, there was no trap thrown nor was there any syslog entry.
I guess my old Cisco core and Nortel edge switches will be around for a while longer.
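Until the switch itself raises a trap for a faulted module, one stopgap is to compare what the configuration expects with what `show modules` actually lists. The Python below is only a sketch under stated assumptions: it works on saved text output rather than talking to the switch, it assumes `module N` in the config corresponds to the Nth slot letter (1 = A, 3 = C, which matches "slot 3 not ready" and "Slot C" in the earlier show tech), and the function names are hypothetical. The sample is abridged to slots A through F because the `show modules` listing posted earlier is cut off with "...".

```python
import re
import string


def configured_slots(running_config):
    """Slot letters named by 'module N type ...' lines (assumes module 1 = slot A)."""
    slots = set()
    for line in running_config.splitlines():
        m = re.match(r"\s*module (\d+) type \S+", line)
        if m:
            slots.add(string.ascii_uppercase[int(m.group(1)) - 1])
    return slots


def present_slots(show_modules):
    """Slot letters that appear as rows in the 'show modules' table."""
    slots = set()
    for line in show_modules.splitlines():
        m = re.match(r"\s*([A-H])\s+HP \S+", line)
        if m:
            slots.add(m.group(1))
    return slots


def missing_slots(running_config, show_modules):
    """Configured slots that 'show modules' does not list."""
    return sorted(configured_slots(running_config) - present_slots(show_modules))


# Abridged from the show tech posted earlier in the thread.
RUNNING_CONFIG = """
module 1 type J4907A
module 2 type J4907A
module 3 type J4907A
module 4 type J4907A
module 5 type J4907A
module 6 type J4907A
"""

SHOW_MODULES = """
Slot Module Description Serial Number
----- ------------------------------------ --------------
A HP J4907A XL Gig-T/GBIC module SG425PM0X6
B HP J4907A XL Gig-T/GBIC module SG425PM0ZM
D HP J4907A XL Gig-T/GBIC module SG425PM0PS
E HP J4907A XL Gig-T/GBIC module SG425PM0M0
F HP J4907A XL Gig-T/GBIC module SG425PM0LG
"""

print(missing_slots(RUNNING_CONFIG, SHOW_MODULES))  # prints ['C']
```

How the two outputs get collected (console capture, scheduled script, or something else) is left open here. The comparison only shows that the missing module is visible from the switch's own output even when no trap or syslog entry arrives.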