
Re: v3000 lockups - S2 on solid

 
JimC_1
Advisor

v3000 lockups - S2 on solid

Greetings all,


 



I have a v3000A with 4.4.12 that has been running fine for several years now until a bit over a week ago – then it suddenly started locking up.


By locking up I mean that the Ethernet port fails to send and the console port is unresponsive until it is power-cycled.


During lockup, power light is on. Ethernet lights show connection and activity, S1 light is out, S2 light is on solid, Disk activity light is dark and shows no activity.


Unit has a T1 in a chassis, one fax connected to the v3000 port, and uses no analog lines. Location is Ethernet mode (layer2) with 21 model 3102 phones.



Connecting test boxes to monitor the system console, Ethernet response (with 3Com Network Supervisor), and packet traces (with Wireshark) shows the following.



(on CLI:)v3000->ERROR: Codec could not get ready   Reg:0x26  Read:0x0


This does not take the system down, but does anyone know whether it is normal or what it indicates? I can only assume that one of the codecs used in voicemail or recordings is not working properly, but I have no clue whether it is of any importance.



(on CLI:)Fn. bIllegalLoop()--iTableIndex is <= -1, i.e. System reached the Maximum Number  of Simultaneous Forward Sessions


I have seen this occasionally on the CLI, and it does not seem to take the system down – at least not immediately.  During a lockup I have seen as many as 7 of them.


I have manually traced through every route in the Dial Plan, every AA, and checked every phone’s call coverage and can find no loop in the system. Anybody know of anyplace else to look or how many “hops” the system is capable of?  I have one route that goes to an AA -> Primary Calling Group -> Secondary Calling Group ->phantom voicemail box.  I wouldn’t think that this is too many hops, and it was working for over a year without a problem.


 



In the NBOSS logs I see:


0310:153757:0923 IntVM642       W Unexpected Message updateOriginalPartyInfo in State Active



0310:120457:0826 NBSetBus231    E 231 : Error: pSessionProfile for other party is NULL 



0310:130309:0884 DBI            W No Personal Operator configured for user at extension:



0310:140512:0423 Router         E Extension Profile is Null for Extension=###


 



There are no entries at all in the NBOSS log for at least the last 10 minutes before failure. There has never been an error that could be considered related to the hard drive or memory.
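To put a number on that quiet period, the gap between the last NBOSS entry and the lockup can be computed from the log text. This is only a rough sketch: the field layout (MMDD:HHMMSS, a four-digit field of unknown meaning, component, severity letter, message) is an assumption inferred from the sample lines above, and the year and lockup time are supplied by hand because the entries do not carry a year.

```python
import re
from datetime import datetime

# ASSUMED layout, inferred only from the sample lines in this thread:
# MMDD:HHMMSS:<4 digits, meaning unknown> <component> <severity> <message>
LINE_RE = re.compile(r"^(\d{4}):(\d{6}):(\d{4})\s+(\S+)\s+([A-Z])\s+(.*)$")


def parse_nboss_line(line, year=2010):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    mmdd, hhmmss, _unknown, component, severity, message = m.groups()
    stamp = datetime.strptime(f"{year}{mmdd}{hhmmss}", "%Y%m%d%H%M%S")
    return {"time": stamp, "component": component,
            "severity": severity, "message": message}


def minutes_of_silence(lines, lockup_time, year=2010):
    """Minutes between the last entry at or before lockup_time and lockup_time."""
    times = []
    for line in lines:
        entry = parse_nboss_line(line, year)
        if entry is not None and entry["time"] <= lockup_time:
            times.append(entry["time"])
    if not times:
        return None
    return (lockup_time - max(times)).total_seconds() / 60.0
```

The lockup time would come from the moment the 3NS ping first goes unanswered; a consistently long silence before each lockup would support the idea that the box stops logging well before it stops responding.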



In the upgrade log I can see that the system was upgraded from 4.4.8 to 4.4.12 in August 2008, and I have seen the following at every restart since:



20100312:113733             ==================================================================


20100312:113733                           Phase 6:  Post upgrade data conversion


20100312:113733             ==================================================================


20100312:113733                         Starting data upgrade conversion


20100312:113733                         Completed data upgrade conversion



Is this a normal entry for 4.4.12?



In some cases the packet captures show only 3NS pings and NBX layer 2 communications for more than 5 minutes before the lockup, and none of them shows anything definitive.


One trace shows the v3000 sending layer 2 packets two seconds before failing to respond to a ping from the 3NS software – with no traffic in between – so I think it is safe to rule out a broadcast storm or a virus.



I have already purged and reinstalled the database without effect.



Is there anything here that indicates something else I could be checking?  I hate to go throwing a drive or a chassis at a perfectly good box.



Present plan is to bench the unit, blow away the version, try to get the system to run a fsck on command or do a drive test, then put back 4.4.12 and restore the existing database.



Anybody know how to get the v3000 to start a fsck or run a drive or memory test from the CLI?



Thank you for your time,


Jim


 

4 REPLIES 4
merlin_1
Super Advisor

Re: v3000 lockups - S2 on solid

R4.4.12 is no longer supported by our engineering groups. Long-standing 3Com policy is to support two code releases, and R6.5 and R6.0 are the ones now supported. Check the R5.0 release notes to see if there are any system suspension bugs.



It could be a denial of service attack on port 1025, 1026, or 1027. Set up a network trace and see whether, at the time of an event, the NBX IP address is seeing any sort of foreign traffic.



 



Best bet is the trace!
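One way to go through such a trace is to export it to rows and flag anything aimed at those ports on the NBX from outside the local subnet. The sketch below assumes the capture has already been reduced to (timestamp, source IP, destination IP, destination port) tuples; the function name, the NBX address, and the subnet are hypothetical placeholders, not values from this system.

```python
import ipaddress

# Ports named in the reply above.
SUSPECT_PORTS = {1025, 1026, 1027}


def foreign_hits(packets, nbx_ip, local_net):
    """Return (timestamp, source, port) for traffic sent to the NBX on a
    suspect port from an address outside local_net.

    packets: iterable of (timestamp, src_ip, dst_ip, dst_port) tuples,
    an ASSUMED export format, not something Wireshark emits by default.
    """
    net = ipaddress.ip_network(local_net)
    hits = []
    for ts, src, dst, dport in packets:
        if dst != nbx_ip or dport not in SUSPECT_PORTS:
            continue
        if ipaddress.ip_address(src) not in net:
            hits.append((ts, src, dport))
    return hits
```

An empty result around the time of a lockup would argue against the denial of service theory; a burst of hits just before it would be worth chasing.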



merlin_1
Super Advisor

Re: v3000 lockups - S2 on solid

You should actually see system states and system statuses leaving the NBX when you filter on the NBX MAC address in your trace. Wireshark and other trace tools will not let you open the packet, but you can tell whether the NCP is sending and receiving states and statuses at the time of the event.



 



VERY IMPORTANT to make sure these are going to and from the MACs for the phones and other devices.



 



It is possible to have a bad NIC as well!



Hence checking to see whether the state and status packets are passing between the NBX and the devices...
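That both-directions check can be scripted once the trace is exported to rows. A minimal sketch, assuming each frame has been reduced to (timestamp, source MAC, destination MAC); the function names and the export format are hypothetical, and broadcast or multicast destinations simply show up as their own peer.

```python
from collections import defaultdict


def last_seen_by_direction(frames, nbx_mac):
    """Map each peer MAC to the last time a frame was seen from the NBX
    to it ("from_nbx") and from it to the NBX ("to_nbx")."""
    nbx = nbx_mac.lower()
    seen = defaultdict(lambda: {"from_nbx": None, "to_nbx": None})
    for ts, src, dst in frames:
        src, dst = src.lower(), dst.lower()
        if src == nbx and dst != nbx:
            key, peer = "from_nbx", dst
        elif dst == nbx and src != nbx:
            key, peer = "to_nbx", src
        else:
            continue  # frame does not involve the NBX
        prev = seen[peer][key]
        seen[peer][key] = ts if prev is None else max(prev, ts)
    return dict(seen)


def one_way_peers(frames, nbx_mac, cutoff):
    """Peers still sending to the NBX after cutoff while the NBX has sent
    nothing to them since cutoff."""
    result = []
    for peer, d in last_seen_by_direction(frames, nbx_mac).items():
        to_nbx, from_nbx = d["to_nbx"], d["from_nbx"]
        still_talking = to_nbx is not None and to_nbx > cutoff
        nbx_silent = from_nbx is None or from_nbx <= cutoff
        if still_talking and nbx_silent:
            result.append(peer)
    return sorted(result)
```

With cutoff set to a moment shortly before the lockup, a list containing most of the phones would point at the NBX side (NIC or processor) rather than at the network.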



JimC_1
Advisor

Re: v3000 lockups - S2 on solid

Packets were traveling bidirectionally to the phones and T1 card on layer 2, then suddenly stopped. When they stop, the S2 light illuminates solid and the PWR/Status light begins to blink. Ethernet lights show a normal connection with traffic, even though the packets are only traveling to the v3000.



It took me a few days to get hold of another v3000 for testing, but replacing the processor card last night seems to have done the trick.



Have a good day,



Jim