
BASEstar occurred error event?

 
BG Jeong
Advisor

I'd be pleased if anyone could tell me what this means.

Our BASEstar on a VMS system keeps logging the following events repeatedly.

==============================================================================
BS> show history
.............
Event 21.25.6 occurred at 17-JUN-2005 18:00:31.21
Error occurred on VAX port LTA781:.
Possible DLE ACK or DLE NAK embedded in packet - 5.

Event 21.25.6 occurred at 18-JUN-2005 00:00:33.75
Error occurred on VAX port LTA783:.
Possible DLE ACK or DLE NAK embedded in packet - 5.

Event 21.25.6 occurred at 18-JUN-2005 00:00:33.87
Error occurred on VAX port LTA784:.
Possible DLE ACK or DLE NAK embedded in packet - 5.

Event 21.25.6 occurred at 27-JUN-2005 10:39:15.91
Error occurred on VAX port LTA788:.
Possible DLE ACK or DLE NAK embedded in packet - 5.

Event 21.25.6 occurred at 18-JUN-2005 00:00:33.75
Error occurred on VAX port LTA1016:.
Possible DLE ACK or DLE NAK embedded in packet - 5.

Event 21.25.6 occurred at 17-JUN-2005 18:00:32.18
Error occurred on VAX port LTA781:.
Discarded garbage data on line - message lost.

Event 21.25.6 occurred at 18-JUN-2005 00:00:33.77
Error occurred on VAX port LTA783:.
Discarded garbage data on line - message lost.

Event 21.25.6 occurred at 21-JUN-2005 01:00:20.86
Error occurred on VAX port LTA784:.
Discarded garbage data on line - message lost.

Event 21.25.6 occurred at 17-JUN-2005 18:00:28.81
Error occurred on VAX port LTA788:.
Discarded garbage data on line - message lost.

Event 21.25.6 occurred at 18-JUN-2005 00:00:33.78
Error occurred on VAX port LTA1016:.
Discarded garbage data on line - message lost.

Event 21.25.4 occurred at 17-JUN-2005 18:00:28.83
Error occurred on VAX port LTA781:.
Bad QIO read status,
data overrun

Event 21.25.4 occurred at 18-JUN-2005 00:00:33.87
Error occurred on VAX port LTA783:.
Bad QIO read status,
data overrun

Event 21.25.4 occurred at 18-JUN-2005 00:00:33.89
Error occurred on VAX port LTA784:.
Bad QIO read status,
data overrun

Event 21.25.4 occurred at 17-JUN-2005 18:00:28.83
Error occurred on VAX port LTA788:.
Bad QIO read status,
data overrun

Event 21.25.4 occurred at 18-JUN-2005 00:00:33.81
Error occurred on VAX port LTA1016:.
Bad QIO read status,
data overrun

Event 21.25.4 occurred at 18-JUN-2005 00:00:33.81
Error occurred on VAX port LTA2906:.
Bad QIO read status,
data overrun

Event 21.25.6 occurred at 17-JUN-2005 18:00:32.18
Error occurred on VAX port LTA781:.
Did not find DLE STX at the start of message.

Event 21.25.6 occurred at 18-JUN-2005 00:00:33.75
Error occurred on VAX port LTA783:.
Did not find DLE STX at the start of message.

Event 21.25.6 occurred at 21-JUN-2005 01:00:20.86
Error occurred on VAX port LTA784:.
Did not find DLE STX at the start of message.

Event 21.25.6 occurred at 27-JUN-2005 10:47:35.72
Error occurred on VAX port LTA788:.
Did not find DLE STX at the start of message.

Event 21.25.6 occurred at 18-JUN-2005 00:00:33.77
Error occurred on VAX port LTA1016:.
Did not find DLE STX at the start of message.

Event 21.25.6 occurred at 17-JUN-2005 18:00:33.22
Error occurred on VAX port LTA781:.
1005.

Event 21.25.6 occurred at 18-JUN-2005 00:00:33.78
Error occurred on VAX port LTA783:.
1005.

Event 21.25.6 occurred at 27-JUN-2005 10:23:35.81
Error occurred on VAX port LTA1016:.
1005.

Event 21.25.6 occurred at 19-JUN-2005 10:00:26.66
Error occurred on VAX port LTA783:.
Unexpected ACK received.

Event 21.25.6 occurred at 27-JUN-2005 10:47:35.79
Error occurred on VAX port LTA788:.
Unexpected ACK received.

Event 21.25.6 occurred at 17-JUN-2005 20:00:31.78
Error occurred on VAX port LTA1016:.
Unexpected ACK received.

Event 21.25.6 occurred at 27-JUN-2005 10:30:52.53
Error occurred on VAX port LTA2906:.
Unexpected ACK received.

==============================================================================

Does it mean that there is something wrong with the DECserver?
It's not a very big problem, as everything works, but I'd love to understand.
Is there any setting I could change so that these error events no longer occur?

Tru64 from Korea
John Gillings
Honored Contributor

Re: BASEstar occurred error event?

BG,

I think it's better to look at these messages in temporal order, and sorted by device.
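
For anyone wanting to do this regrouping mechanically, here is a minimal Python sketch. It assumes the SHOW HISTORY output has been captured as text in the layout shown in the question. The names `parse_history`, `parse_vms_time` and `by_device_then_time` are made up for illustration, and the embedded sample is just three of the events from the question; this is not a BASEstar utility.

```python
import re
from datetime import datetime

# Three events copied from the question, in the SHOW HISTORY layout.
HISTORY = """\
Event 21.25.6 occurred at 17-JUN-2005 18:00:31.21
Error occurred on VAX port LTA781:.
Possible DLE ACK or DLE NAK embedded in packet - 5.

Event 21.25.4 occurred at 17-JUN-2005 18:00:28.83
Error occurred on VAX port LTA781:.
Bad QIO read status,
data overrun

Event 21.25.6 occurred at 17-JUN-2005 18:00:28.81
Error occurred on VAX port LTA788:.
Discarded garbage data on line - message lost.
"""

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

EVENT_RE = re.compile(r"Event (\S+) occurred at (\S+ \S+)")
PORT_RE = re.compile(r"Error occurred on VAX port (\w+):")


def parse_vms_time(stamp):
    """Convert 'DD-MON-YYYY HH:MM:SS.cc' to a datetime, independent of locale."""
    date_part, time_part = stamp.split()
    day, month, year = date_part.split("-")
    t = datetime.strptime(time_part, "%H:%M:%S.%f")
    return datetime(int(year), MONTHS[month.upper()], int(day),
                    t.hour, t.minute, t.second, t.microsecond)


def parse_history(text):
    """Return a list of (port, timestamp, event_id, message) tuples."""
    records = []
    # Events are separated by blank lines; a message may span several lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 3:
            continue
        event = EVENT_RE.match(lines[0])
        port = PORT_RE.match(lines[1])
        if not (event and port):
            continue  # skip anything that is not an event block
        records.append((port.group(1), parse_vms_time(event.group(2)),
                        event.group(1), " ".join(lines[2:])))
    return records


def by_device_then_time(records):
    """Group by port name, then order each port's events by time."""
    return sorted(records, key=lambda r: (r[0], r[1]))


for port, when, event_id, message in by_device_then_time(parse_history(HISTORY)):
    print(port, when.isoformat(sep=" "), event_id, message)
```

Ports are compared as plain strings here, so LTA1016 would sort ahead of LTA781; sort on the numeric suffix instead if that ordering matters.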

The first "block" affects two devices LTA788 and LTA781 at about the same time. I'd be looking for an environmental cause. Maybe something is introducing some noise on the serial lines? Do the cables follow a common path? Can you identify any environmental events at 18:00 on 17th June? Storm, power surge, maintenance?

21.25.6 17-JUN-2005 18:00:28.81 LTA788:. Discarded garbage data on line - message lost.
21.25.4 17-JUN-2005 18:00:28.83 LTA788:. Bad QIO read status, data overrun

21.25.4 17-JUN-2005 18:00:28.83 LTA781:. Bad QIO read status, data overrun
21.25.6 17-JUN-2005 18:00:31.21 LTA781:. Possible DLE ACK or DLE NAK embedded in packet - 5.
21.25.6 17-JUN-2005 18:00:32.18 LTA781:. Did not find DLE STX at the start of message.
21.25.6 17-JUN-2005 18:00:32.18 LTA781:. Discarded garbage data on line - message lost.
21.25.6 17-JUN-2005 18:00:33.22 LTA781:. 1005.

Same device, but separated by 4 hours. What's curious about this whole block, and the previous one, is that something seems to happen between 30 and 33 seconds past the hour. Could this be some event that actually happens on the hour, but your clocks have drifted by 30 seconds? Say, something on a power grid, or perhaps some other external event?

21.25.6 17-JUN-2005 20:00:31.78 LTA1016:. Unexpected ACK received.
21.25.6 18-JUN-2005 00:00:33.75 LTA1016:. Possible DLE ACK or DLE NAK embedded in packet - 5.

21.25.6 18-JUN-2005 00:00:33.75 LTA783:. Possible DLE ACK or DLE NAK embedded in packet - 5.
21.25.6 18-JUN-2005 00:00:33.75 LTA783:. Did not find DLE STX at the start of message.
21.25.6 18-JUN-2005 00:00:33.77 LTA783:. Discarded garbage data on line - message lost.
21.25.6 18-JUN-2005 00:00:33.78 LTA783:. 1005.

21.25.6 18-JUN-2005 00:00:33.77 LTA1016:. Did not find DLE STX at the start of message.
21.25.6 18-JUN-2005 00:00:33.78 LTA1016:. Discarded garbage data on line - message lost.
21.25.4 18-JUN-2005 00:00:33.81 LTA1016:. Bad QIO read status, data overrun

21.25.4 18-JUN-2005 00:00:33.81 LTA2906:. Bad QIO read status, data overrun

21.25.6 18-JUN-2005 00:00:33.87 LTA784:. Possible DLE ACK or DLE NAK embedded in packet - 5.
21.25.4 18-JUN-2005 00:00:33.89 LTA784:. Bad QIO read status, data overrun

21.25.4 18-JUN-2005 00:00:33.87 LTA783:. Bad QIO read status, data overrun

Could your clocks have drifted back by 4 seconds by the 19th and another 6 seconds by the 21st?

21.25.6 19-JUN-2005 10:00:26.66 LTA783:. Unexpected ACK received.

21.25.6 21-JUN-2005 01:00:20.86 LTA784:. Did not find DLE STX at the start of message.
21.25.6 21-JUN-2005 01:00:20.86 LTA784:. Discarded garbage data on line - message lost.

These samples kind of squash that idea...

21.25.6 27-JUN-2005 10:23:35.81 LTA1016:. 1005.

21.25.6 27-JUN-2005 10:30:52.53 LTA2906:. Unexpected ACK received.

21.25.6 27-JUN-2005 10:39:15.91 LTA788:. Possible DLE ACK or DLE NAK embedded in packet - 5.

21.25.6 27-JUN-2005 10:47:35.72 LTA788:. Did not find DLE STX at the start of message.
21.25.6 27-JUN-2005 10:47:35.79 LTA788:. Unexpected ACK received.
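
The "seconds past the hour" pattern can also be checked mechanically rather than by eye. This is a rough Python sketch: the timestamps are copied from events quoted in this thread (one per cluster), while the function names and the 60-second window are arbitrary choices for illustration.

```python
from datetime import datetime

# One timestamp per cluster, copied from the events quoted in this thread.
SAMPLES = [
    datetime(2005, 6, 17, 18, 0, 28, 810000),   # LTA788
    datetime(2005, 6, 17, 20, 0, 31, 780000),   # LTA1016
    datetime(2005, 6, 18, 0, 0, 33, 750000),    # LTA783
    datetime(2005, 6, 19, 10, 0, 26, 660000),   # LTA783
    datetime(2005, 6, 21, 1, 0, 20, 860000),    # LTA784
    datetime(2005, 6, 27, 10, 23, 35, 810000),  # LTA1016
]


def seconds_past_hour(when):
    """Seconds elapsed since the most recent top of the hour."""
    return when.minute * 60 + when.second + when.microsecond / 1_000_000


def near_hour(when, window=60.0):
    """True if the event falls within `window` seconds after the hour.

    The 60-second default is an assumption, not a BASEstar setting.
    """
    return seconds_past_hour(when) <= window


for t in SAMPLES:
    print(t.isoformat(sep=" "), f"{seconds_past_hour(t):8.2f}", near_hour(t))
```

Events that sit close to the hour support the "scheduled external event" theory; the ones from the 27th fall well outside any such window.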


The bottom line is that there are probably thousands or even millions of events flying around this system. Even at very low error rates, some errors are bound to occur. BASEstar is smart enough to recover from some errors, so that's probably what is happening here. Note that the events are sporadic, and are distributed over several devices.

Look for patterns to see if there is some systemic cause (for example, a loose plug or cable). Otherwise, just keep track of the number of faults reported against a particular device and make sure they stay below your thresholds. If you find a particular device or set of devices playing up, consider sampling the data stream with a line analyzer (though, at those rates, you'll need a lot of storage to save the traces!).
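
Keeping track of faults per device can be as simple as a tally. In this Python sketch the per-port counts are tallied from the SHOW HISTORY output in the question, but the threshold value and the name `devices_over_threshold` are hypothetical; pick a limit that suits your site.

```python
from collections import Counter

# One entry per logged event, tallied from the SHOW HISTORY output in the question.
EVENT_PORTS = (["LTA781"] * 5 + ["LTA783"] * 6 + ["LTA784"] * 4 +
               ["LTA788"] * 5 + ["LTA1016"] * 6 + ["LTA2906"] * 2)

THRESHOLD = 5  # hypothetical limit, not a BASEstar parameter


def devices_over_threshold(ports, threshold):
    """Return {port: count} for ports with more than `threshold` events."""
    counts = Counter(ports)
    return {port: n for port, n in counts.items() if n > threshold}


for port, n in sorted(devices_over_threshold(EVENT_PORTS, THRESHOLD).items()):
    print(f"{port}: {n} events, above threshold of {THRESHOLD}")
```

Running the same tally over each reporting period shows whether one port is getting worse or the errors stay evenly spread.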
A crucible of informative mistakes