11-17-2005 02:33 AM
CPU error
On one of my VAX servers, the "$ SHOW ERROR" command shows an error count of 1 against the device "CPU".
Do we need to replace the CPU board? Please suggest.
11-17-2005 02:47 AM
Re: CPU error
ANAL/ERROR/INCLUDE=CPU/SINCE=XX-XXX-XXXX
Purely Personal Opinion
11-17-2005 08:15 AM
Re: CPU error
A system reboot is the only supported way to reset the error count, but that is obviously undesirable in many situations. There is presently no supported mechanism to reset error counts once the errors have been logged.
As for an unsupported approach (and be aware of the potential for causing a system crash):
To reset the error count, one needs to determine the system address of the error count field. For a device, this is at an offset within the device's UCB structure. On VAX, the field is at an offset symbolically defined as UCB$W_ERRCNT, and on Alpha as UCB$L_ERRCNT.
You now need to locate the system address of the UCB$%_ERRCNT field of the CPU.
Now invoke SDA:
$ ANALYZE/SYSTEM
SDA> READ SYS$SYSTEM:SYSDEF.STB
SDA> SHOW DEVICE
SDA> EVALUATE UCB+UCB$W_ERRCNT
Hex = xxx   Decimal = -xxx   UCB+offset
Note the hex value.
SDA> EXIT
$ RUN SYS$SHARE:DELTA
This will prompt for a series of inputs, including the hex value noted above. Respond carefully to reset the error count; any mistake can again lead to a system crash.
If you are unable to run this successfully, a reboot may solve your problem.
Archunan
Archie
11-17-2005 09:01 AM
Re: CPU error
To reset the error count, one needs to determine the system address of the error count field. For a device, this is at an offset within the device's UCB structure. On VAX, the field is at an offset symbolically defined as UCB$W_ERRCNT, and on Alpha as UCB$L_ERRCNT.
You now need to locate the system address of the UCB$%_ERRCNT field of the CPU.
Now invoke SDA:
$ ANALYZE/SYSTEM
SDA> READ SYS$SYSTEM:SYSDEF.STB
SDA> SHOW DEVICE
SDA> EVALUATE UCB+UCB$W_ERRCNT
Hex = xxx   Decimal = -xxx   UCB+offset
--
This may come as a surprise, but the CPU has no UCB. It is not a device, and it is not associated with a device driver.
-- Rob
11-17-2005 10:20 AM
Re: CPU error
I am sorry, I totally misunderstood the question. When I read "error count", I immediately thought of device error counts only.
Yes Noorul, my procedure is for resetting the error count of a device.
Archunan
Archie
11-17-2005 02:21 PM
Re: CPU error
If this system is life critical, I'd recommend you log a hardware service case immediately and have an engineer analyze your error log.
If the system isn't that critical, I'd tend to keep an eye on the error count; if it increases within (say) one week, then log a case.
You could also install WEBES and/or ISEE. One of their purposes is to gather statistics on hardware errors to determine what needs immediate attention. If configured for it, ISEE is capable of logging a service case automatically if a serious enough hardware error is detected.
Talk to your local customer support centre for more details.
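The watch-and-wait rule above can be sketched as a small check. This is only a minimal sketch, not an OpenVMS tool: the function name `should_log_case`, the sample format, and the seven-day default are assumptions made for illustration, and the counts would have to be noted by hand from "$ SHOW ERROR".

```python
from datetime import date, timedelta


def should_log_case(samples, window_days=7):
    """Return True if the error count rose within the watch window.

    samples: list of (date, error_count) readings, oldest first, noted
    by hand from SHOW ERROR. Hypothetical helper, not part of OpenVMS.
    """
    if len(samples) < 2:
        return False  # nothing to compare against yet
    first_day, first_count = samples[0]
    deadline = first_day + timedelta(days=window_days)
    for day, count in samples[1:]:
        # Only an increase inside the window triggers a service case.
        if day <= deadline and count > first_count:
            return True
    return False
```

For example, a count that stays at 1 from 17 November to 22 November would not trigger a case, while a rise from 1 to 2 three days after the first reading would.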
11-18-2005 01:35 AM
Re: CPU error
I don't know if you have had a chance yet to invoke the $anal/err/includ=cpu command that Ian mentioned, but you might want to modify it to $anal/err/incl=(cpu, mach)/sin=xx. VAX systems report a majority of their machine checks as CPU errors.
If the machine check occurred in user or supervisor mode, the system typically will NOT crash with a fatal bugcheck, but will instead just log the error through the error logger. If the machine check is due to a cache parity error while in user mode, and the threshold limit is exceeded, the system will disable the cache, which will affect performance.
As Mr. Gillings pointed out, if you have the opportunity, you may wish to install WEBES or ISEE, which may supply more information. It would appear that, so far, the system has NOT suffered a serious error in exec or kernel mode, but you may be on borrowed time, so you may wish to schedule a service call.
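The outcomes described above can be laid out as a small decision function. This is an illustrative sketch of the post's description only, not VMS code: the function name, parameter names, and returned strings are assumptions, and the exec/kernel branch reflects what the post implies rather than states.

```python
def machine_check_outcome(mode, cache_parity=False, over_threshold=False):
    """Summarize what the post says happens after a VAX machine check.

    mode: access mode at the time of the machine check.
    cache_parity: True if the cause was a cache parity error.
    over_threshold: True if the error threshold limit was exceeded.
    Hypothetical helper for illustration; the real logic lives in VMS.
    """
    mode = mode.lower()
    if mode not in ("user", "supervisor", "exec", "kernel"):
        raise ValueError(f"unknown access mode: {mode!r}")
    if mode in ("exec", "kernel"):
        # Assumption: the post treats these as the serious cases.
        return "possible fatal bugcheck"
    if mode == "user" and cache_parity and over_threshold:
        return "error logged, cache disabled"
    return "error logged"
```

A single logged CPU error with no crash, as in the original question, corresponds to the plain "error logged" branch.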
Thanx,
r_white
11-18-2005 05:46 AM
Re: CPU error
Lawrence
11-22-2005 02:01 AM
Re: CPU error
The error count has not increased since the server went on duty on 17/11/2005. It is still under observation. Thanks for your suggestions.