07-19-2006 01:31 PM
Oracle/RDB page corruption. Complete restore required.
We had to restore the production Oracle/Rdb database because of the following error. Incremental recoveries failed, so a complete delete and restore was performed. In my 10 years as system manager at this site, this is the third time such a problem has occurred: first in 1996, then in 2002, and now in 2006.
From OPCOM we have
%%%%%%%%%%% OPCOM 19-JUL-2006 16:02:06.15 %%%%%%%%%%%
Message from user API_PROD on WIZ22
Oracle Rdb V7.1-441 Event Notification for Database
DSA30:[WIZ_CMPRD.DATA.RDB30]WIZARD_DATA.RDB;1
Requested page 15:330926, received page 0:0; retrying disk read
%%%%%%%%%%% OPCOM 19-JUL-2006 16:02:06.16 %%%%%%%%%%%
Message from user API_PROD on WIZ22
Oracle Rdb V7.1-441 Event Notification for Database
DSA30:[WIZ_CMPRD.DATA.RDB30]WIZARD_DATA.RDB;1
Process 2BE11FE2 generating bugcheck dump file
DISK071:[RDMBUGCHK]RDSBUGCHK.DMP;
Exception at 13792BD4 : PIOFETCH$VALIDATE_PAGE + 000003C4
%RDMS-F-CANTREADDBS, error reading pages 15:330926-330926
-RDMS-F-BADPAGRED, read requesting physical page 15:330926 returned page 0:0
Output from one of the bugcheck dumps indicating page corruption
Alpha OpenVMS 7.3-2
Oracle Rdb Server 7.1.4.4.1
Got a RDSBUGCHK.DMP
RDMS-F-CANTREADDBS, error reading pages 15:330926-330926
RDMS-F-BADPAGRED, read requesting physical page 15:330926 returned page 0:0
Exception occurred at PIOFETCH$VALIDATE_PAGE + 000003C4
Called from PIOFETCH$FETCH + 00000AD4
Called from PIO$FETCH + 00000904
Called from PIO$UPDATE_FIB + 0000029C
Bugcheck accessing storage area CUSTOMER_DESCRIP_SA, area id 15
TSNBLK COMMIT_TSN higher than next TSN (0:429813568)
Line TSN higher than next TSN (0:429813568)
Running image JAVA$JAVA.EXE
Dump created: 19-JUL-2006 16:41:38.09
Database root: WIZ_CMPRD_DB:[000000]WIZARD_DATA
This bugcheck may have been caused by a corrupt TSN block.
The database should be verified to check for such corruption.
Suggested command: RMU/VERIFY WIZ_CMPRD_DB:[000000]WIZARD_DATA
Output from verify command
%RMU-W-SPAMFRELN, area CUSTOMER_DESCRIP_SA, page 330926
error in space management page's free space length
expected: 2996, found: 0
%RMU-W-PAGERRORS, 3 page errors encountered
2 page header format errors
0 page tail format errors
0 area bitmap format errors
0 area inventory format errors
0 line index format errors
0 segment format errors
1 space management page format error
0 differences in space management of data pages
%RMU-I-ESGPGLARE, completed verification of WIZ_CUSTOMER_DESCRIP logical area
as part of CUSTOMER_DESCRIP_SA storage area
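The summary counts in the verify output above can be cross-checked mechanically. The following is only a sketch in Python, not an RMU feature: the function name `summarize_verify` and the regular expressions are assumptions made for illustration, and the sample text is copied from the output quoted in this post. It confirms that the per-category format errors add up to the reported total.

```python
import re

# Summary lines copied from the RMU/VERIFY output quoted in the post.
VERIFY_OUTPUT = """\
%RMU-W-PAGERRORS, 3 page errors encountered
2 page header format errors
0 page tail format errors
0 area bitmap format errors
0 area inventory format errors
0 line index format errors
0 segment format errors
1 space management page format error
0 differences in space management of data pages
"""


def summarize_verify(text):
    """Return (total, {category: count}) from a PAGERRORS summary.

    Hypothetical helper: it only understands the line shapes shown above.
    """
    total = None
    categories = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"%RMU-W-PAGERRORS, (\d+) page errors? encountered", line)
        if m:
            total = int(m.group(1))
            continue
        # Only "... format error(s)" lines are counted as page errors here.
        m = re.match(r"(\d+) (.+? format errors?)$", line)
        if m:
            categories[m.group(2)] = int(m.group(1))
    return total, categories


total, categories = summarize_verify(VERIFY_OUTPUT)
nonzero = {name: n for name, n in categories.items() if n}
print(total, nonzero)
```

For the output above, the two page header errors and the one space management page error account for all three reported page errors.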
Output from rmu/show corrupt
WIZ22_CMPRD $ rmu/show corrupt wizard_data
*------------------------------------------------------------------------------
* Oracle Rdb V7.1-441 19-JUL-2006 17:56:29.62
*
* Dump of Corrupt Page Table
* Database: WIZ_CMPRD_DB:[000000]WIZARD_DATA.RDB;
*
*------------------------------------------------------------------------------
Entries for storage area CUSTOMER_DESCRIP_SA
--------------------------------------------
Page 330926
- AIJ recovery sequence number is -1
- Live area ID number is 15
- Consistency transaction sequence number is 0:0
- State of page is: corrupt
*------------------------------------------------------------------------------
A page in one of the storage areas was "corrupted". The initial diagnosis using RMU/Verify did not indicate a problem. We closed and reopened the database; RMU/Verify then did indicate a corrupt storage area.
Oracle's general response is that this problem was caused by hardware: memory, disk, or both.
A patched version of Oracle/RDB was installed this week.
$ rmu/show ver
Executing RMU for Oracle Rdb V7.1-441
The previous version reported: Executing RMU for Oracle Rdb V7.1-401
We are running VMS 7.3-2.
The initial Oracle response shifts the onus to the system managers to prove that hardware was not the underlying cause. No hardware exceptions have been raised. Our production environments host 9 Oracle/Rdb databases; only one was broken.
What's your view?
07-19-2006 02:47 PM
Re: Oracle/RDB page corruption. Complete restore required.
A disk error on a shadow set? How many members? If it's two or more members, a disk error seems unlikely. Are there any error log entries from the time window? I'd expect something for any type of memory or disk hardware error.
I don't see any condition codes for the read error; perhaps there's something in the bugcheck dump file?
07-19-2006 05:24 PM
Re: Oracle/RDB page corruption. Complete restore required.
In a case like this, I would like to see an OpenVMS DUMP output of the database page(s) involved, but it may be too late to ask for that ;-(
That would allow you to 'see' the on-disk contents and may allow you to spot any unusual patterns. It should at least allow Oracle to diagnose the extent of the 'corruption' (just a byte/word/longword etc.). From the extent of the corruption, you could then speculate about possible reasons for this to have happened.
Volker.
07-19-2006 05:42 PM
Re: Oracle/RDB page corruption. Complete restore required.
RDB was patched on the morning of the 18th.
V7.1-401 to 7.1-441.
07-19-2006 07:59 PM
Re: Oracle/RDB page corruption. Complete restore required.
Sometimes, on old versions of Rdb, this was a memory-only corruption in the global buffer structure, but that problem was fixed a long time ago. The workaround was simply to close and reopen the database.
I have also encountered this problem at another site, where the culprit was the firmware of some disks... But each time, doing an rmu/restore/just and then an rmu/recover/just fixed the problem, and this can be done online :-)
I strongly suggest you add journals (AIJ) to your database.
You may also want to check the firmware revisions of your disks and controllers.
JF
07-19-2006 08:55 PM
Re: Oracle/RDB page corruption. Complete restore required.
07-20-2006 12:27 AM
Re: Oracle/RDB page corruption. Complete restore required.
You may get some warnings, for example RMU-W-NOTRANAPP or RMU-W-USERECCOM.
These are not fatal; they are just warnings.
JF
07-20-2006 10:10 AM
Re: Oracle/RDB page corruption. Complete restore required.
I too have had Rdb corruption in the past, and got a similar response from Oracle.
Obviously over the years you have upgraded the OS and Rdb versions (7.3-2 and 7.1-441), but has the hardware always remained the same?
Are you running the latest F/W on your I/O sub-system? What about ECO patches for 7.3-2 (FIBRE_SCSI in particular)?
Jeff
07-21-2006 01:41 AM
Re: Oracle/RDB page corruption. Complete restore required.
07-21-2006 02:26 AM
Re: Oracle/RDB page corruption. Complete restore required.
By moving the preferred paths around for the disks that contained our Rdb database, I was able to determine that an HSJ controller was the common link. We replaced the HSJ controller and never saw the message again. I added the OPCOM message to my ConsoleWorks scan profile to ensure I'd pick it up if it ever returned.
At no point did the HSJ controller/disks log any type of error message.
Based on my experience, I would agree with Oracle that you likely have a hardware issue.
-Jeff
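For anyone without a console-scanning product, the kind of check described above (watching for the "Requested page ..., received page ..." OPCOM message) can be approximated with a small script run over a captured operator log. This is only a sketch in Python, not a ConsoleWorks configuration: the name `find_bad_reads` and the pattern are assumptions for illustration, and the sample text is the OPCOM message quoted in the original post.

```python
import re

# Pattern for the retry message quoted in the original post.
BAD_READ = re.compile(
    r"Requested page (\d+):(\d+), received page (\d+):(\d+)"
)


def find_bad_reads(log_text):
    """Yield (requested, received) pairs, each an (area, page) tuple.

    Hypothetical helper: it matches only the message text shown above.
    """
    for m in BAD_READ.finditer(log_text):
        area, page, got_area, got_page = map(int, m.groups())
        yield (area, page), (got_area, got_page)


SAMPLE = """\
%%%%%%%%%%% OPCOM 19-JUL-2006 16:02:06.15 %%%%%%%%%%%
Message from user API_PROD on WIZ22
Oracle Rdb V7.1-441 Event Notification for Database
DSA30:[WIZ_CMPRD.DATA.RDB30]WIZARD_DATA.RDB;1
Requested page 15:330926, received page 0:0; retrying disk read
"""

hits = list(find_bad_reads(SAMPLE))
for requested, received in hits:
    # Tuple concatenation supplies the four integers for the format string.
    print("bad read: wanted %d:%d, got %d:%d" % (requested + received))
```

Because the hardware in the case above never logged an error of its own, a scan for this message may be the only early warning available.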