backup script failed to stop db instance

Fauziah Mahdan
Super Advisor

Hi all,
I have db1 and db2 servers in shared storage mode.
We run the orastop script when the backup starts at 2:30 am and the orastart script when the backup finishes at 7:00 am.
Orastop managed to bring down the instance on db2 but not on db1.
Why?

Fauziah
13 REPLIES
Patrick Wallek
Honored Contributor

Re: backup script failed to stop db instance

Good question.

What do your log files say? Perhaps it was busy. Perhaps there was a problem.

Without significantly more information, like a database log or the output from the orastop script, it is impossible to say what happened.
Yogeeraj_1
Honored Contributor

Re: backup script failed to stop db instance

hi Fauziah,

did you check the script individually for both databases?

if yes, then check the alert log of db1, the instance that did not stop.

Check if you find any errors there...

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Yogeeraj_1
Honored Contributor

Re: backup script failed to stop db instance

hi again,

maybe you should also review your backup strategy. If you are running 8i or later, you should look into RMAN. With RMAN you no longer have to shut down your database for backups, which minimises your downtime and the risk of errors. Note that online backups need the database to run in ARCHIVELOG mode.

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Fauziah Mahdan
Super Advisor

Re: backup script failed to stop db instance

Not sure. When we issue an Oracle shutdown immediate, it hangs. We have to run shutdown abort before Oracle manages to go down. The next day the orastop command works. After 3-4 days orastop fails again.
Sample of the prod1 log, which is from db1:

Wed Nov 22 01:29:39 2006
Thread 1 advanced to log sequence 2856
Current log# 2 seq# 2856 mem# 0: /u01/oracle/oradata/prod/redo_1_02_01.log
Current log# 2 seq# 2856 mem# 1: /u01/oracle/oradata/prod/redo_1_02_02.log
Wed Nov 22 02:37:13 2006
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 96
Wed Nov 22 02:37:13 2006
ALTER DATABASE CLOSE NORMAL
Wed Nov 22 02:37:14 2006
SMON: disabling tx recovery
SMON: disabling cache recovery
Wed Nov 22 02:37:20 2006
Thread 1 closed at log sequence 2856
Wed Nov 22 02:37:23 2006
Completed: ALTER DATABASE CLOSE NORMAL
Wed Nov 22 02:37:23 2006
ALTER DATABASE DISMOUNT
Completed: ALTER DATABASE DISMOUNT
Wed Nov 22 02:37:23 2006
Reconfiguration started
List of nodes: 0,
Global Resource Directory frozen
one node partition
Communication channels reestablished
Server queues filtered
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Resources and enqueues cleaned out
Resources remastered 15783
147420 GCS shadows traversed, 0 cancelled, 147420 closed
0 GCS resources traversed, 0 cancelled
set master node info
147420 GCS shadows traversed, 0 replayed, 147420 unopened
Submitted all remote-enqueue requests
Update rdomain variables
0 write requests issued in 0 GCS resources
0 PIs marked suspect, 0 flush PI msgs
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Wed Nov 22 02:37:25 2006
Reconfiguration complete
Wed Nov 22 02:37:26 2006
ARCH: Archiving is disabled
Shutting down archive processes
archiving is disabled
Archive process shutdown avoided: 0 active
ARCH: Archiving is disabled
Shutting down archive processes
archiving is disabled
Archive process shutdown avoided: 0 active
Wed Nov 22 02:42:36 2006
SHUTDOWN: waiting for detached processes to terminate.
Wed Nov 22 07:16:08 2006
Starting ORACLE instance (normal)
Wed Nov 22 07:16:08 2006
Global Enqueue Service Resources = 10946
Wed Nov 22 07:16:08 2006
Global Enqueue Service Enqueues = 16732
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
GES IPC: Receivers 1 Senders 2
GES IPC: Buffers Receive 1000 Reserved 1000
GES IPC: Msg Size Regular 376 Batch 2048
LICENSE_MAX_USERS = 0
Starting up ORACLE RDBMS Version: 9.0.1.4.0.
System parameters with non-default values:
processes = 500
timed_statistics = TRUE
resource_limit = TRUE
event = 10841 trace name context forever
shared_pool_size = 419430400
large_pool_size = 500M
java_pool_size = 100663296
_lm_cache_res_cleanup = 10
control_files = /u01/oracle/oradata/prod/control01.ctl, /u01/oracle/oradata/prod/control02.ctl, /u01/oracle/oradata/prod/control03.ctl
db_block_size = 4096
db_cache_size = 637534208
compatible = 9.0.0
log_archive_start = FALSE
cluster_database = TRUE
thread = 1
fast_start_mttr_target = 300
instance_number = 1
undo_management = AUTO
undo_tablespace = UNDOTBS_1
max_enabled_roles = 100
remote_login_passwordfile= EXCLUSIVE
db_domain =
service_names = prod
background_dump_dest = /u01/oracle/admin/prod/bdump
user_dump_dest = /u01/oracle/admin/prod/udump
core_dump_dest = /u01/oracle/admin/prod/cdump
sort_area_size = 1048576
db_name = prod
open_cursors = 600
ifile = /u01/oracle/admin/prod/pfile/initprod_comm.ora
Wed Nov 22 07:16:28 2006
cluster interconnect IPC version:Oracle UDP/IP
IPC Vendor 1 proto 2 Version 1.0
PMON started with pid=2
DIAG started with pid=3
LMON started with pid=4
LMD0 started with pid=5
LMS0 started with pid=6
LMS1 started with pid=7
DBW0 started with pid=8
LGWR started with pid=9
CKPT started with pid=10
SMON started with pid=11
RECO started with pid=12
Wed Nov 22 07:16:37 2006
ALTER DATABASE MOUNT
Wed Nov 22 07:16:38 2006
lmon registered with NM - instance id 1 (internal mem no 0)
Wed Nov 22 07:16:39 2006
Reconfiguration started
List of nodes: 0,1,
Global Resource Directory frozen
Communication channels reestablished
Server queues filtered
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Resources and enqueues cleaned out
Resources remastered 0
0 GCS shadows traversed, 0 cancelled, 0 closed
0 GCS resources traversed, 0 cancelled
set master node info
0 GCS shadows traversed, 0 replayed, 0 unopened
Submitted all remote-enqueue requests
Update rdomain variables
0 write requests issued in 0 GCS resources
0 PIs marked suspect, 0 flush PI msgs
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Wed Nov 22 07:16:39 2006
Reconfiguration complete
LCK0 started with pid=14
Wed Nov 22 07:16:45 2006
Successful mount of redo thread 1, with mount id 18975056.
Wed Nov 22 07:16:45 2006
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE).
Completed: ALTER DATABASE MOUNT
Wed Nov 22 07:16:46 2006
ALTER DATABASE OPEN
Picked Lamport scheme to generate SCNs
Wed Nov 22 07:16:47 2006
Thread 1 opened at log sequence 2856
Current log# 2 seq# 2856 mem# 0: /u01/oracle/oradata/prod/redo_1_02_01.log
Current log# 2 seq# 2856 mem# 1: /u01/oracle/oradata/prod/redo_1_02_02.log
Successful open of redo thread 1.
Wed Nov 22 07:16:47 2006
SMON: enabling cache recovery
Wed Nov 22 07:16:47 2006
Undo Segment 1 Onlined
Undo Segment 2 Onlined
Undo Segment 3 Onlined
Undo Segment 4 Onlined
Undo Segment 5 Onlined
Undo Segment 6 Onlined
Undo Segment 7 Onlined
Undo Segment 8 Onlined
Undo Segment 9 Onlined
Undo Segment 10 Onlined
Successfully onlined Undo Tablespace 1.
Wed Nov 22 07:16:47 2006
SMON: enabling tx recovery
Wed Nov 22 07:16:48 2006
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN
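
In the excerpt above, the shutdown gets through CLOSE and DISMOUNT and then logs "SHUTDOWN: waiting for detached processes to terminate." at 02:42:36, with nothing further until the 07:16:08 startup. A rough way to see how often this happens is to scan the alert log for that message and print the timestamp line that precedes it. This is only a sketch: the function name is hypothetical, and it assumes timestamp lines have the same format as in the excerpt.

```shell
#!/bin/sh
# Sketch: print the timestamp of every shutdown in an Oracle alert log that
# stalled waiting for detached processes. The marker string is copied from
# the alert log excerpt; the function name is hypothetical.
find_stalled_shutdowns() {
    # $1: path to the alert log
    awk '
        # Remember the most recent timestamp line, e.g. "Wed Nov 22 02:42:36 2006".
        /^[A-Z][a-z][a-z] [A-Z][a-z][a-z] [ 0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] [0-9][0-9][0-9][0-9]$/ { ts = $0; next }
        # Print that timestamp when the wait message appears.
        /^SHUTDOWN: waiting for detached processes to terminate/ { print ts }
    ' "$1"
}
```

Call it with the path to the alert log under background_dump_dest. Dates that show up here can then be compared with the nights on which orastop failed.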
Yogeeraj_1
Honored Contributor

Re: backup script failed to stop db instance

hi Again,

did you check for any runaway Oracle processes?

I have seen cases where such processes "block" proper shutdown of the database.

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Fauziah Mahdan
Super Advisor

Re: backup script failed to stop db instance

How do I check?
Yogeeraj_1
Honored Contributor

Re: backup script failed to stop db instance

Hi,

The first way would be to check the active sessions before the actual shutdown. Look at the logon_time column of v$session to find the sessions that have been connected longest,
e.g.
select *
from (
    select to_char(logon_time, 'dd/mm/yyyy hh24:mi') logged_on, username
    from v$session
    order by logon_time
)
where rownum < 30;

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Yogeeraj_1
Honored Contributor

Re: backup script failed to stop db instance

hi again,

Another way would be to check the OS for the PID which is consuming excessive server resources and then drill-down to the database to determine the user process associated with the PID.

Step 1. Run top.
CPU TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
0 ? 23609 oracle 154 20 2061M 7640K sleep 1:37 6.71 6.70 oraclemydb
1 ? 26134 oracle 154 20 2057M 6800K sleep 0:37 6.46 6.45 oraclemydb
0 ? 29765 oracle 154 20 2059M 7808K sleep 1:24 2.98 2.97 oraclemydb
0 ? 17219 oracle 154 20 2059M 6744K sleep 2:33 2.29 2.29 oraclemydb


The PID can be identified by the high values in fields TIME, %WCPU and %CPU


Step 2. Run the following SQL query to identify the culprit user session, entering the PID from Step 1 when prompted for OSPID.

select b.sid SID, b.serial# "Serial#", c.spid "srvPID",
       b.osuser, b.username, b.status, b.client_info, b.machine
from v$session b, v$process c
where b.paddr = c.addr
and c.spid = &OSPID;
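
To avoid picking PIDs out of the top output by eye, Step 1 can be roughly automated with a small filter. This is a sketch only: the function name and the threshold are hypothetical, and the column positions (PID in column 3, %CPU in column 12, COMMAND in column 13) are assumed from the sample output above, so check them against your own top layout.

```shell
#!/bin/sh
# Sketch: read top-style process lines on stdin and print the PIDs of
# Oracle server processes at or above a given %CPU.
# Assumed column layout (from the sample): $3 = PID, $12 = %CPU, $13 = COMMAND.
busy_oracle_pids() {
    # $1: minimum %CPU to report (hypothetical threshold, e.g. 5)
    awk -v min="$1" '$13 ~ /^oracle/ && $12 + 0 >= min + 0 { print $3 }'
}
```

Pipe saved top output into busy_oracle_pids with a threshold, then use each PID it prints as the OSPID value in Step 2.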


hope this helps too!

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)
Fauziah Mahdan
Super Advisor

Re: backup script failed to stop db instance

Actually, the db goes down for backup at 2:30 am.
We set up a cron job to check for any hung processes, but when we passed the results to the Oracle team they couldn't find anything.
As a last resort we restarted the server, and the instance has been stable since. We are still monitoring in case the same thing happens again.
William Lai
Occasional Visitor

Re: backup script failed to stop db instance

Hi,

Just curious, why do you need to shut down the database to perform a backup?

Regards,
William
Volker Borowski
Honored Contributor

Re: backup script failed to stop db instance

Hello,
just a word on configuration:
- the 3 copies of your controlfile are at the very same location:
/u01/oracle/oradata/prod/control01.ctl
/u01/oracle/oradata/prod/control02.ctl
/u01/oracle/oradata/prod/control03.ctl
you should locate them on different physical disks

- both members of your Oracle online redo log software mirror are at the very same location:
mem#0: /u01/oracle/oradata/prod/redo_1_02_01.log
mem#1: /u01/oracle/oradata/prod/redo_1_02_02.log
you should locate them on different physical disks
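
A quick way to spot this kind of layout is to compare the parent directories of the multiplexed files. The sketch below uses a hypothetical function name and only detects the case where every copy sits in one directory. Files in different directories can still share a physical disk, so a pass here proves nothing about the hardware underneath.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) when every path argument has the same
# parent directory, i.e. the multiplexed copies are not separated at all.
# The function name is hypothetical.
all_in_same_dir() {
    first=$(dirname "$1")
    for path in "$@"; do
        [ "$(dirname "$path")" = "$first" ] || return 1
    done
    return 0
}

# Example with the control file paths from the alert log:
if all_in_same_dir /u01/oracle/oradata/prod/control01.ctl \
                   /u01/oracle/oradata/prod/control02.ctl \
                   /u01/oracle/oradata/prod/control03.ctl; then
    echo "WARNING: all copies share one directory"
fi
```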

Volker
Fauziah Mahdan
Super Advisor

Re: backup script failed to stop db instance

Volker,
any reason for that?


Thanks
Fauziah
Yogeeraj_1
Honored Contributor

Re: backup script failed to stop db instance

hi Fauziah,

The control files should not all be kept in the same location. They are multiplexed precisely to protect against file corruption or a disk crash, and that protection is lost if every copy sits on the same disk.

If a control file has been lost or damaged to the point that Oracle cannot recognize it anymore, trying to start up the database will result in an ORA-00205 ("error in identifying control file '%s'") and an operating system level Oracle error such as ORA-27037 (unable to obtain file status).

In case of a disk crash and the loss of all the control files, the database would be harder to restore.

See metalink note 1014504.6 - Recovering From A Lost Control File.

The following are part of the backup and recovery basics:
- Running in Archivelog mode
- multiplexing the controlfile
- Taking regular backups
- Periodically doing a complete restore to test your procedures.

hope this helps!

kind regards
yogeeraj
No person was ever honoured for what he received. Honour has been the reward for what he gave (Calvin Coolidge)