Operating System - Tru64 Unix

Re: Domain Panic

 
SOLVED
admin1979
Super Advisor

Domain Panic


Hello,

We have a Tru64 host with a few domains created. It also has an external disk array connected via SCSI cables.
One of the admins accidentally removed the SCSI connectors from the external disk array, and that made one of the domains
inaccessible (domain panic). The domain just has some data files in it, so the server is still up.

Now the question is: to bring the system back to its previous state, if we shut down the host, reconnect
the cables as before, and then start the host, will the system come up?
I am thinking of commenting out the domain's entries in /etc/fstab, bringing the system up, and then trying to
mount the domains/shares one by one.

Kindly suggest at the earliest.

Thanx,
admin
25 REPLIES
admin1979
Super Advisor

Re: Domain Panic

Hello,

One more update: we found the errors below in the messages file.

Nov 3 11:22:04 SERVER1 vmunix: io/vol.c(volerror): Uncorrectable write error on volume vol_source, plex pl-02, block 24101008
Nov 3 11:22:04 SERVER1 vmunix: io/vol.c(volerror): Uncorrectable write error on volume vol_source, plex pl-01, block 24101008
Nov 3 11:22:04 SERVER1 vmunix: io/vol.c(volerror): Uncorrectable write error on volume vol_source, plex pl-02, block 17776
Nov 3 11:22:04 SERVER1 vmunix: io/vol.c(volerror): Uncorrectable write error on volume vol_source, plex pl-01, block 17776
Nov 3 11:22:05 SERVER1 vmunix: voliod_error: plex detach - volume vol_source, plex pl-01
Nov 3 11:22:05 SERVER1 vmunix: AdvFS I/O error:
Nov 3 11:22:05 SERVER1 vmunix: Domain#Fileset: source_domain#util
Nov 3 11:22:05 SERVER1 vmunix: Mounted on: /tools
Nov 3 11:22:05 SERVER1 vmunix: Volume: /dev/vol/sourcedg/vol_source
Nov 3 11:22:05 SERVER1 vmunix: Tag: 0x000198be.8007
Nov 3 11:22:05 SERVER1 vmunix: Page: 304
Nov 3 11:22:06 SERVER1 vmunix: Block: 24101008
Nov 3 11:22:06 SERVER1 vmunix: Block count: 16
Nov 3 11:22:06 SERVER1 vmunix: Type of operation: Write
Nov 3 11:22:06 SERVER1 vmunix: Error: 5
Nov 3 11:22:06 SERVER1 vmunix: To obtain the name of the file on which
Nov 3 11:22:06 SERVER1 vmunix: the error occurred, type the command:
Nov 3 11:22:06 SERVER1 vmunix: /sbin/advfs/tag2name /tools/.tags/104638
Nov 3 11:22:06 SERVER1 vmunix: AdvFS I/O error:
Nov 3 11:22:06 SERVER1 vmunix: Volume: /dev/vol/sourcedg/vol_source
Nov 3 11:22:06 SERVER1 vmunix: Tag: 0xfffffff7.0000
Nov 3 11:22:06 SERVER1 vmunix: Page: 289
Nov 3 11:22:06 SERVER1 vmunix: Block: 17776
Nov 3 11:22:06 SERVER1 vmunix: Block count: 16
Nov 3 11:22:06 SERVER1 vmunix: Type of operation: Write
Nov 3 11:22:06 SERVER1 vmunix: Error: 5
Nov 3 11:22:06 SERVER1 vmunix:
Nov 3 11:22:06 SERVER1 vmunix: bs_osf_complete: metadata write failed
Nov 3 11:22:06 SERVER1 vmunix: AdvFS Domain Panic; Domain source_domain Id 0x36fbab0b.000d26ec
Nov 3 11:22:06 SERVER1 vmunix: An AdvFS domain panic has occurred due to either a metadata write error or an internal inconsistency. This domain is being rendered inaccessible.
Nov 3 11:22:06 SERVER1 vmunix: Please refer to guidelines in AdvFS Guide to File System Administration regarding what steps to take to recover this domain.


Thanx,
admin
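
For anyone triaging a similar log, here is a minimal Python sketch (not part of the original post) that pulls the key facts out of messages-file lines like the ones above: the LSM write errors per plex, the AdvFS file tag, and the name of the panicked domain. The helper name `summarize` and the regular expressions are illustrative assumptions based only on the lines shown in this post. One detail worth noticing is that the hexadecimal part of the tag before the dot (0x000198be) converts to the decimal number in the suggested tag2name path (104638).

```python
import re

# Sample lines copied from the log above (only the ones this sketch needs).
SAMPLE = """\
Nov 3 11:22:04 SERVER1 vmunix: io/vol.c(volerror): Uncorrectable write error on volume vol_source, plex pl-02, block 24101008
Nov 3 11:22:04 SERVER1 vmunix: io/vol.c(volerror): Uncorrectable write error on volume vol_source, plex pl-01, block 24101008
Nov 3 11:22:05 SERVER1 vmunix: Tag: 0x000198be.8007
Nov 3 11:22:06 SERVER1 vmunix: /sbin/advfs/tag2name /tools/.tags/104638
Nov 3 11:22:06 SERVER1 vmunix: AdvFS Domain Panic; Domain source_domain Id 0x36fbab0b.000d26ec
"""

# Patterns are assumptions derived from the message layout shown in this thread.
VOLERROR = re.compile(r"Uncorrectable (\w+) error on volume (\S+), plex (\S+), block (\d+)")
TAG = re.compile(r"Tag: 0x([0-9a-f]+)\.([0-9a-f]+)")
PANIC = re.compile(r"AdvFS Domain Panic; Domain (\S+) Id")


def summarize(log_text):
    """Hypothetical helper: collect plex errors, file tags, and panicked domains."""
    summary = {"plex_errors": [], "tags": [], "panicked_domains": []}
    for line in log_text.splitlines():
        m = VOLERROR.search(line)
        if m:
            op, volume, plex, block = m.groups()
            summary["plex_errors"].append((volume, plex, op, int(block)))
            continue
        m = TAG.search(line)
        if m:
            # Convert the hex part before the dot to decimal so it can be
            # compared with the number in the .tags path.
            summary["tags"].append(int(m.group(1), 16))
            continue
        m = PANIC.search(line)
        if m:
            summary["panicked_domains"].append(m.group(1))
    return summary


result = summarize(SAMPLE)
print(result)
```

In this log both plexes of vol_source report write errors on the same blocks, which fits the picture of the whole external array going away rather than a single disk failing.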
Kapil Jha
Honored Contributor

Re: Domain Panic

You can try reconnecting the connector, scanning the disks again, and checking whether they are available.
If it is only related to the SCSI connector, I don't think you need to comment out the entries in /etc/fstab.
But it's a good idea.

BR,
Kapil+
I am in this small bowl, I wane see the real world......
admin1979
Super Advisor

Re: Domain Panic

Thanks Kapil. The problem is that I can already see all the disks in scu show. I did a scan anyway; it shows 17 disks.

But when I try to access any share, I get:

# cd /opt
ksh[6]: /opt: permission denied


So it looks like we need to do something to remap the domain.
We use LSM.

Thanx.
admin1979
Super Advisor

Re: Domain Panic



Any help would be appreciated.
Venkatesh BL
Honored Contributor

Re: Domain Panic

Are the devices showing up with the same names as before? Or have the SCSI IDs changed? If the names are the same, then you can try showfdmn directly.
admin1979
Super Advisor

Re: Domain Panic

Yes, the SCSI IDs are the same as before. I cross-checked with scu show edt. And we have already run showfdmn in /etc/fdmns/.

We get the following error for that domain:

# showfdmn source_domain
showfdmn: unable to get info for domain 'source_domain'
showfdmn: error = I/O error

Thanx
Kapil Jha
Honored Contributor

Re: Domain Panic

Can you please try
voldctl disable and then enable it again?

If that does not work,
just try to export and import the dg.

BR,
Kapil+
I am in this small bowl, I wane see the real world......
admin1979
Super Advisor

Re: Domain Panic

As mentioned, we use LSM. We have the disk groups below on this host.

# voldg list
NAME       STATE      ID
rootdg     enabled    966101943.1025.bwgb550.vt.siemens.de
sourcedg   disabled   1007210557.3627.bwgb550.vt.siemens.de
dokudg     enabled    940102215.1973.bwgb550.vt.siemens.de
toolsdg    enabled    944403128.1335.bwgb550.vt.siemens.de

If I run the voldctl disable command, will it affect any of the working dgs? I mean, it should not, but the host also has rootdg on it.
So I am concerned.

If disable/enable does not work, can you please give me the further steps to deport/import?
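
As a small aid for reading listings like this, here is a minimal Python sketch (an illustration, not something from the original poster) that groups disk groups by their state using the `voldg list` output pasted above. The helper name `disk_groups_by_state` is hypothetical, and the parsing assumes the three whitespace-separated columns shown in this post.

```python
# Output copied from the `voldg list` run above.
VOLDG_LIST = """\
NAME STATE ID
rootdg enabled 966101943.1025.bwgb550.vt.siemens.de
sourcedg disabled 1007210557.3627.bwgb550.vt.siemens.de
dokudg enabled 940102215.1973.bwgb550.vt.siemens.de
toolsdg enabled 944403128.1335.bwgb550.vt.siemens.de
"""


def disk_groups_by_state(listing):
    """Hypothetical helper: map each state to the disk groups in that state."""
    states = {}
    for line in listing.splitlines()[1:]:  # skip the NAME STATE ID header
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank or malformed lines
        name, state = fields[0], fields[1]
        states.setdefault(state, []).append(name)
    return states


groups = disk_groups_by_state(VOLDG_LIST)
print(groups.get("disabled", []))
```

Here only sourcedg is reported as disabled, while rootdg, dokudg, and toolsdg are enabled.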
Kapil Jha
Honored Contributor

Re: Domain Panic

From your post, I think only your sourcedg is affected, and if you do anything on it, the other DGs will not be impacted.

I would suggest:
export dg
voldctl disable/enable
import dg

Should work!!

Touch only the affected dg and nothing will happen to the others :)

BR,
Kapil+
I am in this small bowl, I wane see the real world......
admin1979
Super Advisor

Re: Domain Panic

The commands that you provided are not clear to me. I am sorry.

export dg
voldctl disable/enable
import dg


Can you please give me the exact commands for sourcedg?

Your help will be appreciated.
Kapil Jha
Honored Contributor
Solution

Re: Domain Panic

export dg

#voldg export sourcedg [use -f if not happening]

#voldctl disable
#voldctl enable

#voldg deport sourcedg

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Kapil Jha
Honored Contributor

Re: Domain Panic

Oops, typo.

deport dg

#voldg deport sourcedg [use -f if not happening]

#voldctl disable
#voldctl enable

#voldg import sourcedg

BR,
Kapil+
I am in this small bowl, I wane see the real world......
admin1979
Super Advisor

Re: Domain Panic

Thank you. To start with, I cross-checked the first command against the man page.

#voldg export sourcedg [use -f if not happening]

But unfortunately I could not find an export option for the voldg command.
There are only import/deport options.

Or am I missing something here? Please let me know.
By the way, ours is a Tru64 V4.0D system.
admin1979
Super Advisor

Re: Domain Panic

Ohhhh,

Are we sure this time,

#voldg deport sourcedg [use -f if not happening]

#voldctl disable
#voldctl enable

#voldg import sourcedg

By the way, I could not find a -f option for deport.
Kapil Jha
Honored Contributor

Re: Domain Panic

aaaaaahhhhhhhh........
I have no idea about V4.... I am sorry, mate :(

I cannot help you, and I am not sure whether these commands will run there or not.

I suppose the concept should be the same; you just need to figure out which commands are available....

BR,
Kapil+
I am in this small bowl, I wane see the real world......
admin1979
Super Advisor

Re: Domain Panic

The commands are present, as I can see in the man pages.
Will a reboot fix this problem? What do you think?
Kapil Jha
Honored Contributor

Re: Domain Panic

If the commands are there, I don't think there is any harm in running them.
At least you would save a reboot; otherwise, I suppose a reboot would definitely solve the issue.
BUT
before rebooting, try to run the commands.

BR,
Kapil+
I am in this small bowl, I wane see the real world......
admin1979
Super Advisor

Re: Domain Panic

Ok hope so.

Anyway, the first command said:

# voldg deport sourcedg
voldg: Disk group sourcedg: Some volumes in the disk group are in use.

Is it because a few of the filesystems from sourcedg are still mounted and could not be unmounted? They are giving a "Device busy" error.


For your information,

here is the output from my working sourcedg, taken a long time ago:

volprint -g sourcedg
TYPE  NAME        ASSOC       KSTATE   LENGTH     COMMENT
dg    sourcedg    sourcedg    -        -
dm    rz10        -           -        -
dm    rz11        rz11        -        35564040
dm    rz17        rz17        -        35564040
dm    rz18        rz18        -        35564040
dm    rz19        rz19        -        71130960
dm    rz20        rz20        -        35564040
dm    rz9         rz9         -        35564040
sd    rz10-01     -           -        35564040
sd    rz11-01     pl-01       -        35564040
sd    rz17-01     pl-02       -        35564040
sd    rz18-01     pl-02       -        35564040
sd    rz19-01     pl-02       -        35564040
sd    rz20-01     pl-01       -        35564040
sd    rz9-01      pl-01       -        35564040
plex  pl-01       vol_source  ENABLED  106692120
plex  pl-02       vol_source  ENABLED  106692120
vol   vol_source  fsgen       ENABLED  106692120

So what do we do now?
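
As a quick sanity check on the layout above, here is a minimal Python sketch (an editorial illustration, not the poster's method) that adds up the subdisk lengths associated with each plex and compares the total with the plex length. It assumes the LENGTH column uses the same units for sd and plex records and that each plex is simply the concatenation of its subdisks; the helper name `check_plex_lengths` is hypothetical.

```python
# sd and plex records copied from the volprint output above.
VOLPRINT = """\
sd rz10-01 - - 35564040
sd rz11-01 pl-01 - 35564040
sd rz17-01 pl-02 - 35564040
sd rz18-01 pl-02 - 35564040
sd rz19-01 pl-02 - 35564040
sd rz20-01 pl-01 - 35564040
sd rz9-01 pl-01 - 35564040
plex pl-01 vol_source ENABLED 106692120
plex pl-02 vol_source ENABLED 106692120
"""


def check_plex_lengths(listing):
    """Hypothetical helper: True per plex if its subdisk lengths sum to its length."""
    sd_totals = {}
    plex_lengths = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip anything that is not a 5-column record
        rec_type, name, assoc, _kstate, length = fields
        if rec_type == "sd" and assoc != "-":
            # Accumulate subdisk length under the plex it is associated with.
            sd_totals[assoc] = sd_totals.get(assoc, 0) + int(length)
        elif rec_type == "plex":
            plex_lengths[name] = int(length)
    return {plex: sd_totals.get(plex, 0) == size for plex, size in plex_lengths.items()}


checks = check_plex_lengths(VOLPRINT)
print(checks)
```

With the figures above, each plex has three subdisks of 35564040, which add up to the 106692120 shown for both pl-01 and pl-02. The subdisk rz10-01 is not associated with either plex in this listing, so the sketch leaves it out of the totals.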
admin1979
Super Advisor

Re: Domain Panic

Should I try this,

volume -g sourcedg stopall

They say "To deport a disk group: If applicable, stop the volumes"

admin1979
Super Advisor

Re: Domain Panic

That command did not work :(

# volume -g sourcedg stopall
volume: Diskgroup sourcedg not found

Any idea??
Kapil Jha
Honored Contributor

Re: Domain Panic

:(
I think you should try to reboot the server.

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Venkatesh BL
Honored Contributor

Re: Domain Panic

Try 'reboot' if that is possible.
admin1979
Super Advisor

Re: Domain Panic

We rebooted the system as a last resort, and it came up with all the LSM domains.
All's well.

Thanx Kapil for the timely advice.
Thanx V. BL as well.
You all get points.


Admin
admin1979
Super Advisor

Re: Domain Panic

Thanx