Tru64 Unix

Question about disk status

 
watermelonyu
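Professor

Question about disk status

Hello everyone,

One of my machines runs operating system version 4.0F and has several disks attached, all of which are good. Their status is shown below. Could you please help me interpret it?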

/dev/rrz139c:character special (8/281602) SCSI #17 BD018635 disk #1112 (SCSI ID #3) (SCSI LUN #0) offline

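(Does "offline" mean that this disk is currently not in use, or that no data is being read or written? This disk is not in a dg (disk group).)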

/dev/rrz16c:character special (8/32770) SCSI #2 BD018645 disk #128 (SCSI ID #0) (SCSI LUN #0)

(Does this status mean that the disk is currently in use, or that it is idle? This disk is in a dg.)

/dev/rrz2c:character special (8/2050) SCSI #0 AD018323 disk #16 (SCSI ID #2) (SCSI LUN #0) errors = 12/3

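(Does this status mean that there is a problem with this disk? My system disk shows the same thing, yet data reads and writes work without any problem!)

Please help me interpret these when you have time. Thanks in advance!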



7 replies
watermelonyu
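Professor

Question about disk status

Are you sure that the mount point of the offline disk listed below still works normally?

If it shows "offline", the disk is probably no longer present.

You can try rescanning the bus: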

#scu scan edt

Or remove the device name, use MAKEDEV to create the device name again, and then retry.



/dev/rrz139c:character special (8/281602) SCSI #17 BD018635 disk #1112 (SCSI ID #3) (SCSI LUN #0) offline

*********************************

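/dev/rrz16c:character special (8/32770) SCSI #2 BD018645 disk #128 (SCSI ID #0) (SCSI LUN #0)

This is what a normal message looks like.

It means the disk is online; it has nothing to do with whether the disk is idle.

Even if this disk had not been added to a file domain, its status would look the same.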

*********************************

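/dev/rrz2c:character special (8/2050) SCSI #0 AD018323 disk #16 (SCSI ID #2) (SCSI LUN #0) errors = 12/3

This means that I/O errors have already occurred. I would still recommend replacing the disk.

You can use other logs to judge whether there really is a problem: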

uerr -R or syslog
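
The three status forms discussed in this reply can be told apart mechanically. Below is a minimal Python sketch, assuming the lines look exactly like the ones quoted in the question; the function name `classify` and the category strings are made up for illustration, and the categories follow the interpretation given in this reply rather than any official documentation.

```python
# Sample lines copied from the question in this thread.
SAMPLES = [
    "/dev/rrz139c:character special (8/281602) SCSI #17 BD018635 disk #1112 (SCSI ID #3) (SCSI LUN #0) offline",
    "/dev/rrz16c:character special (8/32770) SCSI #2 BD018645 disk #128 (SCSI ID #0) (SCSI LUN #0)",
    "/dev/rrz2c:character special (8/2050) SCSI #0 AD018323 disk #16 (SCSI ID #2) (SCSI LUN #0) errors = 12/3",
]


def classify(line):
    """Return (device, state) for one disk status line (hypothetical helper)."""
    device, _, rest = line.partition(":")
    # Whatever follows the last ")" is the optional status suffix.
    suffix = rest.rsplit(")", 1)[-1].strip()
    if suffix == "offline":
        state = "offline"
    elif suffix.startswith("errors"):
        state = "online, errors logged"
    elif suffix == "":
        state = "online"
    else:
        state = "unknown: " + suffix
    return device, state


for sample in SAMPLES:
    print(*classify(sample))
```

An empty suffix is treated as "online" only because that is how the reply reads the rz16 line; it says nothing about whether the disk is idle.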





watermelonyu
Professor

Question about disk status

That was a typo.

It should be uerf -R
watermelonyu
Professor

Question about disk status

joey:

I just ran voldisk list and got the following:

rz139 sliced - - offline

rz16a  simple rz16a rootdg online

rz2 sliced rz2 rootdg online



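So it seems the system really is not using rz139, while rz16 is in use. rz2 is a new disk that was just swapped in, and every one of the several replacement disks I tried showed this same status!

The status of the replaced disk as shown by scu show edt is: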

Device: BD018635C4 Bus: 17, Target: 3, Lun: 0, Type: Direct Access ->rz139

Device: ST318203LC Bus: 0, Target: 2, Lun: 0, Type: Direct Access ->rz2

rz2 is a dm in rootdg. After rz2 failed, I replaced it in order to repair rootdg, but the following error appeared. The state of rootdg is shown after it:

smp003:/dev>volrecover -sb -g rootdg

smp003:/dev>fsgen/volplex: Plex swapvol02-01 in volume swapvol02 is locked by another utility



#volprint -g rootdg -ht



v swapvol02 fsgen ENABLED ACTIVE 16776176 SELECT -

pl swapvol02-01 swapvol02 DISABLED RECOVER 16776176 CONCAT - WO

sd rz2a-01 swapvol02-01 0 0 16776176 rz2a rz2a

pl swapvol02-02 swapvol02 ENABLED ACTIVE 16776176 CONCAT - RW

sd rz18a-01 swapvol02-02 0 0 16776176 rz18a rz18a



watermelonyu
Professor

Question about disk status

Hi all,



> /dev/rrz2c:character special (8/2050) SCSI #0 AD018323 disk #16 (SCSI ID #2) (SCSI LUN #0) errors = 12/3



The number 12 is the count of software-correctable errors.

The other number, 3, is the count of hardware errors (these can perhaps be ignored if they came from a SCSI bus reset). Please analyze them with "uerf", or with "dia" if DECevent is installed.
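
To make the soft/hard reading of "errors = 12/3" concrete, here is a minimal Python sketch that pulls the two counters out of a status line. It assumes the first number is the soft count and the second the hard count, as described above; the helper name `error_counts` is hypothetical.

```python
import re

# Matches the trailing "errors = <soft>/<hard>" part of a status line.
ERRORS_RE = re.compile(r"errors\s*=\s*(\d+)/(\d+)")


def error_counts(line):
    """Return soft, hard and total error counts, or None if none are shown."""
    match = ERRORS_RE.search(line)
    if match is None:
        return None
    soft, hard = int(match.group(1)), int(match.group(2))
    return {"soft": soft, "hard": hard, "total": soft + hard}


line = ("/dev/rrz2c:character special (8/2050) SCSI #0 AD018323 "
        "disk #16 (SCSI ID #2) (SCSI LUN #0) errors = 12/3")
print(error_counts(line))
```

A line with no "errors =" suffix returns None, which matches the normal online case.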



> #volprint -g rootdg -ht

>

> v swapvol02 fsgen ENABLED ACTIVE 16776176 SELECT -

> pl swapvol02-01 swapvol02 DISABLED RECOVER 16776176 CONCAT - WO

> sd rz2a-01 swapvol02-01 0 0 16776176 rz2a rz2a

> pl swapvol02-02 swapvol02 ENABLED ACTIVE 16776176 CONCAT - RW

> sd rz18a-01 swapvol02-02 0 0 16776176 rz18a rz18a



The plex swapvol02-01, which contains subdisk rz2a-01, shows a RECOVER state in WO (write-only) mode, which suggests that a "volrecover" operation is running right now.



Please wait for it to finish. Otherwise, the LSM recovery and bad-disk replacement will require calling HP to fix it ASAP!
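
For anyone who wants to spot such plexes quickly, here is a minimal Python sketch that scans the "volprint -ht" lines quoted above and reports plexes that are not ENABLED/ACTIVE. The column positions (name, volume, kernel state, state, length, layout, columns/width, mode) are inferred from the quoted output and should be treated as an assumption; `unhealthy_plexes` is a hypothetical helper name.

```python
# Output lines copied from the "volprint -g rootdg -ht" listing in this thread.
VOLPRINT = """\
v swapvol02 fsgen ENABLED ACTIVE 16776176 SELECT -
pl swapvol02-01 swapvol02 DISABLED RECOVER 16776176 CONCAT - WO
sd rz2a-01 swapvol02-01 0 0 16776176 rz2a rz2a
pl swapvol02-02 swapvol02 ENABLED ACTIVE 16776176 CONCAT - RW
sd rz18a-01 swapvol02-02 0 0 16776176 rz18a rz18a
"""


def unhealthy_plexes(text):
    """Return the plex records whose state is not ENABLED/ACTIVE."""
    found = []
    for raw in text.splitlines():
        fields = raw.split()
        # Only "pl" records are of interest; they have nine fields here.
        if len(fields) < 9 or fields[0] != "pl":
            continue
        name, volume, kstate, state = fields[1:5]
        if (kstate, state) != ("ENABLED", "ACTIVE"):
            found.append({"plex": name, "volume": volume,
                          "kstate": kstate, "state": state,
                          "mode": fields[8]})
    return found


for plex in unhealthy_plexes(VOLPRINT):
    print(plex)
```

On the quoted listing this reports only swapvol02-01, the plex on rz2a, while the mirror on rz18a is skipped as healthy.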



Best regards,

Richard.

watermelonyu
Professor

Question about disk status

Hi all,



> /dev/rrz2c:character special (8/2050) SCSI #0 AD018323 disk #16 (SCSI ID #2) (SCSI LUN #0) errors = 12/3



The "12/3" error counts will not be cleared until the system is restarted.



Best regards,

Richard.
watermelonyu
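Professor

Question about disk status

Hi Richard Kuo,

Thank you for your reply!

When I run volrecover, I get the following error:

#volrecover -sb -g rootdg

fsgen/volplex: Plex swapvol02-01 in volume swapvol02 is locked by another utility

What is going on here?

Also, could you post a bit more information about disk error reports such as the errors = 12/3 above, so that I can learn from it?

Thanks in advance!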
watermelonyu
Professor

Question about disk status

Hi,



> When I run volrecover, I get the following error:

> #volrecover -sb -g rootdg

> fsgen/volplex: Plex swapvol02-01 in volume swapvol02 is locked by another utility



rz2 is still working, so there is no need to run "volrecover". The command fails because the plex is locked by swap usage.



PS: A disk needs to be replaced, and "volrecover" run, only when "voldisk list" reports "offline" or "error" for it. For now, please restart the affected system and check again.
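
The rule in the PS can be sketched in Python against the "voldisk list" output quoted earlier in the thread. This is only an illustration: the five-column layout (device, type, disk, group, status) is assumed from that quoted output, and `disks_needing_attention` is a hypothetical helper name.

```python
# Lines copied from the "voldisk list" output quoted earlier in this thread.
VOLDISK_LIST = """\
rz139 sliced - - offline
rz16a simple rz16a rootdg online
rz2 sliced rz2 rootdg online
"""


def disks_needing_attention(text):
    """Return devices whose status column mentions "offline" or "error"."""
    flagged = []
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        device = fields[0]
        status = " ".join(fields[4:])
        if "offline" in status or "error" in status:
            flagged.append(device)
    return flagged


print(disks_needing_attention(VOLDISK_LIST))
```

On the quoted output only rz139 is flagged. rz2 is reported as online, which is consistent with the advice not to run "volrecover" against it yet.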



> Could you post a bit more information about disk error reports such as the errors = 12/3 above?

I think these 12/3 (soft/hard) errors on rz2 are correctable.

"uerf -R" should also show more than 15 (12+3) SCSI CAM events. Please analyze these errors with "uerf -R -o full", or with "dia -R" if DECevent is installed, or with "wsea x trans" if this is a DS/ES40 (EV6 CPU) system.



Best regards,

Richard.