HP-UX

Requesting log analysis

 
Youngmin
Advisor


Hello. I would like to ask for some advice.



Since a power outage, our servers have been logging errors.

Operations are currently unaffected, and after power was restored

all the LED lamps on the equipment looked normal.



Two of the three servers are showing these log entries.



Below are the tail of dmesg and part of /var/opt/resmon/log/event.log from each of them.



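First server

- dmesg shows LVM-related device errors and disk-related messages

- /var/opt/... has a message saying the LVM information changed

Second server

- dmesg shows the same LVM-related messages as the first server

- syslog entries

- /var/opt/... shows an error for the controller in slot B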



All three servers are attached to an FC60 and run under MC/SG.

The messages from each server follow.



Any advice would be appreciated.



============= First server: start ================

LVM: VG 64 0x010000:Failed to attach some device(s). Starting the deferred attach daemon.

LVM: VG 64 0x020000:Failed to attach some device(s). Starting the deferred attach daemon.

0/7/0/0.8 fcp

0/0/1/0.1 tgt

0/0/1/0.1.0 sdisk

0/7/0/0.8.0.4.0 fcparray

0/7/0/0.8.0.4.0.0 tgt

0/7/0/0.8.0.4.0.0.0 sdisk

0/7/0/0.8.0.4.0.0.1 sdisk

0/7/0/0.8.0.4.0.0.2 sdisk

0/7/0/0.8.0.4.0.0.3 sdisk

0/7/0/0.8.0.4.0.0.4 sdisk

0/7/0/0.8.0.4.0.1 tgt

0/7/0/0.8.0.4.0.1.0 sdisk

0/7/0/0.8.0.4.0.2 tgt

0/7/0/0.8.0.4.0.2.0 sdisk

0/7/0/0.8.0.4.0.3 tgt

0/7/0/0.8.0.4.0.3.0 sdisk

0/7/0/0.8.0.4.0.3.7 sdisk

0/7/0/0.8.0.255.0 fcpdev

0/7/0/0.8.0.255.0.4 tgt

0/7/0/0.8.0.255.0.4.0 sctl



>------------ Event Monitoring Service Event Notification ------------<



Notification Time: Tue May 10 12:35:22 2005



mdvan1 sent Event Monitor notification information:



/storage/events/disk_arrays/FC60/000A00A0B807A2A4

is >= 1.

Its current value is INFORMATION(1).







Event data from monitor:



Event Time..........: Tue May 10 12:35:22 2005

Severity............: INFORMATION

Monitor.............: fc60mon

Event #.............: 31

System..............: HOST1



Summary:

Disk Array at hardware path :

Array at hardware path 0/4/0/0.8.0.5.0.0.0, path 0/7/0/0.8.0.4.0.0.0: Lun

Configuration

details is stored in files named as alias_id.date in log path

/var/opt/hparray/log for every 24 hrs, if differences are found.







Description of Error:





This event message is an information from the monitor to notify the

difference found in LUN Configuration in 24 hrs.Helps to recover

a missing or defective LUN by looking at the details stored in files.





Probable Cause / Recommended Action:





No action required .



Additional Event Data:

System IP Address...: 61.78.62.191

Event Id............: 0x42802bfa00000000

Monitor Version.....: B.01.02

Event Class.........: I/O

Client Configuration File...........:

/var/stm/config/tools/monitor/default_fc60mon.clcfg

Client Configuration File Version...: A.01.01

Qualification criteria met.

Number of events..: 1

Associated OS error log entry id(s):

None

Additional System Data:

System Model Number.............: 9000/800/L2000-44

EMS Version.....................: A.04.00

STM Version.....................: A.45.00

Latest information on this event:

http://docs.hp.com/hpux/content/hardware/ems/fc60mon.htm#31



v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v







Product/Device Data:



Logger ID.........: (not available/applicable)

Product Identifier: (not available/applicable)

Product Qualifier.: (not available/applicable)

SCSI Target ID....: (not available/applicable)

SCSI LUN..........: (not available/applicable)





>---------- End Event Monitoring Service Event Notification ----------<



============= First server: end ================
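A side note for anyone triaging the same dmesg lines: each LVM message names the volume group only by a number pair (64 and 0x010000 or 0x020000 here). The sketch below pulls those pairs out of pasted dmesg text so they can be compared with the minor numbers of the /dev/<vgname>/group device files. That mapping is an assumption based on how HP-UX LVM group files are usually created, and the names `LVM_ATTACH_RE`, `failed_vg_minors` and `SAMPLE_DMESG` are illustrative only.

```python
import re

# Matches the start of lines such as:
#   LVM: VG 64 0x010000:Failed to attach some device(s). ...
# Only the beginning of the message is needed, so lines that a narrow
# terminal wrapped mid-word still match.
LVM_ATTACH_RE = re.compile(
    r"LVM: VG (?P<major>\d+) (?P<minor>0x[0-9a-fA-F]+):\s*Failed to attach"
)


def failed_vg_minors(dmesg_text):
    """Return the sorted, de-duplicated VG minor numbers that logged an attach failure."""
    minors = {m.group("minor").lower() for m in LVM_ATTACH_RE.finditer(dmesg_text)}
    return sorted(minors)


# The two lines posted for the first server.
SAMPLE_DMESG = """\
LVM: VG 64 0x010000:Failed to attach some device(s). Starting the deferred attach daemon.
LVM: VG 64 0x020000:Failed to attach some device(s). Starting the deferred attach daemon.
"""

if __name__ == "__main__":
    for minor in failed_vg_minors(SAMPLE_DMESG):
        print(minor)
```

Comparing the result with the output of `ll /dev/*/group` on the affected server should show which volume groups had devices attached late.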



============= Second server: start ================

LVM: VG 64 0x040000:Failed to attach some device(s). Starting the deferred attach daemon.

LVM: VG 64 0x030000:Failed to attach some device(s). Starting the deferred attach daemon.





May 9 17:16:52 mdvan2 syslog: AM60Srvr Started

May 10 12:32:13 mdvan2 EMS : ------ EMS Event Notification ------ Value: "SERIOUS (4)" for Resource: "/storage/events/disk_arrays/FC60/000A00A0B807A2A4" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 130285570 -r /storage/events/disk_arrays/FC60/000A00A0B807A2A4 -n 130285570 -a
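The EMS syslog entry packs three useful things into one long line: the severity, the resource path, and the exact `resdata` command that prints the event details. A minimal sketch for splitting them apart follows; it assumes the line keeps the layout shown above, and the function name `parse_ems_syslog` is made up for illustration.

```python
import re

# Severity and resource, as they appear in the syslog line quoted above.
EMS_SYSLOG_RE = re.compile(
    r'Value: "(?P<severity>[A-Z ]+?) \((?P<level>\d+)\)" '
    r'for Resource: "(?P<resource>[^"]+)"'
)
# Everything from the resdata path to the end of the line.
RESDATA_RE = re.compile(r"(/opt/resmon/bin/resdata\b.*)$")


def parse_ems_syslog(line):
    """Return severity, numeric level, resource and resdata command, or None."""
    match = EMS_SYSLOG_RE.search(line)
    if match is None:
        return None
    command = RESDATA_RE.search(line)
    return {
        "severity": match.group("severity"),
        "level": int(match.group("level")),
        "resource": match.group("resource"),
        "resdata_command": command.group(1).strip() if command else None,
    }


SAMPLE_SYSLOG = (
    'May 10 12:32:13 mdvan2 EMS : ------ EMS Event Notification ------ '
    'Value: "SERIOUS (4)" for Resource: '
    '"/storage/events/disk_arrays/FC60/000A00A0B807A2A4" (Threshold: >= " 3") '
    'Execute the following command to obtain event details: '
    '/opt/resmon/bin/resdata -R 130285570 '
    '-r /storage/events/disk_arrays/FC60/000A00A0B807A2A4 -n 130285570 -a'
)

if __name__ == "__main__":
    print(parse_ems_syslog(SAMPLE_SYSLOG))
```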



>------------ Event Monitoring Service Event Notification ------------<



Notification Time: Tue May 10 12:32:13 2005



mdvan2 sent Event Monitor notification information:



/storage/events/disk_arrays/FC60/000A00A0B807A2A4

is >= 1.

Its current value is SERIOUS(4).







Event data from monitor:



Event Time..........: Tue May 10 12:32:13 2005

Severity............: SERIOUS

Monitor.............: fc60mon

Event #.............: 6

System..............: HOST2



Summary:

Disk Array at hardware path :

Array at hardware path 0/2/1/0.8.0.5.0.0.0, path : The controller in slot B

has failed,

or is not accessible







Description of Error:





This event message is displayed by one of several conditions:



1. Problem with the connection.

2. Interface cable/terminator.

3. Controller.

4. Host but Adapter.





Probable Cause / Recommended Action:





Replace the indicated controller.



Additional Event Data:

System IP Address...: 61.78.62.193

Event Id............: 0x42802b3d00000000

Monitor Version.....: B.01.02

Event Class.........: I/O

Client Configuration File...........:

/var/stm/config/tools/monitor/default_fc60mon.clcfg

Client Configuration File Version...: A.01.01

Qualification criteria met.

Number of events..: 1

Associated OS error log entry id(s):

None

Additional System Data:

System Model Number.............: 9000/800/rp4440

EMS Version.....................: A.04.00

STM Version.....................: A.45.00

Latest information on this event:

http://docs.hp.com/hpux/content/hardware/ems/fc60mon.htm#6



v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v







Product/Device Data:



Logger ID.........: (not available/applicable)

Product Identifier: (not available/applicable)

Product Qualifier.: (not available/applicable)

SCSI Target ID....: (not available/applicable)

SCSI LUN..........: (not available/applicable)





>---------- End Event Monitoring Service Event Notification ----------<



============= Second server: end ================
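The two notifications differ mainly in a handful of fields (Severity, Event #, System), which are easy to lose in the long event.log entries. As a rough sketch, the dot-leader fields ("Severity............: SERIOUS") can be collected into a dictionary and compared against the threshold of 3 that appears in the syslog line. The severity table below contains only the two levels seen in this thread, and `parse_event_fields` and `is_at_or_above` are hypothetical helper names.

```python
import re

# A key, at least two leader dots, a colon, then the value on the same line.
FIELD_RE = re.compile(
    r"^(?P<key>[A-Za-z#/ ]+?)\.{2,}:[ \t]*(?P<value>.*)$", re.MULTILINE
)

# Only the levels that appear in this thread; other levels are not listed.
SEVERITY_LEVEL = {"INFORMATION": 1, "SERIOUS": 4}


def parse_event_fields(text):
    """Collect the dot-leader fields of one notification into a dict."""
    return {
        m.group("key").strip(): m.group("value").strip()
        for m in FIELD_RE.finditer(text)
    }


def is_at_or_above(fields, threshold=3):
    """True when the event severity is known and meets the threshold."""
    level = SEVERITY_LEVEL.get(fields.get("Severity", ""))
    return level is not None and level >= threshold


SAMPLE_EVENT = """\
Event Time..........: Tue May 10 12:32:13 2005

Severity............: SERIOUS

Monitor.............: fc60mon

Event #.............: 6

System..............: HOST2
"""

if __name__ == "__main__":
    fields = parse_event_fields(SAMPLE_EVENT)
    print(fields["System"], fields["Event #"], is_at_or_above(fields))
```

Run against both entries, this separates the informational event 31 on the first server from the SERIOUS event 6 on the second.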





Jongmin, Lee
Kindergarten


Hello,

this is Jongmin Lee.



First, on each server, please run

# amdsp -a {array_id}

and check whether any status in the output is something other than "ready".

If it is not "ready", please attach the output.



Judging from the logs above alone, the problem appears to be on the second server.



Regards.
Youngmin
Advisor


Thank you for the advice.



When I ran it, server 2 showed the following output.



Vendor ID = HP

Product ID = A5277A

Array ID = 000A00A0B807A2A4

Array Alias = fc60

-----------------------------------------

Array State = WARNING

Server Name = HOST2

Array Type = 3

Mfg. Product Code = 348-0040789

Date = Thu May 12 09:53:41 KST 2005

Path to Ctlr A VSB0 = 0/2/1/0.8.0.5.0.0.0



--- Disk space usage --------------------

Total Physical = 817.2 GB

Allocated to LUNs = 510.5 GB

Used as Hot Spare = 67.8 GB

Unallocated (avail for LUNs) = 0.0 GB

-----------------------------------------



THE FOLLOWING WARNING(S) EXIST:

- Array Enclosure Controller B: UNKNOWN



... (omitted)
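For checking several servers in one pass, the summary part of the amdsp output can be reduced to its "name = value" pairs and its warning list. The sketch below works on pasted output text only and does not call amdsp itself. Treating "READY" as the healthy Array State is an assumption taken from the wording of the reply above, and `parse_amdsp_summary` and `array_is_ready` are illustrative names.

```python
def parse_amdsp_summary(text):
    """Split pasted amdsp summary text into (fields, warnings)."""
    fields, warnings = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("- "):
            # Entries under "THE FOLLOWING WARNING(S) EXIST:"
            warnings.append(line[2:].strip())
        elif " = " in line:
            key, _, value = line.partition(" = ")
            fields[key.strip()] = value.strip()
    return fields, warnings


def array_is_ready(fields):
    """Assumption: a healthy array reports 'READY' as its Array State."""
    return fields.get("Array State", "").upper() == "READY"


SAMPLE_AMDSP = """\
Vendor ID = HP
Product ID = A5277A
Array ID = 000A00A0B807A2A4
Array Alias = fc60
-----------------------------------------
Array State = WARNING
Server Name = HOST2
Path to Ctlr A VSB0 = 0/2/1/0.8.0.5.0.0.0

--- Disk space usage --------------------
Total Physical = 817.2 GB
Allocated to LUNs = 510.5 GB
Used as Hot Spare = 67.8 GB
Unallocated (avail for LUNs) = 0.0 GB
-----------------------------------------

THE FOLLOWING WARNING(S) EXIST:
- Array Enclosure Controller B: UNKNOWN
"""

if __name__ == "__main__":
    fields, warnings = parse_amdsp_summary(SAMPLE_AMDSP)
    print(fields["Array State"], array_is_ready(fields), warnings)
```

In the output above only a path to controller A is listed, which fits the warning that controller B is in an unknown state.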



Also, the other server, which had not shown any messages before, is now showing a new one.



----- EMS Monitor Restart -----

Title: fc60mon

Command: /usr/sbin/stm/uut/bin/tools/monitor/fc60mon

Vendor: Hewlett-Packard Company

Version: B.01.02

To obtain a list of currently monitored resources,

execute the following: /opt/resmon/bin/resdata -M 1652016795



Here is the output of that command:

/opt/resmon/bin/resdata -M 1652016795

/storage/events/disk_arrays/FC60/000A00A0B807A2A4

/storage/events/disk_arrays/FC60/AM60Srvr

The resmon log contains a message saying the information changed within the last 24 hours.



*********

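Someone recommended rebooting the second server.

I suppose I should reboot it first and then see whether the same message

comes back.

As for the other server, I can't tell whether it is just showing routine information or something else.

I would appreciate an explanation.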



*********

I would also like to know how to read the following part of the amdsp output.

Could you explain it?



-----------------------------------------

LUN Status            Capacity  Ctrl RAID Segment Disks
--- ----------------- --------- ---- ---- ------- -----
  0 OPTIMAL           67.8 GB   A    5    8       1:0
                                                  3:0
                                                  5:0

  1 OPTIMAL           67.8 GB   A    5    4       2:0
                                                  4:0
                                                  6:0

  2 OPTIMAL           101.6 GB  A    5    16      1:1
                                                  2:1
                                                  4:1
                                                  5:1

  3 OPTIMAL           136.7 GB  A    5    16      1:2
                                                  3:2
                                                  5:2

  4 OPTIMAL           136.7 GB  B    5    16      2:2
                                                  4:2
                                                  6:2

 31 UTM:GOOD





LUN WCE RCD CME CWOB WCA RCA CMA

--- --- --- --- ---- --- --- ---

0 X X X X X



1 X X X X X



2 X X X X X



3 X X X X X



4 X





Total capacity of LUNs on controller A = 373.8 GB

Total capacity of LUNs on controller B = 136.7 GB

Total capacity of all configured LUNs = 510.5 GB
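One way to read the LUN table is to tie it back to the totals printed here: each row gives the LUN number, its status, its capacity, the owning controller (A or B), the RAID level, a segment size, and the member disks as channel:ID pairs, with extra disks on continuation lines. The sketch below parses that layout and sums capacity per controller. It is a rough parser written against this one excerpt, and the names `parse_lun_table` and `capacity_by_controller` are illustrative.

```python
import re
from collections import defaultdict

# A full row: LUN, status, capacity in GB, controller, RAID level, segment, first disk.
ROW_RE = re.compile(
    r"^\s*(?P<lun>\d+)\s+(?P<status>\S+)\s+(?P<cap>\d+(?:\.\d+)?) GB\s+"
    r"(?P<ctrl>[AB])\s+(?P<raid>\d+)\s+(?P<segment>\d+)\s+(?P<disk>\d+:\d+)\s*$"
)
# A continuation line holding one more channel:ID pair.
DISK_RE = re.compile(r"^\s*(?P<disk>\d+:\d+)\s*$")


def parse_lun_table(text):
    """Return one dict per LUN row; continuation lines extend the previous row."""
    luns = []
    for line in text.splitlines():
        row = ROW_RE.match(line)
        if row:
            luns.append({
                "lun": int(row.group("lun")),
                "status": row.group("status"),
                "capacity_gb": float(row.group("cap")),
                "ctrl": row.group("ctrl"),
                "raid": int(row.group("raid")),
                "disks": [row.group("disk")],
            })
            continue
        extra = DISK_RE.match(line)
        if extra and luns:
            luns[-1]["disks"].append(extra.group("disk"))
    return luns


def capacity_by_controller(luns):
    """Sum the displayed LUN capacities for each owning controller."""
    totals = defaultdict(float)
    for lun in luns:
        totals[lun["ctrl"]] += lun["capacity_gb"]
    return dict(totals)


SAMPLE_LUNS = """\
0 OPTIMAL 67.8 GB A 5 8 1:0
3:0
5:0
1 OPTIMAL 67.8 GB A 5 4 2:0
4:0
6:0
2 OPTIMAL 101.6 GB A 5 16 1:1
2:1
4:1
5:1
3 OPTIMAL 136.7 GB A 5 16 1:2
3:2
5:2
4 OPTIMAL 136.7 GB B 5 16 2:2
4:2
6:2
31 UTM:GOOD
"""

if __name__ == "__main__":
    for ctrl, total in sorted(capacity_by_controller(parse_lun_table(SAMPLE_LUNS)).items()):
        print(ctrl, round(total, 1))
```

Summing the displayed values gives about 373.9 GB for controller A against the reported 373.8 GB, a difference that is consistent with each row being rounded to one decimal. Controller B owns only LUN 4, so its 136.7 GB matches exactly.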



... (omitted) ...



SCSI Channel:ID = 1:0

Enclosure = 0

Slot (0-based) = 0

Disk State = OPTIMAL

Disk Group and Type = 07A4F0000C39925CEB LUN

Capacity = 33.9 GB

Manufacturer and Model = SEAGATE ST336704LC

Serial Number = 3CD01SZ7

Firmware Revision = HP01
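The disk detail can be cross-checked against the LUN table. LUN 0 is RAID 5 across three disks (1:0, 3:0, 5:0), and disk 1:0 is listed at 33.9 GB. RAID 5 spends one disk's worth of space on parity, so usable capacity is (n - 1) times the disk size. The sketch below assumes the other member disks are the same size as 1:0, which the omitted part of the output would have to confirm; `raid5_usable_gb` is a made-up helper name.

```python
def raid5_usable_gb(disk_count, disk_size_gb):
    """Usable capacity of a RAID 5 set of equally sized disks."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    # One disk's worth of capacity holds parity.
    return (disk_count - 1) * disk_size_gb


if __name__ == "__main__":
    # LUN 0: three disks, assuming all are 33.9 GB like disk 1:0.
    print(round(raid5_usable_gb(3, 33.9), 1))
    # LUN 2: four disks, same assumption about disk size.
    print(round(raid5_usable_gb(4, 33.9), 1))
```

Under that assumption LUN 0 works out to the 67.8 GB shown in the table, and LUN 2 to about 101.7 GB against the displayed 101.6 GB, which again looks like display rounding. LUNs 3 and 4 reach 136.7 GB with three disks each, so they would have to be built from larger disks.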



Thank you, and keep up the good work.