Oracle RAC installation problem

BAUKnight
Frequent Advisor

Oracle RAC installation problem

Dear all,
We are running HP-UX on two Integrity servers, an rx6600 and an rx4640, with an HP EVA4100 as our SAN storage.
Our Oracle DBA is trying to install Oracle RAC, but we are stuck: while creating the disk group, the process hangs and we cannot continue.

Here are the details of my machines:

root@dbclust1:/>uname -a
HP-UX dbclust1 B.11.23 U ia64 0883107040 unlimited-user license
root@dbclust1:/>model
ia64 hp server rx6600
root@dbclust1:/>

root@dbclust2:/>uname -a
HP-UX dbclust2 B.11.23 U ia64 1432413547 unlimited-user license
root@dbclust2:/>model
ia64 hp server rx4640
root@dbclust2:/>

on node1,

#ioscan -funC disk | more

Class     I  H/W Path             Driver  S/W State  H/W Type  Description
====================================================================================
disk     27  0/3/1/0.1.0.0.0.1.0  sdisk   CLAIMED    DEVICE    HP HSV200
             /dev/dsk/c2t1d0   /dev/rdsk/c2t1d0   /dev/rdsk/db_database

on node2,
#ioscan -funC disk | more

Class     I  H/W Path             Driver  S/W State  H/W Type  Description
====================================================================================
disk     10  0/3/2/0.1.0.0.0.1.0  sdisk   CLAIMED    DEVICE    HP HSV200
             /dev/dsk/c2t1d0   /dev/rdsk/c2t1d0   /dev/rdsk/db_database
             /dev/dsk/c6t1d0   /dev/rdsk/c6t1d0

On node2 there are two different device files; the original one is /dev/dsk/c6t1d0. We therefore followed the Oracle RAC installation manual and created the virtual device files with these commands:

#mksf -C disk -H 0/3/2/0.1.0.0.0.1.0 -I 10 /dev/dsk/c6t1d0
#mksf -C disk -H 0/3/2/0.1.0.0.0.1.0 -I 10 -r /dev/rdsk/c6t1d0


Our DBA asked Oracle about the problem, and they asked us to run the following commands on each machine. Here are the commands and their results:

*************
Node2

root@dbclust2:/>id
uid=0(root) gid=3(sys) groups=0(root),1(other),2(bin),4(adm),5(daemon),6(mail),7(lp),20(users)
root@dbclust2:/>dd if=/dev/dsk/c2t1d0 of=b.txt bs=1024 count=100
100+0 records in
100+0 records out
root@dbclust2:/>ll b.txt
-rw-rw-rw- 1 root sys 102400 May 6 08:40 b.txt
root@dbclust2:/>cat b.txt
root@dbclust2:/>
root@dbclust2:/>rcp b.txt dbclust1:/b.txt
root@dbclust2:/>

*************
Node1

root@dbclust1:/>id
uid=0(root) gid=3(sys) groups=0(root),1(other),2(bin),4(adm),5(daemon),6(mail),7(lp),20(users)
root@dbclust1:/>dd if=/dev/dsk/c2t1d0 of=a.txt bs=1024 count=100
100+0 records in
100+0 records out
root@dbclust1:/>
root@dbclust1:/>ll a.txt
-rw-rw-rw- 1 root sys 102400 May 6 08:54 a.txt
root@dbclust1:/>ll b.txt
-rw-rw-rw- 1 root sys 102400 May 6 08:55 b.txt
root@dbclust1:/>cat a.txt
root@dbclust1:/>cat b.txt
root@dbclust1:/>diff a.txt b.txt
root@dbclust1:/>

***************************************

I wonder why the output of the commands is empty. Both a.txt and b.txt show nothing when displayed with cat (although ll reports 102400 bytes for each), so naturally diff reports no difference.

Can anyone advise on this? If you need any other information, please ask.

We are in a critical situation, so please advise as soon as possible.

Regards.
10 REPLIES
Ivan Krastev
Honored Contributor

Re: Oracle RAC installation problem

With these dd commands you are just copying some raw data from the disks into the files a.txt and b.txt.
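
A likely reason cat prints nothing is that the first 100 KB of the LUN contain only zero (NUL) bytes, which a terminal does not display. That is an assumption, since the thread never shows the bytes themselves. Below is a minimal sketch of how it could be checked, written for a generic POSIX shell rather than tested on HP-UX 11.23. The helper name is_all_zero is hypothetical, and a stand-in file built from /dev/zero is used in place of the real a.txt:

```shell
#!/bin/sh
# is_all_zero FILE: succeed if FILE contains nothing but NUL bytes.
# (Hypothetical helper name; not part of HP-UX or Oracle.)
is_all_zero() {
    # Delete every NUL byte and count what is left.
    remaining=$(tr -d '\000' < "$1" | wc -c | tr -d ' ')
    [ "$remaining" -eq 0 ]
}

# Stand-in for a.txt: 100 blocks of 1024 zero bytes.
sample=$(mktemp)
dd if=/dev/zero of="$sample" bs=1024 count=100 2>/dev/null

# od shows the bytes that cat cannot display.
od -c "$sample" | head -3

if is_all_zero "$sample"; then
    echo "only NUL bytes"
else
    echo "contains data"
fi
rm -f "$sample"
```

Running od -c a.txt on the real file would show directly whether it holds zeros or actual data.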

Can you give the exact error message you get when creating the volume groups for Oracle RAC?


regards,
ivan
BAUKnight
Frequent Advisor

Re: Oracle RAC installation problem

No error message appears. The Oracle installer just hangs and nothing is shown, neither in the installer nor on the shell command line.

Since dd reads raw data from the device, why can I not view the contents of a.txt or b.txt? And since the comparison shows no difference, a.txt and b.txt are the same, which would mean we are referencing the same SAN LUN from both nodes using the same device file. Am I right about this?
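
One caution on that inference: identical contents do not by themselves prove that both nodes read the same LUN, because two different blank LUNs would also return identical zero-filled data. Here is a minimal sketch of the point, using two independently created stand-in files (the names lun_a and lun_b and the helper same_bytes are hypothetical):

```shell
#!/bin/sh
# same_bytes A B: succeed if the two files are byte-for-byte identical.
same_bytes() {
    cmp -s "$1" "$2"
}

dir=$(mktemp -d)
# Two unrelated files that both happen to hold 100 KB of zeros.
dd if=/dev/zero of="$dir/lun_a" bs=1024 count=100 2>/dev/null
dd if=/dev/zero of="$dir/lun_b" bs=1024 count=100 2>/dev/null

if same_bytes "$dir/lun_a" "$dir/lun_b"; then
    echo "identical contents, yet two separate files"
fi
rm -rf "$dir"
```

A stronger check would compare an identifier that is unique per LUN, such as the world wide name reported by the array's management software.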

I will check with our DBA whether Oracle has replied or recommended anything.

Any other hints?
whiteknight
Honored Contributor

Re: Oracle RAC installation problem

Hi

From the dd output, it looks like there is no hardware issue, as there are no I/O errors. I would recommend checking the Oracle log to see why it failed.


WK

Problem never ends, you must know how to fix it
BAUKnight
Frequent Advisor

Re: Oracle RAC installation problem

I will ask the DBA to redirect the output of the command to a file, so we can see whether anything happens.
I will check and post an update.
BAUKnight
Frequent Advisor

Re: Oracle RAC installation problem

Dear all,
No log was found for the installation; the process just hangs and cannot discover the disks. Here is what Oracle asked us to do:


If the db_database is not visible on node-2/asm then:

1] Start the truss on RBAL process on asm node-2

$ truss -ef -o /tmp/rbal.out -p

$ script /tmp/a.txt
$ strace kfod asm_diskstring='/dev/rdsk/*' disks=all
$ exit

Upload the /tmp/a.txt file
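
The -p option above expects the process ID of the RBAL process, which the instructions leave out. Below is a minimal sketch of looking it up. It assumes the process appears in ps -ef with a command name beginning asm_rbal_, the usual naming for ASM background processes. The helper name rbal_pids is hypothetical, and the sample lines and PIDs are made up for illustration:

```shell
#!/bin/sh
# rbal_pids: read `ps -ef` output on stdin and print the PID (2nd column)
# of every process whose command name starts with asm_rbal_.
rbal_pids() {
    awk '$NF ~ /^asm_rbal_/ { print $2 }'
}

# On a live system (assumption: run on the ASM node):
#   ps -ef | rbal_pids

# Illustration with fabricated ps lines:
printf '%s\n' \
  '  oracle  4321     1  0 08:12:01 ?         0:00 asm_rbal_+ASM2' \
  '  oracle  4400     1  0 08:12:02 ?         0:00 asm_pmon_+ASM2' \
  | rbal_pids
```

The printed PID is the value that would be passed to the tracing tool.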


The problem I face is that there is no truss command. It appears to be an OS command, but I could not run it, and there is no manual page for it either.

Any hints, please?
Duncan Edmonstone
Honored Contributor

Re: Oracle RAC installation problem

Truss, otherwise known as tusc, is available from here:

http://hpux.connect.org.uk/hppd/hpux/Sysadmin/tusc-7.9/

HTH

Duncan

BAUKnight
Frequent Advisor

Re: Oracle RAC installation problem

Duncan,
I downloaded the tusc 7.9 file, but when I try to gunzip it, it fails with an error saying the file is not in gzip format:

root@dbclust1:/tmp>gunzip tusc-7.9-ia64-11.23.depot.gz

gunzip: tusc-7.9-ia64-11.23.depot.gz: not in gzip format

What might be the problem? I downloaded the file three times and switched between the FTP and HTTP sources, but I always get the same error.

Please advise.
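
One way to narrow this down is to look at the first two bytes of the download: a gzip file starts with the bytes 1f 8b. If they are something else, the file may already have been decompressed during the download, or may be an HTML page saved under the .gz name. Both are guesses, not confirmed in this thread. Below is a minimal sketch with a hypothetical helper is_gzip, assuming the local od supports the POSIX -A, -t and -N options:

```shell
#!/bin/sh
# is_gzip FILE: succeed if FILE begins with the gzip magic bytes 1f 8b.
is_gzip() {
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Usage against the downloaded depot (path taken from the post above):
#   is_gzip /tmp/tusc-7.9-ia64-11.23.depot.gz && echo gzip || echo "not gzip"
```

If the check reports "not gzip", looking at the start of the file with od -c or the file command may show what was actually downloaded.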
kevin_m
Valued Contributor

Re: Oracle RAC installation problem

Regarding the initial post, I think you're on the right track with the alternate device file (/dev/rdsk/db_database). I found that creating alternates is a good option for a couple of reasons. First of all, Oracle needs specific permissions and ownership on the device files. If you modify the defaults on files like /dev/rdsk/c2t1d0, the changes will probably be reset after a reboot. Secondly, you can assign names that match the purpose of each raw device. Here's an example of the configuration I used:

/dev/dbclust/crs/ocr      crw-r-----  root:oinstall
/dev/dbclust/crs/voting   crw-r--r--  oracle:oinstall
/dev/dbclust/asm/arch01   crw-rw----  oracle:dba
/dev/dbclust/asm/db01     crw-rw----  oracle:dba
/dev/dbclust/asm/db02     crw-rw----  oracle:dba
/dev/dbclust/asm/db03     crw-rw----  oracle:dba
/dev/dbclust/asm/db04     crw-rw----  oracle:dba
/dev/dbclust/asm/flash01  crw-rw----  oracle:dba
/dev/dbclust/asm/spfile   crw-rw----  oracle:dba
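
Here is a minimal sketch of how one entry of such a listing could be turned into chown and chmod commands. The helper names are hypothetical, the script only prints the commands rather than running them, and it assumes the device file already exists:

```shell
#!/bin/sh
# mode_to_octal MODE: convert an ls-style mode such as crw-rw---- to 660.
# Only r, w and x bits are handled (no setuid/setgid/sticky).
mode_to_octal() {
    perms=${1#?}          # drop the leading file-type character
    result=""
    for start in 1 4 7; do
        triad=$(printf '%s' "$perms" | cut -c"$start"-$((start + 2)))
        digit=0
        case $triad in r??) digit=$((digit + 4)) ;; esac
        case $triad in ?w?) digit=$((digit + 2)) ;; esac
        case $triad in ??x) digit=$((digit + 1)) ;; esac
        result="$result$digit"
    done
    printf '%s\n' "$result"
}

# print_perm_cmds PATH MODE OWNER:GROUP: print (not run) the commands.
print_perm_cmds() {
    printf 'chown %s %s\n' "$3" "$1"
    printf 'chmod %s %s\n' "$(mode_to_octal "$2")" "$1"
}

print_perm_cmds /dev/dbclust/asm/db01 crw-rw---- oracle:dba
```

Printing the commands first leaves room to review them before anything on the device files is changed.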

Regarding "the process only hangs and can not discover the disks", we ran into a similar issue during the discovery process. The fix was to configure the parameter 'asm_diskstring' to scan disks using the alternate device files. I don't know exactly where this parameter is set but it relates to spfile somehow. Based on my example config the syntax is:

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      /dev/dbclust/asm/*
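
As a rough illustration of what such a discovery string selects, the sketch below expands the same kind of wildcard pattern with the shell. It assumes ASM's wildcard matching behaves like shell globbing for a simple trailing *, and it uses a throwaway directory instead of real device files. The helper name list_matches is hypothetical:

```shell
#!/bin/sh
# list_matches PATTERN: print each existing path the glob PATTERN expands to.
list_matches() {
    # Intentionally unquoted so the shell expands the wildcard.
    for path in $1; do
        [ -e "$path" ] && printf '%s\n' "$path"
    done
    return 0
}

root=$(mktemp -d)
mkdir -p "$root/dbclust/asm" "$root/rdsk"
touch "$root/dbclust/asm/db01" "$root/dbclust/asm/db02" "$root/rdsk/c2t1d0"

# Only the alternate device files are found by the narrower pattern.
list_matches "$root/dbclust/asm/*"
rm -rf "$root"
```

The narrower pattern keeps discovery away from the default /dev/rdsk entries, whose ownership may not suit Oracle.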


Let me know if you would like more details or fix ideas. There are some other steps we took to resolve install problems and I may have overlooked something.

Kevin
BAUKnight
Frequent Advisor

Re: Oracle RAC installation problem

Kevin,
Thanks for your hint. Unfortunately I am the Unix admin and not the DBA; our DBA is the one doing the install, so I will let him try what you suggested and see what happens.

As I understand it, you are talking about ASM. We installed ASM on both nodes, and we are now stuck in the middle of the create diskgroup command.

I don't know if you have any idea how we can fix this.

I will check with our DBA and see what happens.
kevin_m
Valued Contributor

Re: Oracle RAC installation problem

I too am a Unix admin, so I'm mostly relaying what my DBAs asked me to do. If you haven't set specific permissions on custom device files, that might be something to try. It seems plausible that changing to Oracle ownership may correct disk access problems, whether it's for ASM or DB usage.
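
The dd tests earlier in the thread were run as root, so they do not show whether the Oracle software owner can open the device. Below is a minimal sketch of a read check, meant to be run as that user. The helper name can_read is hypothetical, and the commented su line assumes the owner account is called oracle and the device is /dev/rdsk/db_database as in the ioscan output. It tests reading only, whereas ASM also needs write access:

```shell
#!/bin/sh
# can_read PATH: succeed if the current user can read one 1 KB block from PATH.
can_read() {
    dd if="$1" of=/dev/null bs=1024 count=1 2>/dev/null
}

# Example invocation as the oracle user (assumed account name):
#   su - oracle -c 'dd if=/dev/rdsk/db_database of=/dev/null bs=1024 count=1'

# Harmless demonstration on a file every Unix system has.
if can_read /etc/passwd; then
    echo "readable: /etc/passwd"
else
    echo "not readable: /etc/passwd"
fi
```

If the read fails for the oracle user but works for root, ownership or permissions on the device file are the likely cause.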