Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Hello.
I am looking for a solution to an I/O performance problem that appeared after upgrading from 11.23 to 11.31.

System rx6600 with EVA (HSV210) storage.
Now running HP-UX v. 11.31 Sep09 Dist.
Previously running HP-UX v. 11.23 Sep06 Dist.

The nightly RMAN backup job of approx. 2 Tbyte to a separate disk took 5 hours before the upgrade.
Now the same job takes 8 hours.
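
For scale, the two runtimes can be turned into rough average throughput figures. This is only a minimal sketch: it assumes exactly 2 TiB (the post says approx. 2 Tbyte) and binary units, and the function name is made up for illustration.

```python
def avg_throughput_mib_s(size_tib: float, hours: float) -> float:
    """Average throughput in MiB/s for size_tib TiB moved in the given number of hours."""
    mib = size_tib * 1024 * 1024
    return mib / (hours * 3600)

# Assumption: exactly 2 TiB per night; the post only says "approx. 2 Tbyte".
before = avg_throughput_mib_s(2, 5)   # 11.23: 5 hours
after = avg_throughput_mib_s(2, 8)    # 11.31: 8 hours

print(f"before: {before:.1f} MiB/s, after: {after:.1f} MiB/s, ratio: {after / before:.3f}")
```

Under those assumptions the job averaged roughly 116 MiB/s before the upgrade and roughly 73 MiB/s after, i.e. about 62% of the old rate.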

I have tested /exp with dd on another machine (same setup, just moving vg04 over) and get the same dd result.

Kernel parameters are set according to Oracle's recommendations.

I am looking for input from other sysadmins who have seen this sort of problem.
Right now I am looking into vxtunefs.

Regards, Martin Rønde Andersen, Miracle A/S, miracleas.dk
34 REPLIES
butti
Frequent Advisor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Hi,

Did you change the mount options of the volumes (with or without cache)?

Did you change any Oracle parameters (SGA, etc.)?

Could you monitor the disk queue during the backup (with glance)?

Bye,
butti
TTr
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

How much memory does the server have? 11.31 might be using more memory, and the server may be running out of physical memory.

Also, was Oracle upgraded as well, or did it stay the same?
TwoProc
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

My question is:

What else, besides the OS, has changed? SGA? Kernel params?
We are the people our parents warned us about --Jimmy Buffett

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

The system is the same:
rx6600, memory 32 GB, CPU 2 x dual-core Itanium

The test config (a similar machine), has been running clones of the database for 16 months.

The production database setup is the same:
- no changes in SGA, oracle
- I have not used any special mount options (what do you mean?).
I tested before the upgrade, moving the mount point between prod (11.23) and test (11.31), and all application functionality was approved.

- After going into production, we found that nfile=30000 was throttling the system, and we changed it to 0.

- At the moment I am investigating disk queue figures from the glance collections I have (1-minute interval) from before and after the upgrade.

- I tried, with no success, changing vxtunefs parameters for the /exp area (now changed back after an overnight test):
read_ahead=2 (now back to 1)
max_buf_data_size=64k (now back to 8k)
scsimgr param for the disk182 (/exp):
leg_mpath_enable=false (now back to true)

- I have attached the current varux01 (prod) kctune output, in which you will also see the changes I plan to apply at the next reboot.

Thanks for the comments, and sorry for the late reply; I was tied up at another customer.
Regards, Martin Roende Andersen

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

For reference, these are the jobs that now take longer:
RMAN backup BEFORE:
runs from 18:40 - 23:59

RMAN backup AFTER:
runs from 18:40 - 01:30
After a change in schedule:
runs from 17:00 - 00:00 (approx.)

----
After the RMAN backup to /exp, the /exp mount point is backed up (DP backup to fiber-connected tape).
BEFORE:
Runtime 5:09:21
Backup speed 106.181,44 KB/s
Endtime 07:16:37

AFTER:
Runtime 8:05:55
Backup speed 66.29 MB/s
Endtime 10:05:55
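
The two speed figures are posted in different units, so a quick conversion helps when comparing them. The sketch below is only illustrative. It assumes that "106.181,44 KB/s" uses a European decimal comma (i.e. 106181.44 KB/s) and that DP reports 1 MB = 1024 KB. The helper name is made up.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an H:MM:SS runtime string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

# Figures as posted. Assumptions: "106.181,44 KB/s" means 106181.44 KB/s,
# and 1 MB = 1024 KB.
before_mb_s = 106181.44 / 1024
after_mb_s = 66.29

before_s = hms_to_seconds("5:09:21")
after_s = hms_to_seconds("8:05:55")

speed_ratio = after_mb_s / before_mb_s
runtime_ratio = before_s / after_s

# Volume moved = speed * runtime, expressed in GB (1 GB = 1024 MB).
before_gb = before_mb_s * before_s / 1024
after_gb = after_mb_s * after_s / 1024

print(f"speed   before {before_mb_s:.2f} MB/s, after {after_mb_s:.2f} MB/s, ratio {speed_ratio:.2f}")
print(f"runtime before {before_s} s, after {after_s} s, ratio {runtime_ratio:.2f}")
print(f"volume  before {before_gb:.0f} GB, after {after_gb:.0f} GB")
```

Under those assumptions both ratios come out near 0.64, and both runs move roughly 1,880 GB, so the same amount of data is being moved at about two thirds of the old rate.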

That's how the customer sees the problem.

Regards Martin Roende Andersen



TTr
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

> moving the mount point between the prod (11.23) and test (11.31)...After going into production

Is the 11.31 production rx6600 a different server than the rx6600 11.23 production?

Also, unless you have a typo, the DP times are the same (actually the AFTER is shorter), but the speed is different. How is that happening?

The /exp is the common factor for both the RMAN and the DP backups but is that really the issue? Have you done any disk tests on /exp?

Also whether the hardware changed or not, I would look into the fiber interfaces (and the fiber in general). This is common to both the /exp RMAN writes and DP reads as well as the tape writes.

Have you noticed anything in the glance metrics? It might be better to observe the glance during the RMAN and DP runs so that you will see any alerts.
Duncan Edmonstone
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

>> The test config (a similar machine),

- How similar? Completely identical (same model, same number and type of CPUs, same memory, same FC cards, etc.)?

These jobs look like big sequential IO jobs - in my experience nothing much you might do with vxtunefs will change that - it would be helpful to see the mount options on the 2 servers:

mount -p

HTH

Duncan

Duncan Edmonstone
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

The other thing to look at is the EVA...

It's possible that ownership of all the LUNs has moved to just one controller (which would tally with about half the sequential IO performance you are seeing).

On 11.11/11.23 the EVA Vdisk controller preference should be left at "No Preference" and the LVM Primary/Alternate paths should be set carefully to be balanced evenly across the two controllers and ports (because the UX IO will always be down one path only). (I'm assuming Securepath was not used.) This way UX determines the balance of IO and handles failover/failback, and the EVA should match Vdisk ownership to the prevailing IO pattern.

In contrast, with 11iv3, the EVA Vdisk preferences should be set deliberately to a 50/50 mix of "Controller A failover/failback" and "Controller B failover/failback" and multipathing should be set to RR with ALUA (the defaults). In this case the UX server respects the EVA preferences/balance.

So you need to look at the EVA CommandView config for these LUNs and then compare it to what you see from evainfo and "scsimgr lun_map" on the 11.31 host, to be sure that your IO is using both EVA controllers (any given LUN will just use 1 controller, but you should have an even number of LUNs spread across both controllers).
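
Once the owning controller of each LUN has been read off CommandView or evainfo, the "even spread" check is just a count per controller. A minimal sketch, assuming the ownership table is typed in by hand. The function names and the sample entries are hypothetical, and only disk182 (the /exp LUN) is a name from this thread.

```python
from collections import Counter

def controller_balance(lun_owner: dict) -> Counter:
    """Count how many LUNs each EVA controller currently owns."""
    return Counter(lun_owner.values())

def is_balanced(lun_owner: dict, controllers=("A", "B")) -> bool:
    """True if the per-controller LUN counts differ by at most one."""
    counts = controller_balance(lun_owner)
    per_controller = [counts.get(c, 0) for c in controllers]
    return max(per_controller) - min(per_controller) <= 1

# Hypothetical ownership table entered by hand. disk182 is the /exp LUN from
# this thread; its owner and all other entries are made up for illustration.
lun_owner = {"disk182": "A", "disk183": "A", "disk184": "A", "disk185": "B"}

print(dict(controller_balance(lun_owner)))
print("balanced" if is_balanced(lun_owner) else "skewed toward one controller")
```

This only compares LUN counts. It says nothing about how much IO each LUN carries, so a count that looks even can still leave one controller doing most of the work.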

HTH

Duncan


Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Duncan,
The machines are exactly identical.
Your point about the EVA has not been taken into consideration yet; I will talk to the storage manager.

Mount options for /exp on the varux01 machine:
/dev/vg04/exp1vol /exp vxfs ioerror=mwdisable,largefiles,delaylog,dev=40040001 0 0

From fstab:
varux01 # grep /exp /etc/fstab
/dev/vg04/exp1vol /exp vxfs rw,suid,largefiles,delaylog,datainlog 0 2


See the attached file for vginfo, DSF and scsimgr info.
Duncan Edmonstone
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

So /exp is presumably the FS you are writing to... what about the FS you are reading _from_? How do you know that is still performing?

From what I can see here:

- First point: you are still using legacy DSFs in /dev/[r]dsk/. If you have migrated to 11.31, you should really start using the new agile DSFs in /dev/[r]disk/. I don't think this will have any impact on performance, but it is good practice.

- It's also not good practice to have VGs with only one LUN, as this absolutely guarantees you will only ever hit one controller on the EVA. You should always have at least 2 LUNs in a VG and then stripe across those so you get the benefit of both controllers. But again, if that's how things were set up before, it's not the source of your problem.

So, as I said at the start of my post, check out the config of the filesystems you are reading data from...

HTH

Duncan

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

The database we are reading from:
It has the same FS setup, but with 3 volume groups and several LUNs.

Do you want to see the specifics?

Yes, I know about the DSF names...

But as you say, no impact, and therefore no focus right now.

regards Martin ..
TwoProc
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Is one of the servers set to use ILM and maybe the other a percentage of memory as SLM? Testing has shown that it does make a difference. A small SGA that fits into the memory of a single cell board in your server would be fine configured as SLM, but only if you're OK with using just the CPUs in a single cell. Meaning: most of the time you'll want to use ILM at 100% for an Oracle database server of any significant size, and by that I mean one that is primarily set up to use most of the server that you've stood up.

So, my point is to make sure you've reviewed this setting on both servers.

Look for the paper from HP titled
"Running Oracle Database 10g or 11g on an HP-UX ccNUMA-based server", subtitled "Updated for Oracle 11gR2 and HPUX 11i v3".

The ID info at the bottom of the document:
4AA2-4194ENW, Created January 2009; Updated September 2010, Rev. #1




We are the people our parents warned us about --Jimmy Buffett

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Thanks for your answers so far.
Suggestions are still welcome.
STATUS:
We still have the same problem.

The latest action was to follow the advice on the EVA side for the /exp mount point, without any improvement in dump / backup time.

I will now open a formal HP-UX case at the call logging facility, and post any results here.

The customer's backups are still running into production hours in the morning (up until 11:00 AM).

Regards Martin Rønde Andersen
chris huys_4
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Hi Martin,

Did you set up both the "hp-ux 11.23" rx6600 and the new "hp-ux 11.31" rx6600?

Anyway, show us a sar -d 1 100 taken before midnight during the backup on both systems, i.e. for both the "RMAN backup BEFORE:" and the "RMAN backup AFTER:" cases, and taken at the same point in the run, e.g. when the backup has already been running for 30 minutes.

Greetz,
Chris

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Update ..

I have tested these mount options for /exp:
mincache=direct,convosync=direct
with no change in the behaviour described above.

I have opened an HP-UX software call with HP.

Regards Martin Rønde Andersen

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Support's suggestion is to add these patches:

PHKL_40627 11.31 Buffer cache cumulative patch
PHKL_39594 11.31 vfs_bio QoS patch

PHKL_39594 was added when we upgraded to 11.31 Sep09.

PHKL_40627 requires a reboot, and I have requested a service window from the customer. Presumably I will get one on 15-Oct-2010.

I'll be back with results then.

Regards Martin Rønde Andersen

PS: In the meantime I will change all volume groups to persistent DSFs (/dev/disk/disk###), with no pvlinks (alternate links) in the VG definitions.

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

To Chris:
I can dig out 1-minute interval glance outputs from BEFORE (11.23) and AFTER (11.31). Will that satisfy your request?

rgds Martin R. A.
chris huys_4
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Hi Martin,

> I can dig out 1-minute interval glance
> outputs from BEFORE (11.23) and AFTER
> (11.31). Will that satisfy your request?
No. ;)

Greetz,
Chris

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Chris,
I will run a "sar -d 1 100" for you on 11.31 tonight.

Martin

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Which fiber channel HBAs are in the new rx6600?

We had similar issues with the AH400A cards using the fcd driver that were fixed in the 10.03.01 version of that driver.
chris huys_4
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Hi Martin,

Small correction:
on HP-UX 11.31, instead of sar -d 1 100, execute

# sar -LdR 1 100

on HP-UX 11.23 it stays sar -d 1 100.

Greetz,
Chris

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Chris,
The numbers collected here were taken while the nightly backup was still running.

That means lots of reads from /exp (disk182).

regards Martin Rønde Andersen
chris huys_4
Honored Contributor

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

Hi Martin,

Can the file be attached as a .txt file? I am having problems opening the .Z file in IE.

Greetz,
Chris

Re: Degraded backup and Oracle dump performance after 11.23 to 11.31 upgrade

sar_out.txt in attachment ..

Regards Martin R. A.