HP-UX Performance Issue

Vinod_22
Advisor

Dear all,
I would like to describe a very strange performance problem with two HP Itanium servers.
The server details are:
Rx2660 (Good Configuration but very bad performance)
CPU: Two dual core (1.42 GHz, 12MB 9120N) processor
Memory: 16GB
Disk: 146GB 10k SAS
OS version HP-UX 11.31 release Sep2009
N/W: 1gbps
Script run time: 68 min


Rx4640 (weaker configuration than the 2660, but much faster performance)
CPU: HP mx2 dual processor
Memory 8GB
Disk: 73GB
OS version HP-UX 11.31 release Sep2009
N/W: 1gbps
Script run time: 20 sec

When we run the Cognos script, it completes in 15 sec on the rx4640, whereas it takes more than an hour (68 min) on the Rx2660.
When the Cognos script runs, it calls a DB running on another server, selects the fields, and performs inserts into another DB. Both servers (2660 and 4640) connect to the same DB server, and the DB client is the same on both.

Comparing CPU, memory, disk, and network, the slower system (2660) has more horsepower than the rx4640, but when it comes to performance it acts like a 10-year-old box.

We have tried to bring both systems to the same OS version and the same kernel parameters, but we cannot understand why the performance differs by such a huge margin.

We would really appreciate a fast response on this.

24 REPLIES
Modris Bremze
Esteemed Contributor

Re: HP-UX Performance Issue

Maybe it's not the servers themselves, but something in between them - how are both servers connected to that "other DB server" they both rely on to run the script? Have you looked at both server CPU utilization and system load while your script is running?
S. Ney
Trusted Contributor

Re: HP-UX Performance Issue

Have you looked at glance on the 4640?
sar -L
iostat -L -t
Do you have tusc installed on the 4640?
study unix
Regular Advisor

Re: HP-UX Performance Issue

Try using top or glance, though glance is a paid product.
Vinod_22
Advisor

Re: HP-UX Performance Issue

Please find below o/p for the both Servers
rx4640 (Great Perf)
rx2660 (Bad Perf)
#uname -a >>/tmp/perfor/hp1.txt
#bdf >>/tmp/perfor/hp1.txt
#dmesg >>/tmp/perfor/hp1.txt
#uptime >>/tmp/perfor/hp1.txt
#model >>/tmp/perfor/hp1.txt
#sar -u 2 10 >>/tmp/perfor/hp1.txt
#sar -d 2 10 >>/tmp/perfor/hp1.txt
#sar -b 2 10 >>/tmp/perfor/hp1.txt
#sar -v 2 10 >>/tmp/perfor/hp1.txt
#vmstat 2 10 >>/tmp/perfor/hp1.txt
#iostat 2 10 >>/tmp/perfor/hp1.txt
#swapinfo -tam >>/tmp/perfor/hp1.txt
#top -d2 -s10 -f /tmp/toperf.txt
#cat /tmp/toperf.txt
#ipcs -mob >>/tmp/perfor/hp1.txt
#exit
Vinod_22
Advisor

Re: HP-UX Performance Issue

top for rx4640
Vinod_22
Advisor

Re: HP-UX Performance Issue

top for rx2660
Vinod_22
Advisor

Re: HP-UX Performance Issue

rx2660 complete logs

Both servers are connected to the DB host through an L2 network switch.
Modris Bremze
Esteemed Contributor

Re: HP-UX Performance Issue

The lines

Synchronous Page I/O error occurred while paging to/from NFS server qiublrboanfs001
file system is /home

Look suspicious in rx2660 logs. How much are /home and NFS involved?
rick jones
Honored Contributor

Re: HP-UX Performance Issue

The stats you've collected seem to cover everything except the network. If we can assume that when these Cognos scripts are running they are the only things happening on the systems, you might:

netstat -s > before.netstat
lanadmin -g mibstats <ppa> > before.lanadmin
- run the cognos script -
netstat -s > after.netstat
lanadmin -g mibstats <ppa> > after.lanadmin

where you replace <ppa> with the ppa of the lan in use - from your description I would guess '0' from lan0 but that is only a guess.

You can run the before and after files through "beforeafter" from ftp://ftp.cup.hp.com/dist/networking/tools

beforeafter before after > delta

and that will subtract all the numbers it finds in before from their counterparts in after - giving you the change over the interval.
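
For readers without access to that FTP site, the subtraction described above can be sketched in a few lines of Python. This is not the actual "beforeafter" tool, only a minimal illustration of the idea; it assumes both snapshots have the same lines in the same order, and the function names `delta_line` and `delta_text` are made up for this example.

```python
import re

NUMBER = re.compile(r"\d+")


def delta_line(before: str, after: str) -> str:
    """Replace each integer in `after` with (after - before), matched by position."""
    before_values = iter(int(n) for n in NUMBER.findall(before))

    def subtract(match):
        previous = next(before_values, None)
        if previous is None:
            # No counterpart in the "before" line: keep the number as-is.
            return match.group(0)
        return str(int(match.group(0)) - previous)

    return NUMBER.sub(subtract, after)


def delta_text(before: str, after: str) -> str:
    """Apply delta_line to two snapshots line by line (same line order assumed)."""
    before_lines = before.splitlines()
    result = []
    for index, line in enumerate(after.splitlines()):
        base = before_lines[index] if index < len(before_lines) else ""
        result.append(delta_line(base, line))
    return "\n".join(result)
```

Calling `delta_text(open("before.netstat").read(), open("after.netstat").read())` on a workstation with Python 3 would then give the change in each counter over the interval.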
there is no rest for the wicked yet the virtuous have no pillows
Vinod_22
Advisor

Re: HP-UX Performance Issue

The path information for the installation is as follows:
  
1. Cognos AW Runtime (scripts are run from here): /apps/sterling/sbi/cognos/ap/bin (NFS mount is /apps/sterling, NFS host: qiublrboanfs001)
2. Oracle Client: /home/oracle11g_client/oracle_home (NFS mount is /home, NFS host: qiublrboanfs001)

Both file systems are exported by a single NFS server, qiublrboanfs001, and the environment is the same for the rx2660 and the rx4640.


In the trace below, nearly 7 minutes elapse between the two lines marked in red (the 15:56:39 and 16:03:38 entries), where it should take less than a second.

Trace During Application
---------------------------------------------------------------------------------------------------

jobstream -- start run on qiublrboaapp011.blrqa-nis (25-Jun-2010 15:56:34)
[VARIABLE   - 15:56:36] initialLoadStartDate = ''
[VARIABLE   - 15:56:36] initialLoadEndDate = ''
[VARIABLE   - 15:56:36] DS_LOG_DIR = '/apps/sterling/sbi/cognos/ap/datamanager/log/20100625155634'
[VARIABLE   - 15:56:36] DS_UDA_FETCH_ROWS = '50'
[VARIABLE   - 15:56:36] DS_UDA_BULKWRITE_ROWS = '50'
[VARIABLE   - 15:56:36] DS_LOB_MAXMEM_SIZE = '65536'
[VARIABLE   - 15:56:36] DS_LOB_BUF_SIZE = '8000'
[VARIABLE   - 15:56:36] DS_DBMS_TRIM = 'SPACE'
[VARIABLE   - 15:56:36] DS_RUN_TIMESTAMP = 2010-06-25 15:56:35
[VARIABLE   - 15:56:36] DS_JOBSTREAM_NAME = 'Configuration'
[VARIABLE   - 15:56:36] DS_JOBSTREAM_FNAME = 'Configuration'
[VARIABLE   - 15:56:36] DS_AUDIT_ID = 1
[VARIABLE   - 15:56:36] DS_JOB_AUDIT_ID = 1
[VARIABLE   - 15:56:36] DS_RUN_ID = 1
[VARIABLE   - 15:56:36] DS_LOG_NAME = '/apps/sterling/sbi/cognos/ap/datamanager/log/20100625155634/Job_Configuration_0001.log'
[VARIABLE   - 15:56:36] DS_MAX_RECURSION = '100'
[VARIABLE   - 15:56:36] . TRACE_VALUES = 'PROGRESS,DETAIL,INTERNAL,SQL,EXECUTEDSQL,USER,VARIABLE'
[VARIABLE   - 15:56:36] . AUDIT_VALUES = 'TIMING,ALERT,USER'
[PROGRESS   - 15:56:36] JobStream 'Configuration'; starting
[INTERNAL   - 15:56:36] Start Node 1 'Start'; Idle -> Succeeded
[INTERNAL   - 15:56:36] Build Node 2 'YFS_CAT_DOMAIN_LOCALE'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 3 'YFS_REGION_SCHEMA'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 4 'YFS_REGION'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 5 'YFS_STATUS_MODIFICATION_TYPE'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 6 'YFS_COMMON_CODE'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 7 'YFS_REGION_USAGE'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 8 'YFS_DATA_SECURITY_GROUP'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 9 'YFS_REGION_BEST_MATCH'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 10 'YFS_ORGANIZATION'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 11 'YFS_CUSTOMER_GRADE'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 12 'YFS_CHARGE_CATEGORY'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 13 'YFS_PAYMENT_TYPE'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 14 'YFS_HOLD_TYPE'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 15 'YFS_DOCUMENT_PARAMS'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 16 'YFS_LOCALIZED_STRINGS'; Pending -> Executing
[INTERNAL   - 15:56:36] Build Node 17 'YFS_CATEGORY_DOMAIN'; Pending -> Executing
[INTERNAL   - 15:56:37] Build Node 15 'YFS_DOCUMENT_PARAMS'; initializing
[PROGRESS   - 15:56:38] Build Node 15 'YFS_DOCUMENT_PARAMS'; executing (pid 3689)
[DETAIL     - 15:56:38] Build Node 15 'YFS_DOCUMENT_PARAMS'; component 'YFS_DOCUMENT_PARAMS', RunId 1, AuditId 2
[INTERNAL   - 15:56:39] Build Node 15 'YFS_DOCUMENT_PARAMS'; Executing -> Succeeded
[PROGRESS   - 15:56:39] Build Node 15 'YFS_DOCUMENT_PARAMS'; succeeded
[VARIABLE   - 15:56:39] RESULT = TRUE
[VARIABLE   - 15:56:39] DM_COMPONENT_AUDIT_ID = 2
[INTERNAL   - 16:03:38] Build Node 4 'YFS_REGION'; initializing
[INTERNAL   - 16:03:38] Build Node 13 'YFS_PAYMENT_TYPE'; initializing
[PROGRESS   - 16:03:38] Build Node 4 'YFS_REGION'; executing (pid 3678)
[PROGRESS   - 16:03:38] Build Node 13 'YFS_PAYMENT_TYPE'; executing (pid 3687)
[DETAIL     - 16:03:39] Build Node 4 'YFS_REGION'; component 'YFS_REGION', RunId 1, AuditId 3
[DETAIL     - 16:03:39] Build Node 13 'YFS_PAYMENT_TYPE'; component 'YFS_PAYMENT_TYPE', RunId 1, AuditId 4
[INTERNAL   - 16:03:39] Build Node 4 'YFS_REGION'; Executing -> Succeeded
[PROGRESS   - 16:03:39] Build Node 4 'YFS_REGION'; succeeded
[VARIABLE   - 16:03:39] RESULT = TRUE
[VARIABLE   - 16:03:39] DM_COMPONENT_AUDIT_ID = 3
[INTERNAL   - 16:03:39] Build Node 13 'YFS_PAYMENT_TYPE'; Executing -> Succeeded
[PROGRESS   - 16:03:39] Build Node 13 'YFS_PAYMENT_TYPE'; succeeded
[VARIABLE   - 16:03:39] RESULT = TRUE
[VARIABLE   - 16:03:39] DM_COMPONENT_AUDIT_ID = 4
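
A stall like this can also be located mechanically instead of by eye. The following is a minimal Python sketch that assumes every trace line of interest starts with the `[CATEGORY   - HH:MM:SS]` prefix shown above; the function name `largest_gaps` is made up for this example, and lines without such a prefix are skipped.

```python
import re
from datetime import datetime, timedelta

# Matches the "[CATEGORY   - HH:MM:SS]" prefix seen in the trace above.
STAMP = re.compile(r"^\[\w+\s+-\s+(\d{2}:\d{2}:\d{2})\]")


def largest_gaps(lines, top=3):
    """Return up to `top` (seconds, previous_line, next_line) tuples, largest gap first."""
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # e.g. the "jobstream -- start run" banner
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        if prev_time is not None:
            delta = stamp - prev_time
            if delta < timedelta(0):
                # The trace has no date, so a negative step means midnight was crossed.
                delta += timedelta(days=1)
            gaps.append((int(delta.total_seconds()), prev_line, line.rstrip()))
        prev_time, prev_line = stamp, line.rstrip()
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]
```

Running it over the log named in DS_LOG_NAME from both the fast and the slow server would show whether the same build nodes stall on each.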
Vinod_22
Advisor

Re: HP-UX Performance Issue

Hi Rick, here are all the netstat and lanadmin logs.

before= before running cognos script
after1= while running cognos script sample 1
after2= while running cognos script sample 2
after3= while running cognos script sample 3
Vinod_22
Advisor

Re: HP-UX Performance Issue

Hi Rick and Bremze,


Please see some more lanshow output from the slow-performing server, i.e. the rx2660.
Vinod_22
Advisor

Re: HP-UX Performance Issue

Hello Bremze,

I have explained the NFS setup in the attachment, along with some details about the Cognos script.
rick jones
Honored Contributor

Re: HP-UX Performance Issue

Well, a cursory check of the network stats looks clean.
chris huys_4
Honored Contributor

Re: HP-UX Performance Issue

Hi,

Long shot, but the problem system's vmstat output shows 1 blocked process.

procs memory page faults cpu
r b w
1 1 0
1 1 0
1 1 0
1 1 0
1 1 0

If this blocked process appears when the Cognos job starts and disappears after the job is done, then most probably it has something to do with why the job is taking so long.

So then find the process and see what it is blocked on.
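
To check whether the blocked count tracks the job, the "b" column can be pulled out of vmstat output captured before, during, and after a run. This is a minimal Python sketch, assuming the column header row begins with "r b w" as in the excerpt above and that sample rows contain only integers; the function name `blocked_counts` is made up for this example.

```python
def blocked_counts(vmstat_text):
    """Return the 'b' (blocked) value from each sample row of captured vmstat output."""
    column = None
    counts = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r" and "b" in fields:
            # Header row naming the columns; remember where "b" sits.
            column = fields.index("b")
            continue
        is_sample = all(field.isdigit() for field in fields)
        if column is not None and is_sample and len(fields) > column:
            counts.append(int(fields[column]))
    return counts
```

If the list is all zeros while the system is idle and turns to ones only while the Cognos job is running, that supports the suspicion above.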

glance will give the wait reason for the process. The HP support crashinfo utility can also give you the wait reason and process stack of the process, with crashinfo -c /stand/vmunix /dev/kmem.

And did you log a support call with Cognos (IBM?) support?

Greetz,
Chris
Basheer_2
Trusted Contributor

Re: HP-UX Performance Issue

Hello Vinod,

You are comparing an ENTRY-level server with a MEDIUM-level server.

Performance will NOT be the same on both servers.
Dave Olker
HPE Pro

Re: HP-UX Performance Issue

Since you have debug logging available at the application layer and you can see things like an unexplained 7-minute delay at certain points of the run, my suggestion would be to use tusc to collect system-call level logging of both the fast and slow runs. Comparing the two tusc outputs should show you which system calls are stalling for 7 minutes when they should be instantaneous.

Once we know what the application is doing during that 7 minute period we can better decide where to look at the slower system's configuration.

Regards,

Dave
Shiva Bhaskar
Occasional Visitor

Re: HP-UX Performance Issue

Hi Chris Huys,

Please find the attached crashinfo output,

where we collected the info 5 times while the Cognos script was running.

chris huys_4
Honored Contributor

Re: HP-UX Performance Issue

Hi,

My IE has problems with saving/opening the rar attachment. A text file as an attachment would be better.

Greetz,
Chris
Shiva Bhaskar
Occasional Visitor

Re: HP-UX Performance Issue

Hi, please find the attached file in text format.
Dennis Handly
Acclaimed Contributor

Re: HP-UX Performance Issue

>Chris: My IE has problems with saving/opening the rar attachment.

Downloading 7-Zip 9.14 works wonders. It has a GUI to look inside the .rar.
chris huys_4
Honored Contributor

Re: HP-UX Performance Issue

Hi,

It looks like rundsjob is the parent process and the rundsnode processes are the children running beneath it. And these child processes have already not run for 48 seconds.

The system also seems to be using automount and nis.

As Dave said, attach tusc to the rundsjob process and see what happens:

# date;./tusc -p -f -v -E -n -T%X -r all -w all -o /var/tmp/tusc_rundsjob.out <pid of rundsjob>

Greetz,
Chris
PS. Dennis, not too sure what I can do with 7-zip. I have a rar extractor on my PC, but it's IE/Java in IE that can't seem to open something with a rar extension in an IE window. And when I save the file to my hard disk, I only get a getattachment.do file.
Dennis Handly
Acclaimed Contributor

Re: HP-UX Performance Issue

>Chris: not to sure what I can do with 7-zip. I have a rar extractor on my pc

(Before 7-zip, my rar extractor would only work on the command line.)

>but it's IE/java in IE, that cant seem to open something with a rar extension in a IE window.

Hmm, I've seen that problem with IE and .zip files in the forum. But my IE8 has no problem with .rar. I use firefox.

>when I save the file to my harddisk, I only get a getattachment.do file.

Hmm. You could try the same .Zip workaround for rar?
http://forums.itrc.hp.com/service/forums/getattachment.do?attachmentId=359505&ext=.RAR

(Or use .txt and rename it after the download.)
Vinod_22
Advisor

Re: HP-UX Performance Issue

Based on the syslog errors below:
Jun 23 17:01:23 qiublrboaapp011 syslog: ypbind: no entry in /var/yp/secureservers file
Jun 23 17:01:23 qiublrboaapp011 /usr/sbin/rpc.lockd[1155]: Cannot get address for transport udp host \1 service lockd
Jun 23 17:01:23 qiublrboaapp011 /usr/sbin/rpc.lockd[1155]: Cannot establish NLM service over /dev/udp: transport setup problem.
Jun 23 17:01:23 qiublrboaapp011 /usr/sbin/rpc.lockd[1155]: Cannot get address for transport tcp host \1 service lockd
Jun 23 17:01:23 qiublrboaapp011 /usr/sbin/rpc.lockd[1155]: Cannot establish NLM service over /dev/tcp: transport setup problem.
Jun 23 17:01:23 qiublrboaapp011 /usr/sbin/rpc.lockd[1155]: Could not start NLM service for any protocol. Exiting.
Jun 23 17:01:23 qiublrboaapp011 nfs4cbd[1174]: nfsv4 cannot determine local hostname binding for transport tcp - delegations will not be available on this transport

We have stopped and restarted the lockd and nfs.client services, and here is the output:

# /sbin/init.d/lockmgr stop
killing rpc.lockd
killing rpc.statd
# /sbin/init.d/lockmgr start
Starting up the Status Monitor daemon
/usr/sbin/rpc.statd
Starting up the lock manager daemon
/usr/sbin/rpc.lockd
# /sbin/init.d/nfs.client stop
killing nfs4cbd
# /sbin/init.d/nfs.client start
Starting NFS CLIENT subsystem

Starting up nfs4cbd daemon
/usr/sbin/nfs4cbd
Starting up nfsmapid daemon
Mounting remote NFS file systems ...
mount: qarhl25:/apps/bi on /apps/bi - WARNING unknown option "delaylog"
Mounting remote CacheFS file systems ...