01-16-2005 06:46 AM
Weird performance problem
This one is killing me.
I did some performance tuning for a customer. To be safe, I first configured a reference system on a smaller machine to ensure that there would be no unexpected failures. Everything was fine there, but now I've got an additional performance problem that I can't pin down.
For the first week the results were good, but now they have response time problems.
The biggest problem is that it is a fully desupported configuration, and there is no chance or budget to upgrade, so that isn't a solution.
- VMS 7.1 (with the recent patches, but not 7.1-1h2 or later)
- Oracle 7.3.2.3
- UCX 4.1 ECO 10 (this is the last ECO)
Maybe some kind soul can give me hints or tips.
At first I suspected that it was related to the UCX patch upgrade, which I did myself. But the PING results are very good (< 1 ms), and so are the TNSPING results (< 80 ms).
What I checked:
- LSNRCTL startup times are very long. I checked every possibility I found after an intensive search within METALINK, such as:
- UCX parameters (large and small buffers, device sockets): all OK.
- TCPIP problems (DNS resolution): none, and the interfaces report no errors.
- SDU etc. values: correct (probably the reason for the now very good TNSPING values).
- Trace files activated, but no real errors found.
What I observe:
Logging in via TCPIP and BEQ is very slow.
Description:
Sqlplus user
... fast response
password: xxxx
20 seconds wait, then accepted.
I don't think that I have a TCPIP/SQL*Net problem.
The same login on the reference system takes only 3 seconds.
Alert-log, sqlnet.trc, etc. don't report anything unusual.
To speed up the database further, I recommended a memory upgrade. This was installed: an additional 512 MB kit (refurbished, since no original parts are available).
Prod.System: AS 800 5/333 (640 MB)
Ref.System: AS 1000 4/266 (256 MB)
The rest, especially the disk layout, is a close 1:1 clone of the production environment.
My current suspects:
- A problem with the memory (but I can't find any proof of that).
- Miscalculated system parameters. Last Friday the system had problems with the resident programs for Oracle (global page table full); I fixed this and that problem has not come back.
- The customer told me about occasional problems with the (!) Token Ring card, interface wt0 (none had ever shown up before, in all this time).
Attached are the results of the Oracle RDA (Remote Diagnostic Assistant). I had to do some "patching" to convince the DCL program to run in this environment.
Any hints/pointers greatly appreciated.
Regards
Andreas
01-16-2005 07:35 AM
Re: Weird performance problem
Another suspect - the disks:
I've got a small tool to test read access times:
ORASRV::SYSTEM $ r access
Device to test: dka300
Seek range in MB (0 for full disk): 10
Single or double buffering (s/d): d
Double buffered average access time is 4.6 ms
Test has completed after 3375 random reads.
ORASRV::SYSTEM $ r access
Device to test: dka200
Seek range in MB (0 for full disk): 10
Single or double buffering (s/d): d
Double buffered average access time is 6.6 ms
Test has completed after 1500 random reads.
ORASRV::SYSTEM $ r access
Device to test: dka100
Seek range in MB (0 for full disk): 10
Single or double buffering (s/d): d
Double buffered average access time is 7.5 ms
Test has completed after 1500 random reads.
ORASRV::SYSTEM $ r access
Device to test: dka0
Seek range in MB (0 for full disk): 10
Single or double buffering (s/d): d
Interrupt
The last test on DKA0 (where the OS and Oracle live together) didn't complete after several minutes. So I interrupted the command.
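As a rough cross-check of the timings above, the elapsed time of a completed run should be about the number of reads times the average access time. The sketch below uses only the figures the tool printed. The function names are made up for illustration, and it assumes that read latency dominates the run. The DKA0 case in the usage note is hypothetical, because the interrupted run printed no figures.

```python
# Figures copied from the test output above: (device, average ms, random reads).
COMPLETED_RUNS = [
    ("dka300", 4.6, 3375),
    ("dka200", 6.6, 1500),
    ("dka100", 7.5, 1500),
]


def run_seconds(avg_ms: float, reads: int) -> float:
    """Approximate elapsed time of a run, assuming read latency dominates."""
    return avg_ms * reads / 1000.0


def implied_avg_ms(elapsed_seconds: float, reads: int) -> float:
    """Average access time implied by an elapsed time and a read count."""
    return elapsed_seconds * 1000.0 / reads


if __name__ == "__main__":
    for device, avg_ms, reads in COMPLETED_RUNS:
        seconds = run_seconds(avg_ms, reads)
        reads_per_second = 1000.0 / avg_ms
        print(f"{device}: about {seconds:.1f} s, {reads_per_second:.0f} reads/s")
```

By this estimate each completed run lasted roughly 10 to 16 seconds. If the DKA0 run had also been aiming for 1500 reads (an assumption) and was still going after 3 minutes, `implied_avg_ms(180, 1500)` puts its average access time above 120 ms, far outside the range of the other three disks.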
The disk report doesn't show any errors.
Disk ORASRV$DKA0:, device type DEC RZ1CB-CS, is online, mounted, file-oriented
    device, shareable, available to cluster, error logging is enabled, device is
    busy.

    Error count                 0             Operations completed              11143527
    Owner process               ""            Owner UIC                         [SYSTEM]
    Owner process ID            00000000      Dev Prot                          S:RWPL,O:RWPL,G:R,W
    Reference count             625           Default buffer size               512
    Total blocks                8380080       Sectors per track                 113
    Total cylinders             3708          Tracks per cylinder               20

    Volume label                "AXPVMSSYS"   Relative volume number            0
    Cluster size                9             Transaction count                 574
    Free blocks                 2956563       Maximum files allowed             419004
    Extend quantity             5             Mount count                       1
    Mount status                System        Cache name                        "_ORASRV$DKA0:XQPCACHE"
    Extent cache size           64            Maximum blocks in extent cache    295656
    File ID cache size          64            Blocks currently in extent cache  14544
    Quota cache size            0             Maximum buffers in FCP cache      184
    Volume owner UIC            [1,1]         Vol Prot                          S:RWCD,O:RWCD,G:RWCD,W:RWCD

  Volume Status: subject to mount verification, protected subsystems enabled,
    write-through caching enabled.
01-16-2005 01:57 PM
Re: Weird performance problem
I would think there is no point in any other analysis until that DKA0 problem is resolved. You might try some more speed verifications with a simple search/stat or convert/stat, making sure the file is too big to fit in your (small) VCC cache; just search the pagefile or something like that.
There are a few minor odd things, and the tuning is aggressive with the reserved memory and such:
** Reserved memory size = 402653184 greater than created SGA size = 371589120 **
and:
Physical Memory Usage (pages):     Total      Free    In Use  Modified
  Main Memory (640.00Mb)           81920     13551     65203      3166
Memory Reservations (pages):    Reserved    In Use  Type
  ORA_REPORT_SGA                   49152     45361  Allocated
  ORA_REPORT_SGA                      48        45  Page Table
  Total (384 Mb reserved)          49200     45406
Oracle is right... those 30M are wasted and are almost 10% of the SGA, and 5% of the whole system.
There is still plenty of free memory, but the system was not under load, was it?
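The figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the RDA output. The 8192-byte page size is an assumption, although it is consistent with 81920 pages for 640 MB and with 49152 reserved pages for 402653184 bytes.

```python
PAGE_BYTES = 8192  # assumed Alpha page size; matches 81920 pages = 640 MB

reserved_bytes = 402_653_184  # "Reserved memory size" from the Oracle warning
sga_bytes = 371_589_120       # "created SGA size" from the Oracle warning
main_memory_pages = 81_920    # Main Memory total from the memory display

# Reserved memory that the SGA does not actually use.
unused_bytes = reserved_bytes - sga_bytes
unused_mb = unused_bytes / (1024 * 1024)

# The same amount expressed as a share of the SGA and of physical memory.
share_of_sga = unused_bytes / sga_bytes
share_of_memory = unused_bytes / (main_memory_pages * PAGE_BYTES)

if __name__ == "__main__":
    print(f"unused reservation: {unused_mb:.1f} MB")
    print(f"share of SGA:       {share_of_sga:.1%}")
    print(f"share of memory:    {share_of_memory:.1%}")
```

The unused part of the reservation comes to a little under 30 MB, which is somewhat more than 8% of the SGA and somewhat less than 5% of the 640 MB in the machine.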
ORA-600 [17114]
see note: Note:34782.1
"KGH Bad magic number in header"
Oracle has detected that the magic number in a memory chunk header has been
overwritten.
This is a heap (in memory) corruption and there is no underlying data
corruption.
The error may occur in the one of the process specific heaps
(the Call heap, PGA heap, or session heap) or in the shared heap (SGA)."
>>> AlphaServer 800 5/333
If you do consider (or are forced into) an Oracle upgrade, then you cannot go too far without having to upgrade the CPU. This is an EV5, not EV5.6, so it is not suitable for the latest Oracle. Also... that's a rather dated Alpha with modest memory. It's going to be hard to make that look really good compared to a more modern (Alpha) system.
fwiw,
Hein.
01-16-2005 09:41 PM
Re: Weird performance problem
On the DKA0 timing issue, I would check whether this is an activity issue or a hardware problem. In some cases, small configuration differences can cause dramatically different IO rates, which can produce something similar to what you describe.
I would check the cumulative IO count on the disk while you are experiencing the pause. If the disk is otherwise continuing to process normally (and in some cases the SHOW command itself may be impacted; after all, paging and image loading also use this disk), then I would suspect that a difference between the configurations is causing a higher IO/paging rate, and that you are seeing a contention problem.
You could also verify this by running MONITOR DISK from another workstation and checking the results.
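The check on the cumulative IO count can be reduced to a small calculation: take the "Operations completed" value twice during the pause and divide the difference by the time between the samples. The sketch below is only an illustration. The function names are hypothetical, and it reads a non-zero rate as "the disk is busy, look for contention" and a zero rate as "nothing is completing, look at the hardware", which is a simplification of the reasoning above.

```python
def io_rate(ops_first: int, ops_second: int, interval_seconds: float) -> float:
    """IOs completed per second between two samples of the disk's counter."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    if ops_second < ops_first:
        raise ValueError("counter went backwards; was the disk remounted?")
    return (ops_second - ops_first) / interval_seconds


def interpret(rate: float) -> str:
    """Simplified reading of the rate observed during the pause."""
    if rate > 0:
        return "disk is completing IO: look for contention or a higher paging rate"
    return "no IO is completing: look at the hardware"


if __name__ == "__main__":
    # First sample is the value from the disk report earlier in the thread;
    # the second sample and the 30 second interval are hypothetical.
    rate = io_rate(11_143_527, 11_145_627, 30.0)
    print(f"{rate:.1f} IO/s -> {interpret(rate)}")
```

Comparing the rate measured this way on the production system and on the reference system would show whether the two configurations really drive DKA0 differently.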
I have also seen similar behavior induced by a variety of tuning changes.
- Bob Gezelter, http://www.rlgsc.com
01-16-2005 10:04 PM
Re: Weird performance problem
Can you see which step in the login process is taking a long time? Is it the password validation, or something after that (something in SYLOGIN, perhaps)?
Purely Personal Opinion
01-17-2005 12:33 AM
Re: Weird performance problem
Thanks for the first responses.
Perhaps I wasn't explicit enough:
- A hardware or software upgrade isn't an option at the moment (neither VMS nor Oracle).
- The clone system (an older CPU generation with a lower clock frequency) is running fine.
@Ian:
Login is no problem. I checked the startup of a SQL session with the old "set watch file/class=all" trick. It is not a network configuration problem. Everything runs fine, just with different (slower) execution times on the AS800.
@Robert:
The problem IS dka0. But I can't find any hint pointing to hardware problems.
Question: is 70 IO/s more than a DEC RZ1CB-CS can handle?
@Hein:
Yes, I was thinking about relocating the Oracle Home directory, but that should be a last resort. And again, the cloned system is running fine.
The memory wasting hint: I'm aware of this, but it is very difficult to calculate the correct size of the individual SGA parameters. I'm using an Excel sheet to do the calculations, but the larger the available memory, the larger the potential loss.
The ORA-600 error you mentioned happens there once a day. Don't ask me why; it is probably an old bug in 7.3 (and there is absolutely no chance of getting a patch).
Just for information about the background:
- I set up this system 8 years ago (I had really forgotten this when I accepted the task). A typical launch-and-forget system (a big advantage of OpenVMS). Additional database software has been developed in the meantime, but the developer died last year. No sources are available at the moment, so there is no chance to tune on the application side.
I'm taking care of this system, because we want to offer a follow-up system (vms based :-) ).
The happier the customer after the tuning, the bigger the chance to convince them to buy a new one from us. :-)
01-28-2005 03:35 AM
Re: Weird performance problem
The customer is happy: the tuning made the system 2x faster. Some queries now take a couple of seconds instead of 1 hour. :-)
regards
Andreas