11-02-2010 03:15 AM
Same job different direct i/o and Cpu, test vs prod
Hope someone can point me to what to look for. I'm not the guru.
I've got a production cluster consisting of 2 Alpha DS25 with OpenVMS 8.3, connected to two EVA8000, using volume shadowing.
We have a nightly batch that loads data into an RDB database (7.2-201). The batch gets some infiles, creates tmp tables, loads the tmp tables (rmu/load), and then runs some SQL to insert rows from the tmp tables into the production tables. The run time for this job has increased over time due to bigger infiles and more data in the production tables.
Now I have tried testing this in our test environment: 3 clustered Alpha DS15s, same operating system, connected to the same SAN.
When I run the exact same batch in test (with a copy of the last backup of the prod database, and the same infiles), it takes 20 min of CPU time vs. 100 min in prod, and test does a third of the DIO, according to the Accounting information.
The UAF parameters for the user running the batch job are the same or higher in prod than in test.
The prod db is newly tuned using dbtune and the disk is not fragmented at all.
I'd be very happy to get some hints about what possible causes to look at...
Best Regards
Björn R
11-02-2010 04:24 AM
Re: Same job different direct i/o and Cpu, test vs prod
>> Hope someone can point me to what to look for. I'm not the guru.
Björn,
You started out pretty good with lots of pertinent information. Thanks.
It looks like you made sure the most obvious potential sources of differences have been taken into account.
Raw CPU speed differences between a DS15 and a DS25 cannot explain what you see, but what about memory? Similar physical memory and similar memory pressure, or does the test box perhaps have much more memory to spare?
Are the processes allowed to use the memory properly on prod?
RDMS$BIND_WORK_VM
Good thing you mentioned UAF quotas, but you may also need to check the SYSGEN PQL settings, notably the minimums, which may give processes much higher quotas than the UAF suggests.
Or just watch what they use with SHOW PROC/ID or the RMU alternative, the PROCESS ACCOUNTING screen.
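As a rough illustration of the PQL point above: OpenVMS applies the SYSGEN PQL_M* minimums on top of the UAF values, so the quota a process effectively gets is the larger of the two. A minimal Python sketch, using made-up numbers and simplified quota names (assumptions for illustration, not values from either system in this thread), that computes the effective quotas for prod and test and lists where they differ:

```python
# Illustrative only: quota names are simplified and all numbers are invented.
# Real values would come from AUTHORIZE (UAF) and SYSGEN (PQL_M* minimums).

def effective_quotas(uaf, pql_min):
    """Effective quota per name: the larger of the UAF value and the PQL minimum."""
    return {name: max(value, pql_min.get(name, 0)) for name, value in uaf.items()}


def quota_differences(prod, test):
    """Return {name: (prod_value, test_value)} for quotas that differ."""
    return {
        name: (prod[name], test[name])
        for name in prod
        if name in test and prod[name] != test[name]
    }


# Hypothetical case: prod UAF is "the same or higher", but test has larger PQL minimums.
prod_uaf = {"WSQUOTA": 8192, "WSEXTENT": 65536, "PGFLQUOTA": 500000}
test_uaf = {"WSQUOTA": 4096, "WSEXTENT": 32768, "PGFLQUOTA": 500000}
prod_pql = {"WSQUOTA": 4096, "WSEXTENT": 16384, "PGFLQUOTA": 100000}
test_pql = {"WSQUOTA": 16384, "WSEXTENT": 262144, "PGFLQUOTA": 100000}

prod_eff = effective_quotas(prod_uaf, prod_pql)
test_eff = effective_quotas(test_uaf, test_pql)
diffs = quota_differences(prod_eff, test_eff)

for name, (prod_value, test_value) in sorted(diffs.items()):
    print(f"{name}: prod={prod_value} test={test_value}")
```

In this invented case the test processes end up with larger working sets than prod even though the prod UAF entries are higher, which is the kind of mismatch the PQL check is meant to catch.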
Using a copy of the DB suggests the same indexes and buffer settings and so on.
You may have to treat this as a regular performance issue with the added advantage of having a comparison system.
Notably check out the RMU (active user) Stall Messages and so on.
I would focus on the difference in DIRIO more so than the CPU difference, hoping that they are closely related anyway and the DIRIO often being more tangible: Which files have more IO ?! TMP? RUJ? DB?
Maybe this is a SORT / temp file problem only? .. $ SHOW RMS ?
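One way to act on the "which files have more IO" question, once per-file I/O counts have been collected from both runs: rank the files by how much extra I/O prod does. The sketch below is Python with invented counts and category names, purely to show the comparison; none of the numbers come from the systems discussed here.

```python
# Illustrative only: the categories and counts below are invented.

def extra_io_by_file(prod_io, test_io):
    """Rank files by (prod I/O count - test I/O count), largest difference first."""
    names = set(prod_io) | set(test_io)
    deltas = [(name, prod_io.get(name, 0) - test_io.get(name, 0)) for name in names]
    # Sort by descending delta, then by name so ties come out in a stable order.
    return sorted(deltas, key=lambda item: (-item[1], item[0]))


prod_io = {
    "DB storage areas": 2_400_000,
    "RUJ": 300_000,
    "SORT work files": 1_500_000,
    "TMP tables": 600_000,
}
test_io = {
    "DB storage areas": 1_000_000,
    "RUJ": 250_000,
    "SORT work files": 50_000,
    "TMP tables": 300_000,
}

ranked = extra_io_by_file(prod_io, test_io)
for name, delta in ranked:
    print(f"{name:20s} {delta:>12,d} extra I/Os on prod")
```

With numbers like these, the overall 3x difference would turn out to be concentrated in one or two file groups, which narrows down where to look next.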
Spend some quality time with the RDB tuning manual:
http://download.oracle.com/otn_hosted_doc/rdb/pdf/dbpt.pdf
And last but not least, there are a good few folks/companies out there eager to help you with RDB. Google will find them. Some that come to mind, in alphabetical order: JCC, Oracle itself, SCI, VXcompany... several independent consultants and so on.
OpenVMS Bootcamp notes, and RDB update proceedings may provide further hints.
Good luck!
Hein van den Heuvel
HvdH Performance Consulting
11-02-2010 07:28 AM
Re: Same job different direct i/o and Cpu, test vs prod
In reality these are some very normal questions that have come up many times, from the early VAX days to the present. OpenVMS is, if I may say so, great stuff and highly configurable. That works as both an advantage AND a disadvantage in these types of circumstances. Many of my customers have experienced problems because their "test" platform was "close to production" only by virtue of being an Alpha running the same O/S release, OR because what they were testing was too small a subset of their application to reflect real life. So, personally, I'd like to know more about your "car" and its components before I start measuring for tires.
bob
11-02-2010 12:53 PM
Re: Same job different direct i/o and Cpu, test vs prod
I/O loading on both systems?
Disk queue length and I/O throughput?
File fragmentation?
Default process quotas (i.e. PQL* values from SYSGEN)?
Cluster comms performance, especially resends?
Any db rollbacks or journalling?
It's a nightly batch job, so do you also run Backup overnight and is there a conflict?
... and that's with about 30 seconds thought.
11-02-2010 02:18 PM
Re: Same job different direct i/o and Cpu, test vs prod
I'd punt for something "dbtune" did.
Can you dbtune a test copy of the database and see if you can reproduce the results in test?
Optimizer changes? Different index node sizes? More/fewer indices? SPAM thresholds? Many more page discards?
Cheers Richard Maher
11-02-2010 03:06 PM
Re: Same job different direct i/o and Cpu, test vs prod
Thank you for mentioning the hardware details in the original posting, it is appreciated.
To add some items onto the list:
Install T4 and gather the broad spectrum of statistics from BOTH systems. The OP notes that both systems are using the same SAN; how are both sets of logical volumes configured (size, RAID level, not to mention actual disks)?
- Bob Gezelter, http://www.rlgsc.com
11-02-2010 03:07 PM
Re: Same job different direct i/o and Cpu, test vs prod
(e.g. perhaps the test db uses a quiet path, while the production disks all share the same path?)
Following that, a comparison of "rmu/dump header" on each database might be useful - although you say that the test db is created from a backup of production, so presumably it would have all the dbtune changes as well.
Another idea might be to run rmu/show stats on the database while each run occurs and compare the results (assuming nothing else is happening on the production database when the load runs).
cheers,
chris
11-03-2010 09:23 AM
Re: Same job different direct i/o and Cpu, test vs prod
11-03-2010 12:04 PM
Re: Same job different direct i/o and Cpu, test vs prod
Lots of good information here. I am more inclined to be concerned about "induced" work load due to environment rather than system "pressure".
Something along the path is making the job do more direct IO on prod than on test. You mention that you have run dbtune on prod, but not whether you ran it on the test db environment as well.
You say there is no fragmentation on the prod disk. Is free space similar when you create the temporary tables on prod and on test?
What about file extents? Are the infiles newly created on prod or are they the result of appends? Are the disks in prod and test created with similar window sizes and cluster sizes? Are they the same size disks?
Just some thoughts where I might look.
Bill.
CCSS - Computer Consulting System Services, LLC
11-05-2010 04:36 AM
Re: Same job different direct i/o and Cpu, test vs prod
You started so strongly, but we heard nothing since.
>> a third of DIO in test when I look at the Accounting information.
And we are talking significant IO counts right? Millions, not thousands?
Is that CPU time, when watched with MONI MODE or correlated to a T4 window, mostly EXEC MODE (RDB) or something else?
FWIW, in cases like this, with a significant difference in IO, I look at MONI and T4 only to guide me a little in the right direction.
It is less important to know what else the system is doing, whether there is fragmentation, or shadowing, or which exact IO sub-systems are involved.
All those things influence the speed for sure, but not the IO count. And the speed impact would be fractional, not the factor of 4x - 5x suggested.
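For reference, the factors quoted in this thread can be worked out from the original figures (100 vs. 20 CPU minutes, and test doing a third of the direct I/O). A small Python sketch; the function name is illustrative, and the inputs are just the numbers from the first post:

```python
# Inputs are the figures from the original post; the function name is illustrative.

def ratio_summary(prod_cpu_min, test_cpu_min, dio_ratio):
    """Return (CPU ratio prod/test, CPU-per-direct-I/O ratio prod/test)."""
    cpu_ratio = prod_cpu_min / test_cpu_min
    cpu_per_io_ratio = cpu_ratio / dio_ratio
    return cpu_ratio, cpu_per_io_ratio


# Test does a third of the DIO, so prod does 3x the DIO of test.
cpu_ratio, cpu_per_io_ratio = ratio_summary(100, 20, 3.0)

print(f"CPU time, prod/test:           {cpu_ratio:.1f}x")
print(f"CPU per direct I/O, prod/test: {cpu_per_io_ratio:.2f}x")
```

By these figures CPU time grew by 5x while direct I/O grew by 3x, so prod also spends roughly 1.7x more CPU per direct I/O. That may suggest the extra work is not only a matter of slower I/O, though the accounting totals alone cannot say where it goes.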
So first order of business is to understand where those extra IOs are going... assuming all along there are millions of them.
We want to look at GETJPI data more than GETSYI.
How much memory the process is using is a more important starting point here than how much free memory there is in the system, assuming there is some.
Good luck!
Hein