OpenVMS System stats collection
05-24-2007 09:48 PM
We are looking for a tool that will allow the following:
Data collection on a running OpenVMS system, like the CA data collector provides, but one that will allow investigation at a future date of perceived application hangs of as little as 0.5 seconds.
From what I remember of the DEC/Unicenter/CA performance suite tools, you could not get granularity as low as this. I think you could only go down to about 10 seconds, and then you would end up with huge CPD files.
Does anyone use any tools that would give me what I'm looking for?
Regards
Kevin
05-24-2007 10:07 PM
Solution: http://hpperfdat.compinia.com/DOCUMENTATION/PerfDat_Arch_Tech_v33.pdf
About the data collector: the sample interval is freely definable (minimum 1 second).
05-24-2007 10:14 PM
Thanks
05-26-2007 10:13 PM
You may have to write something based on TDC or pay someone to write something.
Purely Personal Opinion
05-27-2007 01:38 PM
Although this doesn't answer your question, it does provide some guidelines on the type of tools to choose. You don't state whether there is a specific application you are attempting to measure, or whether you are trying to determine the cause of a 0.5-second deviation in response time for any arbitrary application.
Is this system dedicated to a specific application, or is it a general-purpose time-sharing system? In other words, why is 0.5 seconds considered unacceptable? Are there times when slow response is acceptable?
There are several things you need to consider.
1. How do your users connect to the system? If they connect via a network, some of the delay may be network-related, and nothing you measure at the VMS system will include that delay.
2. For measuring short-duration events, you should consider event-based measurements. Google "Nyquist" for the sampling periods needed to detect events with polling. Also, when you are trying to measure short-duration events with polling, the sampling itself is likely to affect the results (Google "observer effect"). Using event-based measurement will probably require modifications to the application to insert instrumentation hooks, but it will give you the most accurate measurement. Also, remember point #1: these times will not include any time spent waiting for the network to deliver packets. They can, however, show that the delay was not due to VMS.
3. If you want to measure the delays seen by the user, you will need to instrument the application from the PC's perspective. If the user interface is telnet, that is not going to be straightforward (at least I can't think of an easy way to do it). You could have something at the PC that pings the host and measures the response time or, if possible, measure the time for acks to arrive for packets sent by the PC's TCP/IP stack (again, I don't know how that would be done, other than with wireshark/ethereal).
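As a rough illustration of a PC-side measurement, here is a minimal Python sketch that times how long a TCP connect to the host takes. The function name `tcp_connect_seconds` is made up for this example, and connect time only covers the network round trip plus the host's TCP stack, not the application, so treat it as a coarse indicator rather than a replacement for a packet trace.

```python
import socket
import time


def tcp_connect_seconds(host, port, timeout=2.0):
    """Return the seconds taken to complete a TCP connect to host:port.

    Raises OSError (including a timeout) if the connect fails.
    """
    start = time.perf_counter()
    # create_connection returns once the TCP handshake has completed.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed
```

Calling it in a loop against the telnet port of the VMS host (host name and port being whatever applies at your site) and logging the results with a timestamp would give a client-side series to compare against the host-side data.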
You should download the T4 & Friends package, as it is free and will give you the ability to see the overall conditions at the time you are investigating. I don't think it is going to give the fine-granularity "micro" picture you are looking for.
On the other end of the spectrum from T4 are the SDA extensions like PCS and PRF (and on Alpha, DCPI), which give a much higher resolution view into what is using the CPU, although they won't provide much information about things like I/O and locking delays.
Have fun,
Jon
05-28-2007 02:37 AM
http://h71000.www7.hp.com/openvms/products/tdc/
but you will have to develop some software to use TDC
Purely Personal Opinion
05-28-2007 03:07 AM
Jon makes some good points.
Retrospectively resolving a 0.5 second perceived gap in response is a challenge. Sufficient data will need to be collected at many points.
Collecting data at resolutions finer than 0.5 second requires some thought in advance. It is likely that the actual sampling resolution will need to be between 0.1 second and 0.25 second in order to accurately identify anomalies occurring over a period of 0.5 second. Depending on the scope of information gathered, there are several issues, including:
- distortion caused by the sampling and recording process
- gathering sufficient information to examine the situation retrospectively in a constructive manner
- ensuring that the data collection process and the analysis process do not produce misleading artifacts
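To make the 0.1 to 0.25 second range concrete: a poll taken every T milliseconds is only guaranteed to land inside an event lasting D milliseconds floor(D/T) times, whatever the phase between the two. A small Python sketch of that arithmetic follows; the function name is illustrative, and it assumes each poll observes an instantaneous state rather than a cumulative counter.

```python
def guaranteed_samples(event_ms, interval_ms):
    """Minimum number of polling instants that fall inside an event.

    Polls happen every interval_ms; the event lasts event_ms and may
    start at any phase relative to the polling clock.
    """
    if event_ms < 0 or interval_ms <= 0:
        raise ValueError("event_ms must be >= 0 and interval_ms > 0")
    # Integer milliseconds avoid floating-point rounding in the division.
    return event_ms // interval_ms


for interval_ms in (10_000, 1_000, 250, 100):
    print(interval_ms, guaranteed_samples(500, interval_ms))
```

For a 0.5-second event this prints 0 guaranteed samples at 10-second and 1-second intervals, 2 at 0.25 second, and 5 at 0.1 second, which is why intervals of 1 second or more can miss the event entirely.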
The correct solution may not be a simple package. As an example, T4 can be used for the analysis of data that is collected outside of the normal tools. Network performance and traffic may need to be collected using a LAN monitor (e.g. WireShark) for later correlation with performance data.
Having data collection tools developed specifically for this situation is certainly a valid option. Deeper planning of the situation is also likely called for. (Disclosure: Our firm has done these types of projects.)
Looking backward after an event is an interesting problem in a variety of spheres. The challenge is that the information must be captured going forward; once the event has occurred, there is often no potential to capture useful information.
- Bob Gezelter, http://www.rlgsc.com
05-28-2007 03:23 AM
Purely Personal Opinion
05-28-2007 09:52 PM
You would need to trace the network too (to find out what the other party was saying).
And trace all the I/O (to find out what the controller was saying).
And when you have all that, how do you find out what really happened? E.g. someone unplugged a cable for a second, the CPU was taken by a real-time process (or a system manager) on another node, ...
Wim
05-28-2007 10:20 PM
This does look like more of a task than it did at first look.
I will collate the responses and feed back to management.
It does look like a custom tool will need to be developed if we are serious about gathering good data for future analysis.
Thanks
Kevin
05-28-2007 10:34 PM
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=939438 about "virtual IO".
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=603651 about how disks block IO during certain operations.
Wim
05-29-2007 01:22 AM
Yes, that can be done. You would need to write a simple C/C++ application that could be easily derived from various examples provided with the SDK in the kit at http://www.hp.com/products/openvms/tdc.
Your application would be responsible for the timing of data collections (the built-in timer has a 1-second resolution); it would call TDC_COLLECT_SNAPSHOT() at expiration of each timed interval.
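The timing side of such an application can be sketched as below. This is Python rather than the C/C++ of the SDK, purely to show the scheduling idea, and `collect` is a hypothetical callback standing in for whatever code triggers the snapshot; no TDC call is shown. Deadlines are computed from the start time so that the time spent collecting does not push later samples back.

```python
import time


def run_collector(collect, interval_s, samples):
    """Call collect() `samples` times, one call every interval_s seconds.

    Deadlines are computed from the start time, so the time spent inside
    collect() does not accumulate as drift.
    """
    start = time.monotonic()
    for n in range(samples):
        deadline = start + n * interval_s
        delay = deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        # If collect() overruns the interval, the next call happens at once.
        collect()
```

For example, `run_collector(take_snapshot, 0.5, 7200)` would cover one hour at 0.5-second intervals, with `take_snapshot` being your own function.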
You would want to be careful in your selection of data records to be collected because 1) TDC collects lots of metrics and can create VERY large files (particularly if collecting at 0.5-second intervals over an extended period), and 2) collecting all TDC metrics at the frequency you cite could certainly have system performance implications. Selection of data records to be collected can be easily controlled by your application (again, code examples and documentation are provided in the SDK).
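A back-of-the-envelope estimate shows how quickly the file size grows with the sampling rate. In this Python sketch the 4096 bytes per sample is an invented figure used only for illustration, not an actual TDC record size; substitute a measured value for the records you select.

```python
def daily_volume_bytes(interval_ms, bytes_per_sample):
    """Bytes written per 24 hours at one sample every interval_ms."""
    samples_per_day = 86_400_000 // interval_ms
    return samples_per_day * bytes_per_sample


# 4096 bytes per sample is an invented figure, not a TDC record size.
print(daily_volume_bytes(500, 4096))     # 0.5-second interval
print(daily_volume_bytes(10_000, 4096))  # 10-second interval
```

With that assumed sample size, 0.5-second sampling comes to roughly 708 MB per day against roughly 35 MB per day at 10-second sampling, a factor of 20.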
You could further elaborate on your application to pull the data of interest at each collection point via TDC's API and store only that data in your own file without creating a TDC data file (I know of one commercial application that does exactly that). CSV might be a suitable format for such a file, as might T4's TLC format.
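Storing only the data of interest in CSV might look like the following Python sketch. The field names and values are invented for illustration and are not TDC metric names; the point is only that fields outside the selected list are dropped before anything is written.

```python
import csv
import io


def write_selected(rows, fields, out):
    """Write only the named fields of each sample dict to `out` as CSV."""
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Field names and values below are invented for illustration.
samples = [
    {"time": "10:00:00.0", "cpu_busy": 41, "dio_rate": 120, "unused": 7},
    {"time": "10:00:00.5", "cpu_busy": 97, "dio_rate": 3, "unused": 9},
]
buf = io.StringIO()
write_selected(samples, ["time", "cpu_busy", "dio_rate"], buf)
print(buf.getvalue())
```

In a real collector `out` would be an open file rather than an in-memory buffer.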
Lee Clark
OpenVMS engineering
05-29-2007 09:24 AM
If humans are noticing the delays, then it is probably related to how long it takes to echo characters to the screen.
If the characters are being echoed by the VMS terminal driver, the delays the user sees are almost guaranteed to be due to the network, since the character processing is done by code at elevated IPL. There are some applications that do their own character echoing, doing single-character I/O without echo, and then processing and sending the character back. There can be noticeable delays in those applications, especially if resources (memory, processor, I/O) are overutilized.
Also, for readers that may not be familiar with all the terminology, here are some definitions.
Polling - measurements made periodically, usually reading a set of event counters, or in the case of CPU, determining the process that was active at the time of the interrupt triggering the taking of a sample. In general with polling, no changes are needed to the application.
Event-based - something in the code that records each occurrence of an event. E.g. each time an I/O is completed, a counter can be incremented.
Transaction timing can be accomplished by taking a snapshot of the time, and perhaps some other items, right before starting an operation, and another snapshot when the operation is complete. Then the elapsed time and the resources used during the operation can be determined by taking the difference between the starting values and the ending values. A relatively simple example is the set of routines lib$init_timer, lib$stat_timer, and lib$show_timer.
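The same before-and-after pattern can be sketched in Python. This is an analogue of the idea, not the lib$ routines themselves; the `Snapshot` class and `measure` function are names made up for the example, and only wall-clock and CPU time are captured.

```python
import time


class Snapshot:
    """Wall-clock and CPU time captured at one instant."""

    def __init__(self):
        self.wall = time.perf_counter()
        self.cpu = time.process_time()


def measure(operation):
    """Run operation() and return (result, elapsed_wall_s, cpu_s)."""
    before = Snapshot()
    result = operation()
    after = Snapshot()
    return result, after.wall - before.wall, after.cpu - before.cpu


result, wall_s, cpu_s = measure(lambda: sum(range(100_000)))
print(f"wall={wall_s:.6f}s cpu={cpu_s:.6f}s")
```

A large gap between elapsed time and CPU time for one operation suggests it spent its time waiting (I/O, locks, network) rather than computing.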
Note that unlike polling, event-based sampling requires modifications to the code at the locations where you want to record things. The VMS operating system already has many event-based counters, which can be read by timer-based polling routines, and in fact these are the main source of the information that MONITOR and TDC report.
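Reading a cumulative event counter by polling and turning the readings into rates is just a matter of taking differences, as in this Python sketch. The readings are invented numbers for a single I/O counter, and the function name is illustrative.

```python
def rates_from_counters(readings):
    """Turn (time_s, cumulative_count) readings into per-interval rates.

    Returns a list of (interval_end_time_s, events_per_second).
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(readings, readings[1:]):
        if t1 <= t0:
            raise ValueError("readings must be in increasing time order")
        rates.append((t1, (c1 - c0) / (t1 - t0)))
    return rates


# Invented readings of one cumulative I/O counter, polled every 0.5 s.
readings = [(0.0, 1000), (0.5, 1060), (1.0, 1060), (1.5, 1130)]
print(rates_from_counters(readings))
# [(0.5, 120.0), (1.0, 0.0), (1.5, 140.0)]
```

The middle interval shows no I/O completing at all, which is the kind of gap a 0.5-second stall would leave in the data.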
Jon