$GETRMI returning SS$_SUSPENDED
08-06-2008 01:50 PM
08-06-2008 03:24 PM
Re: $GETRMI returning SS$_SUSPENDED
Just the minimum to reproduce what you see?
Jon
08-06-2008 03:31 PM
Re: $GETRMI returning SS$_SUSPENDED
You're probably not doing anything wrong. The exact reason is most likely dependent on what your item list is asking for and the state of the system at the time of your call.
Generally the system uses SS$_SUSPENDED when "something" is preventing the collection of data from a process. For example, SHOW PROCESS/CONTINUOUS will say "Process is suspended" when it's really in a MWAIT state, and thus can't respond.
Although it's not really a correct usage of SS$_SUSPENDED, it's expedient.
I'd guess that you're asking for a statistic that needs to be gathered from another process, and at the time the process is not responding. This may be a symptom of a real problem, or could be just timing (for example, the process is in RWSCS, waiting for a response from another node).
Post a summary of your item list, and maybe we can have a guess as to which item is the cause.
Of course, I'm assuming that this is an occasional, transient error. If it's repeatable, can you trim down your item list to the minimum required to get the error?
If it's transient, you need to decide what data to use for your missing sample point. Zero? Infinity? Missing? What data does $GETRMI return (if any)?
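One way to act on the "Missing" option is sketched below. It is a portable illustration, not OpenVMS code: the `sample` record, its `valid` flag, and `sum_valid_deltas` are hypothetical names, and it assumes the collected values are cumulative counters (as DIRIO and BUFIO appear to be). A sample that came back with a bad status is flagged rather than stored as zero, and no delta is computed into or out of it.

```c
#include <stddef.h>

/* Hypothetical sample record: one counter reading plus a flag saying
 * whether the collection call completed with a good status. */
struct sample {
    unsigned long long value; /* cumulative counter reading */
    int valid;                /* 1 = good status, 0 = treat as missing */
};

/* Sum the counter increase across consecutive valid samples only.
 * A missing sample breaks the chain, so no delta is computed into or
 * out of it. The number of usable intervals is returned via *intervals. */
unsigned long long sum_valid_deltas(const struct sample *s, size_t n,
                                    size_t *intervals)
{
    unsigned long long total = 0;
    size_t used = 0;

    for (size_t i = 1; i < n; i++) {
        if (s[i - 1].valid && s[i].valid && s[i].value >= s[i - 1].value) {
            total += s[i].value - s[i - 1].value;
            used++;
        }
    }
    if (intervals != NULL)
        *intervals = used;
    return total;
}
```

Dividing the returned total by the number of usable intervals gives an average rate that is not dragged down by artificial zeros.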
08-08-2008 12:37 PM
Re: $GETRMI returning SS$_SUSPENDED
The error happens every few minutes, so repeating it is not a problem. Unfortunately, while I could iteratively remove itemcodes until I find the culprit, it would be very impractical because I'd have to release the software each time (it's running in a production environment and is not having this problem in the development environment - of course), and each release takes months.
I can list the itemcodes. Here they are:
RMI$_CPUIDLE, RMI$_CPUINTSTK, RMI$_CPUMPSYNCH, RMI$_CPUKERNEL, RMI$_CPUEXEC, RMI$_CPUSUPER, RMI$_CPUUSER, RMI$_DIRIO, RMI$_BUFIO
08-08-2008 05:35 PM
Solution
What is different between the development environment and production environment? Lack of load? Single processor vs. SMP? Different versions of VMS? Different architectures?
What version of VMS is running on your production server, and what type of processor is it?
Here's the description from the SSREF manual (July 2006, OpenVMS I64 Version 8.3, OpenVMS Alpha Version 8.3):
--------------------------------------------------------------------------------
$GETRMI
Returns system performance information about the local system. $GETRMI is an asynchronous system service and requires the $SYNCH service or another wait-state synchronous mechanism to guarantee that the required information is available. There is no synchronous wait form for this system service.
For additional information about system service completion, see the Synchronize ($SYNCH) service.
--------------------------------------------------------------------------------
Format
SYS$GETRMI [efn] [,nullarg] [,nullarg] ,itmlst [,iosb] [,astadr] [,astprm]
--------------------------------------------------------------------------------
So, if you are using only the documented functionality, it isn't clear to me what process it could be waiting on (re: John Gillings' comment). It isn't like $GETJPI, where process-specific information is being retrieved, and the documentation suggests it can't return information from another node in a cluster.
The data being requested is coming from cells in S0, for the itemcodes you list.
Are parameters 2 and 3 specifying 0 by value, or are you treating them like a $GETSYI call?
Does your $GETRMI call look similar to this (lifted from http://www.eight-cubed.com/examples/framework.php?file=sys_getrmi.c)?
    r0_status = sys$getrmi (efn,
                            0,
                            0,
                            itemlist,
                            &iosb,
                            0,
                            0);
Are you using an event flag or ast completion for notification that the data is ready? If you aren't synchronizing, that could explain why it appears to work on a lightly loaded system, but sometimes fails on your production system.
$GETRMI is much more likely to have bugs than $GETSYI, since it is relatively new (7.3-2?) and is probably used much less frequently than $GETSYI. So it is possible that there is a bug or undocumented feature. But there is also the possibility that your code has a bug, and since we can't see how you are calling the service, and what synchronization you are using, a bug there can't be ruled out.
Can you try running the program on your test system with a low priority, and generate some load with something like sys$test:uetp.com and possibly some compute-intensive processes?
Good luck,
Jon
08-09-2008 08:28 AM
Re: $GETRMI returning SS$_SUSPENDED
>> This error code is not documented as a possible return value of $GETRMI.
Correct, and it looks like it is not a system service return code.
>> $GETRMI returns good status; the SS$_SUSPENDED is coming in iosb.L0.
Is the code waiting to look into the iosb until the request is done, typically after a $SYNCH call?
What is the scope of the iosb variable?
The only system service documented to return SS$_SUSPENDED is SYS$GETJPI. Is the program also using that service?
Is the program using the same iosb for both calls?
Good luck,
Hein.
08-10-2008 01:55 PM
Re: $GETRMI returning SS$_SUSPENDED
> $GETRMI returns good status; the SS$_SUSPENDED is coming in iosb.L0.
This is normal for an asynch service. The good return status means your call was well formed. The iosb status is the result of the request. I'm assuming you're properly synchronizing with $SYNCH, or equivalent?
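The two-level check described above can be sketched in portable C. This is a model rather than real OpenVMS code: `model_iosb` is a simplified stand-in for the IOSB, and the function names are hypothetical. It relies only on the general condition-value convention that success has the low bit set and that the low three bits carry the severity.

```c
#include <stdint.h>

/* Simplified stand-in for an I/O status block. Only the status
 * longword matters here; the real layout is service-specific. */
struct model_iosb {
    uint32_t status;
    uint32_t reserved;
};

/* Condition values signal success when the low bit is set. */
int status_ok(uint32_t status)
{
    return (status & 1u) != 0;
}

/* The low three bits carry the severity:
 * 0 warning, 1 success, 2 error, 3 informational, 4 severe. */
unsigned status_severity(uint32_t status)
{
    return status & 7u;
}

/* A request only succeeded if the call itself was accepted AND the
 * completion status later written to the IOSB is also a success value.
 * The IOSB must not be examined before the request has completed. */
int request_succeeded(uint32_t call_status, const struct model_iosb *iosb)
{
    return status_ok(call_status) && status_ok(iosb->status);
}
```

In this model a call status of 1 with an even value in the IOSB reports failure, which matches the situation in the thread: the service accepted the request, but the request itself did not complete normally.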
Of your item codes, my suspects are DIRIO and BUFIO. All the CPU stuff is readily available from system data cells, but, depending on the definition, I/O counts might require gathering data from all(?) processes on the system. So even one process in an MWAIT state might give you SS$_SUSPENDED.
Looking at the time series of the data you're gathering, do you see any pattern in the returned data depending on the status? Try outputting your samples in T4 format, and add a column with 0/1 depending on whether the sample returned SS$_SUSPENDED. Look at it under TLVIZ. As a first cut, just do a "CORRELATE" against the status column.
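If TLVIZ is not at hand, a cruder first look at the same question is to compare the average of a metric for samples with and without the error flag. The sketch below swaps in that group comparison in place of a true correlation; the `row` layout and `mean_for_flag` are hypothetical and do not represent the T4 file format.

```c
#include <stddef.h>

/* Hypothetical row of a sampled time series: one metric value plus a
 * 0/1 column recording whether that sample came back with the error. */
struct row {
    double metric;
    int suspended; /* 1 if the sample's IOSB held the error status */
};

/* Mean of the metric over rows whose flag equals `flag`.
 * Returns 0.0 and sets *count to 0 when no row matches. */
double mean_for_flag(const struct row *rows, size_t n, int flag,
                     size_t *count)
{
    double sum = 0.0;
    size_t k = 0;

    for (size_t i = 0; i < n; i++) {
        if (rows[i].suspended == flag) {
            sum += rows[i].metric;
            k++;
        }
    }
    if (count != NULL)
        *count = k;
    return k ? sum / (double)k : 0.0;
}
```

A large gap between the two means for a given metric would suggest that metric is worth plotting against the status column in more detail.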
>it would be very impractical because I'd
>have to release the software each time
>(it's running in a production environment
>and is not having this problem in the
>development environment - of course), and
>each release takes months.
So you need to write a baby program that just exercises this issue. I'd break the item list into two. One with all the CPU stuff, and one with the IO. Run it on your production system, in parallel with your production code. Yes, you'll get arguments, but do they want to answer this question or not? I'd also add items checking counters of process states. See if there are any MWAIT states reported, and if they correlate with the SS$_SUSPENDED.
08-10-2008 11:04 PM
Re: $GETRMI returning SS$_SUSPENDED
I think Hein's conjecture about the IOSB being shared with a $GETJPI call is much more likely.
The following is a great checklist when programs fail intermittently. It specifically addresses synchronization bugs that SMP systems tend to bring out of hiding.
http://h71000.www7.hp.com/wizard/wiz_1661.html
Since you have specified you are using an IOSB, and assuming you are using $SYNCH, check three things. First, the IOSB must be in memory that remains valid for the whole interval between the initial $GETRMI and the $SYNCH (static storage is by far the easiest way to ensure that). Second, the $GETRMI and the $SYNCH must use the same IOSB. Third, nothing else may use the memory occupied by the IOSB: don't share this static storage with other concurrent asynch operations. For example, an asynch $GETJPI using the same IOSB as a concurrent $GETRMI could produce results like the ones you see.
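The sharing hazard can be shown with a small portable model. None of this is OpenVMS code: `sim_iosb`, `sim_request`, and the `MODEL_*` values are hypothetical, and `MODEL_SUSPENDED` is an arbitrary failure value, not the real SS$_SUSPENDED code. The model assumes only that an asynchronous request clears its IOSB when started and writes its final status there on completion.

```c
#include <stdint.h>

/* Simplified stand-in for an I/O status block (not the real layout). */
struct sim_iosb {
    uint32_t status;
    uint32_t reserved;
};

/* Placeholder status values for the model. MODEL_NORMAL mirrors the
 * usual success value of 1; MODEL_SUSPENDED is an arbitrary even
 * (failure) number chosen for illustration only. */
enum { MODEL_PENDING = 0, MODEL_NORMAL = 1, MODEL_SUSPENDED = 2 };

/* A pending asynchronous request remembers where to write its result. */
struct sim_request {
    struct sim_iosb *iosb;
    uint32_t final_status;
};

/* Starting a request clears the IOSB it was given. */
void sim_start(struct sim_request *req, struct sim_iosb *iosb,
               uint32_t final_status)
{
    req->iosb = iosb;
    req->final_status = final_status;
    iosb->status = MODEL_PENDING;
}

/* Completion writes the final status into whichever IOSB was supplied,
 * regardless of who else is pointing at the same memory. */
void sim_complete(const struct sim_request *req)
{
    req->iosb->status = req->final_status;
}
```

If two requests are started against the same `sim_iosb` and the second completes with `MODEL_SUSPENDED`, the caller of the first sees that failure status even though its own request finished normally. Giving each request its own block removes the effect.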
The following are some threads that describe problems that can arise with incorrect IOSB usage.
sys$qiow(efn$c_enf,...,iosb,...) - must iosb be specified?
http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1163915
ASTs corrupting stack frames in DECC 6.5 /optimize
http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=942947
Good luck,
Jon
08-11-2008 01:38 AM
Re: $GETRMI returning SS$_SUSPENDED
The sys_getrmi_dir_buf is a slightly modified version of sys_getrmi.c from James Duff's examples. See attached .zip file that contains everything you need to recreate the source code.
Here's an example run on a 4-processor ES40 running VMS 7.3-2:
OT$ analyze/system
OpenVMS (TM) system analyzer
SDA> read sys$loadable_images:sysdef
%SDA-I-READSYM, 10724 symbols read from SYS$COMMON:[SYS$LDR]SYSDEF.STB;1
SDA> eval @pms$gl_dirio
Hex = 00000000.13C1D0BB Decimal = 331468987
SDA> eval @pms$gl_bufio
Hex = 00000000.07F87943 Decimal = 133724483
SDA> spawn run sys_getrmi_dir_buf
DIRIO: 331470726
BUFIO: 133725603
SDA> eval @pms$gl_dirio
Hex = 00000000.13C1D979 Decimal = 331471225
SDA> eval @pms$gl_bufio
Hex = 00000000.07F87EBC Decimal = 133725884
SDA>
Jon
08-11-2008 03:28 AM
Re: $GETRMI returning SS$_SUSPENDED
I agree with the others here, insofar as something else could be stomping on your IOSB. (What is SS$_SUSPENDED in ASCII, perhaps?)
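A quick way to test the "is it really text?" idea is to dump the suspect IOSB longword as characters. The helper below is a hypothetical, portable sketch. It makes no claim about the numeric value of SS$_SUSPENDED and simply shows what any 32-bit value looks like when read as four bytes, lowest byte first.

```c
#include <stdint.h>

/* Render a 32-bit longword as four characters, lowest byte first
 * (little-endian order). Non-printable bytes are shown as '.'.
 * `out` must have room for five chars including the terminator. */
void longword_as_ascii(uint32_t value, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char b = (unsigned char)((value >> (8 * i)) & 0xFFu);
        out[i] = (b >= 0x20 && b <= 0x7E) ? (char)b : '.';
    }
    out[4] = '\0';
}
```

If the dumped longword reads as plausible text, a stray string copy over the IOSB becomes a likely suspect. If it is mostly dots, the value is more likely a genuine status written by some service.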
One other option may be a TCP/IP $QIO, which can perfectly well return SS$_SUSPENDED. (A quick glance says I have some Multinet-specific code, though I think that has more to do with spurious SS$_SHUT than with SS$_SUSPENDED.)
Can't see why a $getrmi for the local system would ever use a TCP/IP call, but who knows?
Cheers Richard Maher