Operating System - OpenVMS

I need to explain why PIDs > 2170(hex) are occurring

 
Gavin Medbery
Occasional Visitor

My mom has to request a change to some stupid code which checks whether a PID is valid by testing if the number is greater than or equal to 2710[hex] (which is 10,000[dec]). The problem is that she can't get it signed off until she can explain why this check has been failing a lot more over the past two weeks. Now I normally wouldn't care at all, but yesterday she gave me the report to do, oh...joy.
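
For anyone checking the constants: the thread title's 2170 hex is not 10,000 decimal. A quick sketch in Python (used here only as a calculator, it has nothing to do with the code under review) shows the two values side by side:

```python
# Quick arithmetic check of the constants discussed in this thread.
THRESHOLD_DEC = 10_000

print(hex(THRESHOLD_DEC))  # 10,000 decimal written in hex
print(int("2170", 16))     # the digits from the thread title, read as hex
print(int("2710", 16))     # the same digits with the middle pair swapped
```

So 10,000 decimal is 2710 hex, and 2170 hex is 8,560 decimal. Either way the threshold is a small number compared with a typical eight-digit hex PID.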
13 REPLIES
Steven Schweda
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

Did you ask your mom which operating system
we're dealing with here? Care to reveal it?
Gavin Medbery
Occasional Visitor

Re: I need to explain why PIDs > 2170(hex) are occurring

OpenVMS
Robert Gezelter
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

Gavin,

OpenVMS Process IDs vary for any number of reasons. The only guarantee reported in the Internals and Data Structures manual is that the "high bit will never be set" [IDSM, Alpha 7.0, p. 225].

To be precise, SHOW SYSTEM displays the Extended PID [ibid]. The EPID is described in detail in the IDSM [ibid, pp 282, et seq.] and includes:

- the node sequence number (in clusters, 2 bits)
- the node index (clusters, 8 bits)
- the process sequence number (7-16 bits) and
- the process index (14-5 bits)

The boundary between the last two is sliding.
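
As a rough illustration of the field list above, here is a minimal Python sketch that splits an EPID into its parts. It assumes the fields are packed from high bits to low bits in the order listed (unused high bit, node sequence, node index, then the 21 process-specific bits). The function name `decode_epid` and the `index_bits` parameter are made up for this example; on a real system the width of the process index depends on the system's configuration, so the value 8 used below is only a guess.

```python
from typing import NamedTuple


class EpidFields(NamedTuple):
    node_sequence: int     # 2 bits
    node_index: int        # 8 bits
    process_sequence: int  # 21 - index_bits bits (7 to 16)
    process_index: int     # index_bits bits (14 down to 5)


def decode_epid(epid: int, index_bits: int) -> EpidFields:
    """Split an EPID, assuming fields are packed high-to-low in the order listed."""
    if not 0 <= epid < 2**31:
        raise ValueError("EPID must fit in 31 bits (the high bit is never set)")
    if not 5 <= index_bits <= 14:
        raise ValueError("process index width must be 5 to 14 bits")
    low21 = epid & 0x1FFFFF  # process sequence + process index
    return EpidFields(
        node_sequence=(epid >> 29) & 0x3,
        node_index=(epid >> 21) & 0xFF,
        process_sequence=low21 >> index_bits,
        process_index=low21 & ((1 << index_bits) - 1),
    )


# index_bits=8 is a hypothetical split, chosen only for illustration.
print(decode_epid(0x2020048A, index_bits=8))
```

Because the boundary slides, the same EPID decodes to a different sequence/index pair for a different `index_bits`, which is one more reason not to compare the raw value against a fixed threshold.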

Presuming that there is any pattern to process ID assignments is generally a poor strategic choice. Even if a particular process appears to get the same process ID, there are no guarantees. Additionally, if a process is restarted for some reason (e.g., Queue Manager), the process number will change.

- Bob Gezelter, http://www.rlgsc.com
Steven Schweda
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

> OpenVMS

That's a start. Version? Hardware?

Cluster? I assume not. Around here:

alp $ show process

1-MAR-2009 18:01:20.01 User: SMS Process ID: 2020048A
[...]


Why would you expect a limit of 10000 on a
process ID?
Hein van den Heuvel
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

PIDs do not wrap at 9999.
Badly broken software might like them to, but it has no business expecting that.
Maybe it wraps at hex FFFF, but I don't really know (any more) and I don't care because no proper software should care.

A sign-off to stop checking PID ranges is well worth it.

Now, BATCH job entry numbers (used to) recycle.
Maybe that's the problem?

fwiw,
Hein.

David Jones_21
Trusted Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

You can make the code a little less stupid by having it use $GETJPI on the PID to get the JPI$_PROC_INDEX and having it use that instead of the PID. PROC_INDEX is always in the range 1 to MAXPROCESSCNT (sysgen value).
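
A sketch of what that range check amounts to, in Python for illustration only. On a real system the index would come from $GETJPI with JPI$_PROC_INDEX; here it is passed in as a plain integer, and both the helper name `proc_index_is_plausible` and the MAXPROCESSCNT value of 400 are invented for the example.

```python
def proc_index_is_plausible(proc_index: int, maxprocesscnt: int) -> bool:
    """True if proc_index lies in 1..maxprocesscnt, the range stated above."""
    return 1 <= proc_index <= maxprocesscnt


MAXPROCESSCNT = 400  # hypothetical SYSGEN value

print(proc_index_is_plausible(0x8A, MAXPROCESSCNT))        # a process index
print(proc_index_is_plausible(0x2020048A, MAXPROCESSCNT))  # a full PID by mistake
```

The second call shows why the comparison has to be made on the process index and not on the PID itself.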
I'm looking for marbles all day long.
Jan van den Ende
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

Gavin,

check Bob G's answer, 1st dash:
in a cluster it is even GUARANTEED to be (much) larger. PIDs are 8 char hex numbers, and in a cluster the first digit is always >= 2. The first 3 digits are the same for any process on a specific node in the cluster, and as long as the cluster is not completely dissolved, any new boot raises that 3-digit number by 2.

( I never found out what happens if THAT overflows, never got over 7xx )
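
One way to read the "raises by 2" observation, sketched in Python: if the 8-bit node index sits in bits 28..21 of the PID (an assumption taken from the field widths listed earlier in this thread, not something verified here), then adding 1 to the node index adds 2 to the top three hex digits. The helper names are hypothetical.

```python
def node_prefix(epid: int) -> int:
    """Top three hex digits of an 8-digit PID (bits 31..20)."""
    return (epid >> 20) & 0xFFF


def with_node_index(epid: int, node_index: int) -> int:
    """Return epid with the assumed node index field (bits 28..21) replaced."""
    return (epid & ~(0xFF << 21)) | ((node_index & 0xFF) << 21)


pid = 0x2020048A                        # example PID posted earlier in the thread
after_reboot = with_node_index(pid, 2)  # node index 1 -> 2

print(f"{node_prefix(pid):03X} -> {node_prefix(after_reboot):03X}")
print(pid >= 10_000)
```

Any PID whose first digit is 2 or more is far above 10,000 decimal, so a threshold test like the one in the original question gives the same answer for every such PID.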

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Robert Gezelter
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

Gavin,

As David Jones has observed in his post, it would be slightly better to check the PROC_INDEX, rather than the PID.

However, since the original post does not specify the context, I recommend extreme caution.

The OP did not mention if this is a real-time check on a running system, or an ex post facto check against some log or accounting file.

If it is a check on (emphasis "ON") the current system, then some additional components can be checked (e.g., the Process Index against MAXPROCESSCNT). However, if this is not a check on the current system against a PID from the current running system, such a presumption would not be valid.

Depending on the internal structure of the PID is not, insofar as I know, a supported feature and should not be relied upon in any form. While it has been so for many years, there is no guarantee that it will not change in some way in the future. The only guarantee is that no two processes will have the same Process ID at the same time (in the same cluster).

- Bob Gezelter, http://www.rlgsc.com
Jon Pinkley
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

Gavin,

What type of software does your mom work on?

Specifically are the PIDs you are referring to Process IDentification numbers, or something else? You really need to look at the code and determine why the programmer thought that PID values under 10,000 were valid.

Re: "she can't get it signed off until she can explain why this check is failing a lot more over the past two weeks."

In general, if something starts behaving differently, it is because something changed.

You or your mom are in a better position to determine what has changed than anyone here.
it depends
Hein van den Heuvel
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring


Now if this has been 'working' for a long time then the easy workaround, as much as this pains me to write, will be to just reboot the box and move on.

A 'therapeutic' reboot every 3 months is typically 'a tad' cheaper than any software remediation.

Cheers,
Hein.

Hoff
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

Whoa, blast from the past. This sort of PID dependency stuff was last seen back around VAX/VMS V4.0, when older code slammed into the PID-level changes that were implemented back then.

The PID is an opaque longword value. Any other assumptions should be avoided.

The Job Entry number (from the queue manager) is (also) an opaque longword. Other assumptions should be avoided. But I digress.

Fix the code.

Or reboot the box and hope you don't get a run-away process restart or a dictionary attack or other such; something that'll push the observed PID values upwards more quickly.

Discussions going back twenty-some years will introduce the EPID and the IPID as differentiated from the pre-cluster PID values; these sorts of PID assumptions were once somewhat common (albeit even then-questionable) but have largely fallen by the wayside since.

Nice to see some broken code still exists.

Jon Pinkley
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

If the PIDs Gavin is referring to are in fact process ids, then my guess is that Gavin is mistaken in his assumption that the 10000 is a decimal number. For IPIDs, there isn't anything significant about 10,000 decimal, but 0x10000 is significant, as it is the least significant bit of the high word of the IPID. Re-Reading the description, it may be the check is if PID .ge. 10000 then valid. My guess is that the 10000 is really 0x10000. IPIDs initially start with the sequence set to 1 (increments the previous sequence number, which is initially zero), so the upper word will be non-zero "most of the time".

The IDSM (I happened to be looking at the VAX/VMS 4.4 manual) says that the sequence number cycles back to zero after 32767 to avoid it becoming negative. So it may be possible that the high word of the IPID can have a value of 0 after the process slot has been used 32768 times. (I didn't look at the code to see how the wrap is done; I am assuming it checks to see if the value is negative after the increment, and if so sets the sequence number field to zero).
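
To make that guess concrete, here is a small Python sketch of the behaviour as described: sequence number in the high word of the IPID, process index in the low word, and a wrap to zero after 32767. The wrap rule is the assumption stated above, not something read from the VMS sources, and the function names are invented for the example.

```python
SEQ_MAX = 32767  # largest sequence value before the wrap described above


def next_sequence(seq: int) -> int:
    """Increment a slot's sequence number, wrapping to zero after SEQ_MAX."""
    return 0 if seq >= SEQ_MAX else seq + 1


def make_ipid(seq: int, index: int) -> int:
    """Pack the sequence into the high word and the process index into the low word."""
    return ((seq & 0xFFFF) << 16) | (index & 0xFFFF)


first_use = make_ipid(next_sequence(0), 0x8A)      # slot used for the first time
wrapped = make_ipid(next_sequence(SEQ_MAX), 0x8A)  # slot reused after the wrap

print(hex(first_use), first_use >= 0x10000)
print(hex(wrapped), wrapped >= 0x10000)
```

Under these assumptions a test for IPID >= 0x10000 passes for every reuse of a slot except the one where the sequence has wrapped back to zero.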

I am also not sure if Gavin was really interested in the answer, or if he was just venting. At least that is the impression he gives in his last sentence of his opening remark.
it depends
Hoff
Honored Contributor

Re: I need to explain why PIDs > 2170(hex) are occurring

PIDs in antiquity were fairly sane; as clusters were brought on-line, a skewed field structure came on-line with a cluster value present in the upper bits. The bit-width partitioning between the lower field of the PID and the upper portion varies.