Operating System - OpenVMS

SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

 
SOLVED
Ruslan R. Laishev
Super Advisor


Hi there!

Has anyone tried the cool new VMS V8.3 feature, "Global Buffering"?


I tried ... got crashes. :-(
Duncan Morris
Honored Contributor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Ruslan,

what sort of "crash" did you experience? Process or system?

It works for me (see the attachment).

Duncan
Ruslan R. Laishev
Super Advisor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

A system crash...

Your files are about 140 blocks; mine range from 700 MB to 3+ GB.
Duncan Morris
Honored Contributor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Ruslan,

what was the crash?

If you have the CLUE$*.LIS from sys$errorlog, then please post it as an attachment - I am sure that Volker/Hein would be keen to see it.

Ruslan R. Laishev
Super Advisor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Attached.
Hein van den Heuvel
Honored Contributor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

I'm working with an RMS Engineer on a global buffers related crash.
That case is on I64 8.3-h1 and _might_ be specific to RELATIVE files.

Ruslan, I cannot download your attachment (Duncan's TXT file works fine). Can you email it to me, or stick it on EISNER and let us know where?

I'm sure that attachment would give the right details, but in the meantime, what about platform and exact version / patches?

What exact command was used, and on what file (size, type, bucket size)? How many global buffers were picked, and how big did the section become? Fewer than 32K buffers?

When does it crash? On the SET FILE? I doubt it.
On first usage? Under heavy concurrent usage? Read-only or active updates?
With a specific program, or will, for example, a CONVERT/SHARE/FDL=NL: file NL: crash it?

Maybe the system ran out of LOCK resources (pool) at a 'bad time'?

Finally, if this is for a production environment: the old interface allows up to 32K global buffers. Typically that is plenty. Do you have 'evidence' it is not?

hth,
Hein.
Hein van den Heuvel
Honored Contributor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Hein>> What exact command is used ..

The exact command was probably: "SET FILE/GLOB=DEFAULT".

That is only meaningful to others when we know how the SYSGEN params GB_CACHEALLMAX and GB_DEFPERCENT were set, on top of the file characteristics.

Hein.
Volker Halle
Honored Contributor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Ruslan,

consider just renaming the CLUE*.LIS file to .TXT and attaching it directly, without ZIPping it first...

Volker.
Ruslan R. Laishev
Super Advisor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Hello, Hein!

Thanks. I just sent a mail to your gmail box.
Ruslan R. Laishev
Super Advisor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Gentlemen,
I got my .zip back w/o problems.


Sysgen:
Parameter Name      Current   Default    Min.       Max.  Unit        Dynamic
GB_CACHEALLMAX        50000     50000     100         -1  Blocks      D
GB_DEFPERCENT            35        35       0       1000  Percent     D
Ruslan R. Laishev
Super Advisor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

SYSGEN> SHO NPA
Parameter Name      Current   Default    Min.       Max.  Unit        Dynamic
--------------      -------   -------    ----       ----  ----        -------
NPAGEDYN          200712192   4194304  163840 1879048192  Bytes
NPAGEVIR          687226880  16777216  163840 1879048192  Bytes
NPAG_BAP_MIN          40960         0       0         -1  Bytes
NPAG_BAP_MAX         131072         0       0         -1  Bytes
NPAG_BAP_MIN_PA           0         0       0         -1  Mbytes
NPAG_BAP_MAX_PA  2147483647        -1       0         -1  Mbytes
NPAG_RING_SIZE         2048      2048       0         -1  Entries
NPAGECALC                 0         1       0          2  Coded-valu
NPAGERAD                  0         0       0         -1  Bytes
NPAG_INTERVAL            30        30       0         -1  Seconds     D
NPAG_GENTLE             100       100       1        100  Percent     D
NPAG_AGGRESSIVE         100       100       1        100  Percent     D
SYSGEN>
Hein van den Heuvel
Honored Contributor
Solution

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Ok, I saw the CLUE output,

- CPUSPINWAIT, CPU spinwait timer expired
- Current process Oracle
- Cause: timeout acquiring spinlock
- Spinlock name: POOL
- Non-Paged Pool:
- Unsuccessful Expansions 5456

Oracle is NOT likely to actually be touching files with RMS global buffers other than SYSUAF/RIGHTSLIST.

So Oracle was LIKELY a victim.
Another process, possibly using global buffers or creating some for the first time, may have been the cause.

I would first and foremost raise this as a formal support call to HP.

I would verify SMP_SPINWAIT and SMP_LNGSPINWAIT setting and possibly bump those as potential workaround.

I would aggressively increase the POOL pre-allocation.

Seeing that GB_CACHEALLMAX is 'only' 50,000, I would switch back to the tedious, manual, per-file non-default GBC setting with something like this (not an actual command):
SET FILE/GLO= MIN( 32000, (35 * ALQ) / (BLS * 100) )

See if that also suffers.
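
The per-file count from the pseudo-command above can be worked out ahead of time. This is only a rough sketch: it assumes ALQ is the file allocation in 512-byte blocks, that BLS in the post stands for the bucket size in blocks, and that integer truncation is acceptable. The function name and the bucket size of 16 are hypothetical, chosen for illustration only.

```python
def manual_global_buffer_count(alq_blocks, bucket_size_blocks,
                               percent=35, cap=32000):
    """Buffers needed to cache `percent` of the file, capped at `cap`.

    Assumption: one global buffer holds one bucket, so the file share
    in blocks is divided by the bucket size in blocks.
    """
    if alq_blocks < 0 or bucket_size_blocks <= 0:
        raise ValueError("sizes must be positive")
    buffers = (percent * alq_blocks) // (bucket_size_blocks * 100)
    return min(cap, buffers)


if __name__ == "__main__":
    BLOCKS_PER_MB = 2048      # 512-byte blocks per megabyte
    BUCKET_SIZE = 16          # hypothetical bucket size, not from the thread
    for size_mb in (700, 3072):
        alq = size_mb * BLOCKS_PER_MB
        count = manual_global_buffer_count(alq, BUCKET_SIZE)
        print(f"{size_mb} MB ({alq} blocks): SET FILE/GLOBAL_BUFFERS={count}")
```

With these assumed numbers the 700 MB file stays just under the 32000 cap, while the 3 GB file is clipped to it.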

IFF you can tolerate a potential crash, I would try the above BEFORE changing anything else. But I would only want you to try that to un-couple this problem from the SET FILE/GLO=DEF. I do not expect that the problem is caused by the RMS options that this command triggers, nor do I expect that the 50,000 triggered it. I suspect that the 'old' 32,000 will also cause this, and am curious to know.

How aggressively had you set RMS global buffers before this? None? 5,000? 32,000?

fyi... below the details on how SET FILE/GLO=DEF is used.

Hein.



XAB$M_GBC_DEFAULT --- Requests RMS at run time to recalculate the global cache size based on an algorithm that makes use of two global buffer (GB) SYSGEN parameters: GB_CACHEALLMAX and GB_DEFPERCENT. If the default option is enabled, and if the size (in blocks) of the file is less than or equal to the specified size for the GB_CACHEALLMAX parameter, RMS allocates sufficient global buffers to cache the whole file. If the size (in blocks) is greater than the specified size for the GB_CACHEALLMAX parameter, RMS allocates sufficient global buffers to cache the percentage of the file specified by the GB_DEFPERCENT (global buffer default percent) parameter.
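
The rule described above can be sketched in a few lines. This is not the RMS implementation, only a reading of the paragraph: rounding, per-buffer overhead and any upper limits are ignored, and the function names are made up for illustration. The defaults are the SYSGEN values posted earlier in this thread.

```python
BLOCK_BYTES = 512  # size of one disk block in bytes


def default_cached_blocks(file_blocks, gb_cacheallmax=50000, gb_defpercent=35):
    """Blocks of the file the default option would try to cache.

    Small files (<= GB_CACHEALLMAX blocks) are cached whole; larger
    files get GB_DEFPERCENT percent of their size.
    """
    if file_blocks <= gb_cacheallmax:
        return file_blocks
    return file_blocks * gb_defpercent // 100


def approx_cache_bytes(file_blocks, **params):
    """Rough cache size in bytes, ignoring all bookkeeping overhead."""
    return default_cached_blocks(file_blocks, **params) * BLOCK_BYTES


if __name__ == "__main__":
    for label, blocks in (("140-block file", 140),
                          ("700 MB file", 700 * 2048)):
        cached = default_cached_blocks(blocks)
        print(f"{label}: cache {cached} blocks, "
              f"about {approx_cache_bytes(blocks)} bytes")
```

Under these assumptions a 140-block file is cached whole, while a 700 MB file asks for roughly 500,000 blocks of cache, which is a very different load from the small test case.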
Volker Halle
Honored Contributor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Ruslan,

which CPU is holding the spinlock and what's running there? SDA> CLUE CRASH should output this information in the spinlock section.

Note that nonpaged pool allocation and deallocation on a severely fragmented variable nonpaged pool list can cause this type of crash. Depending on CPU speed, this may happen if the number of packets on the variable list exceeds 20000...

SDA> SHO MEM/POOL/FULL
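
A toy model may help show why a long variable list hurts. It assumes the free list is walked linearly, first fit, while the POOL spinlock is held; it is not the actual OpenVMS allocator, and the packet sizes are invented for illustration.

```python
def first_fit_steps(free_sizes, request):
    """Count list entries examined before a first-fit match.

    Returns the full list length when no packet is large enough,
    since every entry had to be looked at.
    """
    for steps, size in enumerate(free_sizes, start=1):
        if size >= request:
            return steps
    return len(free_sizes)


if __name__ == "__main__":
    # Healthy pool: one large packet at the front of the list.
    healthy = [8192]
    # Fragmented pool: 20000 small leftovers ahead of the only large packet.
    fragmented = [64] * 20000 + [8192]

    for name, pool in (("healthy", healthy), ("fragmented", fragmented)):
        steps = first_fit_steps(pool, 4096)
        print(f"{name}: {steps} entries walked for a 4096-byte request")
```

Every other CPU that wants the spinlock has to wait for the whole walk, so the cost grows with the number of packets on the list.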

Volker.
Volker Halle
Honored Contributor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Ruslan,

in the CPUSPINWAIT crash, the CPU owning the POOL spinlock (CPU 05) was executing at EXE$DEALLOCATE_C+00014, i.e. trying to allocate a packet of nonpaged pool. I bet the variable list was huge, so it just took 'too long' to find a suitable packet.

Diagnosing the CLUEXIT crash would need more configuration information.

Volker.
Volker Halle
Honored Contributor

Re: SET FILE/GLOB=DEFAULT -> several BugCheck/Crash

Ruslan,

the CLUEXIT crash also shows massive nonpaged pool problems on the local node. Either increase nonpaged pool a lot, or consider whether there may be a nonpaged pool leak on these systems. If the other node (xxx1), which sent the DISCONNECT message, also has similar pool problems and is a non-SMP system, this could also explain such a CLUEXIT crash. There could also have been CI problems due to the nonpaged pool shortage.

Volker.