Operating System - HP-UX

Best Practice for Large Memory Systems - Crash Dumps

 
Alzhy
Honored Contributor

Our partitions (nPar and vPar) have grown significantly in memory -- averaging 128GB each. We had one partition crash this morning and it took a full dump... the dump took over 3 hours to finish!

Question: Is there really a need for a full dump to analyse the most common HP-UX problems?

Our swap/dump config is as follows:

vg00 - small 4GB swap
vg01 - large swap 1 - 36GB
vg02 - large swap 2 - 74GB

/dev/vg02/swap / dump defaults 0 0
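
(For reference, you can check which of those devices the kernel will actually use for a dump, and how many pages it plans to write, by querying the current dump configuration -- a read-only check, no changes made:)

# Show the current dump class and dump device configuration
/sbin/crashconf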

Are there any "best practices" for dump configuration that HP suggests for effectively capturing crash info?

Thanks.

Hakuna Matata.
6 REPLIES
Ivan Ferreira
Honored Contributor

Re: Best Practice for Large Memory Systems - Crash Dumps

Hi Nelson, this document may help you:

docs.hp.com/en/5991-2881/5991-2881.pdf
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Bill Hassell
Honored Contributor

Re: Best Practice for Large Memory Systems - Crash Dumps

For the majority of crash dumps, a 4 to 8 GB dump should be enough to identify the problem. That said, a really obscure problem related to the buffer cache, shared memory, or some process or daemon may require as much as 20%-25% of RAM. Since the OS has halted, the dump process is exceedingly slow due to the very minimal code available to perform a dump.

You may want to create a separate dump lvol pointed to a large cache disk LUN and turn off the savecrash steps in /etc/rc.config.d so the reboot will occur at normal speed. Then transfer the crash dump manually at a later time.
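
A minimal sketch of that setup (the volume group, lvol name, size, and the SAVECRASH variable shown here are placeholders/assumptions -- check savecrash(1M) and your own /etc/rc.config.d/savecrash for the exact names):

# Dedicated dump lvol on a fast LUN (placeholder names and size)
lvcreate -L 65536 -n lvdump vgdump

# /etc/fstab entry declaring it as a dump device (same form as the vg02 entry above)
/dev/vgdump/lvdump / dump defaults 0 0

# /etc/rc.config.d/savecrash -- skip copying the dump during boot
SAVECRASH=0

# Later, with the system back up, copy the dump by hand
savecrash /var/adm/crash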


Bill Hassell, sysadmin
Don Morris_1
Honored Contributor

Re: Best Practice for Large Memory Systems - Crash Dumps

For a kernel dump, the usual requirement to get a good idea of what happened is to dump:

+ Kernel text/static data
+ Kernel dynamic data in use
+ User-space kernel thread stacks (UAREA)

Kernel dynamic memory which is free-and-cached (the Super Page Pool) is only needed when there's a problem in the SPP itself [which is pretty rare] or when memory corruption happens to hit the SPP [rare, unpredictable in general... and you're quite likely to hit higher-level caches anyway]. User data is very rarely needed (and most users don't want HP support reading their application's private data for security reasons [it could be classified, customer sensitive, etc.]).

This should be the default configuration for crashconf already:

# crashconf
Crash dump configuration has been changed since boot.

CLASS      PAGES     INCLUDED IN DUMP  DESCRIPTION
--------   --------  ----------------  -------------------------------------
UNUSED       847833  no,  by default   unused pages
USERPG      2456409  no,  by default   user process pages
BCACHE       356147  no,  by default   buffer cache pages
KCODE         11214  no,  by default   kernel code pages
USTACK         1537  yes, by default   user process stacks
FSDATA          132  yes, by default   file system metadata
KDDATA       495684  yes, by default   kernel dynamic data
KSDATA         7170  yes, by default   kernel static data
SUPERPG        9920  no,  by default   unused kernel super pages

Total pages on system: 4186046
Total pages included in dump: 504523

Dump compressed: ON

Dump Parallel: ON

DEVICE        OFFSET(kB)  SIZE (kB)  LOGICAL VOL.  NAME
------------  ----------  ---------  ------------  -------------------------
1:0x00000e       2612064   33554432  64:0x000002   /dev/vg00/lvol2
                           ---------
                            33554432

[That's an 11.31 system, you don't say what version you have -- I believe 11.23 is reasonably equivalent, 11.11 is not as powerful in the dump options].
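
As a rough sanity check, those page counts translate into an expected dump size (assuming 4 KB base pages as in the listing above; compression shrinks it further):

# 504523 included pages x 4 KB/page ~= 1.9 GB to write,
# versus roughly 16 GB if all 4186046 pages were dumped
echo $((504523 * 4 / 1024)) MB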

Re: Best Practice for Large Memory Systems - Crash Dumps

Nelson,

If your systems are 11iv1, then installing the compressed dump option can speed up dump time in some scenarios:

http://h20293.www2.hp.com/portal/swdepot/displayProductInfo.do?productNumber=CDUMP11i

IIRC this is a standard feature in 11iv2
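
A quick way to confirm whether compression is active on a given partition (assuming your release's crashconf reports it, as in Don's listing above):

# Look for the "Dump compressed" line in the current configuration
crashconf | grep -i compress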

HTH

Duncan

I am an HPE Employee
Geoff Wild
Honored Contributor

Re: Best Practice for Large Memory Systems - Crash Dumps

I don't know if you need a full crash dump or not - but you know Murphy's law :)

Another thing we do is create a /var/adm/crash file system (so /var doesn't fill up).

I usually set that to 16GB.
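
A minimal sketch of that, assuming vg00 has the space and a VxFS filesystem (the lvol name and size are placeholders):

# 16GB filesystem dedicated to saved crash dumps
lvcreate -L 16384 -n lvcrash vg00
newfs -F vxfs /dev/vg00/rlvcrash
mkdir -p /var/adm/crash

# /etc/fstab entry, then mount it
/dev/vg00/lvcrash /var/adm/crash vxfs delaylog 0 2
mount /var/adm/crash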

Rgds...Geoff
Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.
melvyn burnard
Honored Contributor

Re: Best Practice for Large Memory Systems - Crash Dumps

Have you taken a look at the Compressed Dump utility?

The only issue is that it needs 5 working processors to run, and a minimum of 2GB of RAM.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!