
WLM runaway

 
Tim Nelson
Honored Contributor

Has anyone experienced a Workload Manager (WLM) runaway scenario?

Scenario:
One RP7420 divided equally into two vPars, with WLM managing only the CPU allocation according to workload.

WLM executed a CPU reallocation to one server (A).
Upon review, here are a couple of notes:
1) The CPU utilization reported by glance, sar, top and MWA never showed an increase (it consistently showed about 40%).
2) WLM, as reported by wlminfo slo -l, currently shows over 600 share requests, hence the reallocation. WLM is still reporting 600+ share requests even though the system CPU utilization does not show the workload.

I have a call into HP but the current review is simply that I am out of date on a single patch out of many.

All attempts to recreate this scenario, where WLM will not release its assumed workload metric, have failed. Tests show the product works as advertised.

Looking to see if anyone has experienced the same.

Thanks in advance.

Tim
3 REPLIES
Tim Nelson
Honored Contributor

Re: WLM runaway

Solution:

The WLM usage goal uses %IDLE, which in most cases is 100 - %SYS - %USER - %WIO. In one specific test I had only a backup running, which used little if any CPU but reported a lot of %WIO, as it should. But why would I want WLM to allocate another CPU for an I/O load? I would not.
I configured WLM to use the glance metric GBL_CPU_TOTAL_UTIL instead. Now WLM only reallocates CPU when there is a CPU load, not an I/O load.
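
To make the arithmetic concrete, here is a rough Python sketch of the two ways of measuring "busy". It is only an illustration of the reasoning above, not WLM's actual code: the function names and the sample percentages are made up, and the 75 threshold is simply borrowed from the SLO in this post.

```python
def idle_based_busy(sys_pct, user_pct, wio_pct):
    """Busy figure derived from %IDLE, where %IDLE = 100 - %SYS - %USER - %WIO."""
    idle = 100.0 - sys_pct - user_pct - wio_pct
    return 100.0 - idle


def cpu_only_busy(sys_pct, user_pct):
    """Busy figure that counts only real CPU work (%SYS + %USER)."""
    return sys_pct + user_pct


def wants_more_cpu(busy_pct, threshold=75.0):
    """Hypothetical trigger: ask for more CPU once busy exceeds the threshold."""
    return busy_pct > threshold


# Hypothetical sample from a backup window: little CPU, lots of I/O wait.
sys_pct, user_pct, wio_pct = 5.0, 3.0, 70.0

idle_view = idle_based_busy(sys_pct, user_pct, wio_pct)
cpu_view = cpu_only_busy(sys_pct, user_pct)

print("idle-based busy:", idle_view, "-> more CPU?", wants_more_cpu(idle_view))
print("cpu-only busy:  ", cpu_view, "-> more CPU?", wants_more_cpu(cpu_view))
```

With these made-up numbers the idle-based figure crosses the threshold purely because of I/O wait, while the CPU-only figure stays far below it. That is the behaviour change I was after.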

SLO looks like this.
slo servera_slo {
    pri = 1; # Change this value
    mincpu = 2;
    maxcpu = 12800;
    #goal = usage _CPU;
    goal = metric avg_glance_cpu < 75;
}

tune avg_glance_cpu {
    coll_argv = wlmrcvdc glance_gbl GBL_CPU_TOTAL_UTIL ;
}


This is another terrible case of generic performance monitors using %WIO in CPU load calculations.

Hope this information is useful to others in the future.
Tim Nelson
Honored Contributor

Re: WLM runaway

One last piece of information:

After more searching I identified a patch (PHSS_34270, WLM Cumulative) that contains a specific fix for this issue:
1) WLM is not calculating usage correctly for strictly host-based configurations

Either solution mentioned in this thread will work.
Tim Nelson
Honored Contributor

Re: WLM runaway

can I give myself points :)

See previous details for solutions.