06-28-2006 04:37 AM
CPU STALLS
Application response time is a cause for concern.
Analysis of one process with HP Caliper shows a very high CPU stall rate: 71.45% of cycles are lost to stalls. The report is below.
Is there a way to reduce this?
Please advise.
Anurag
HP Caliper Total CPU Event Counts Report for PkMS
================================================================================
Target Application
  Program:              /u03/wmpso/CMLACARE/opt/app/wmprd/bin/Lttr
  Invocation:           /u03/wmpso/CMLACARE/opt/app/wmprd/bin/Lttr
  Process ID:           3456 (started by Caliper)
  Start time:           09:43:11 PM
  End time:             09:43:16 PM
  Termination Status:   0
  Last modified:        May 05, 2006 at 09:26 PM
  Memory model:         ILP32
  Main module text page size: default
Processor Information
  Machine name:         itanium1
  Number of processors: 4
  Processor type:       Itanium2 6M
  Processor speed:      1300 MHz
Target Execution Time
  Real time:            4.915 seconds
  User time:            1.757 seconds
  System time:          0.228 seconds
-----------------------------------------------
                       PLM
Event Name             U..K  TH       Count
-----------------------------------------------
BACK_END_BUBBLE.ALL    x___   0  1551927751
CPU_CYCLES             x___   0  2171937631
IA64_INST_RETIRED      x___   0  2094967607
NOPS_RETIRED           x___   0   403213401
-----------------------------------------------
PLM: Privilege Level Mask
U/K = user/kernel levels (U: level 3, K: level 0)
The intermediate levels (1, 2) are unused on HP-UX or Linux
x : the metric is measured at the given level (_ : not measured)
TH: event THreshold, determines the event counter behavior,
TH == 0 : counter += event_count_in_cycle
TH > 0 : counter += (event_count_in_cycle >= threshold ? 1 : 0)
-----------------------------------------------
% of Cycles lost due to stalls (lower is better):
100 * BACK_END_BUBBLE.ALL / CPU_CYCLES = 71.45
Effective CPI (lower is better):
CPU_CYCLES / (IA64_INST_RETIRED - NOPS_RETIRED) = 1.2838
Effective CPI during unstalled execution (lower is better):
(CPU_CYCLES - BACK_END_BUBBLE.ALL) / (IA64_INST_RETIRED - NOPS_RETIRED) = 0.3665
-----------------------------------------------
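For reference, the three derived metrics in the report follow directly from the raw event counts. The short Python sketch below simply re-applies the formulas printed by Caliper; the variable names are my own and are not part of Caliper, and the counts are copied by hand from the table above (user-level counts only).

```python
# Raw event counts copied from the Caliper report above (user level only).
BACK_END_BUBBLE_ALL = 1_551_927_751
CPU_CYCLES = 2_171_937_631
IA64_INST_RETIRED = 2_094_967_607
NOPS_RETIRED = 403_213_401

# Useful instructions are retired instructions minus NOPs,
# exactly as in the report's own CPI formulas.
useful_instructions = IA64_INST_RETIRED - NOPS_RETIRED

# % of cycles lost due to stalls: 100 * BACK_END_BUBBLE.ALL / CPU_CYCLES
stall_pct = 100.0 * BACK_END_BUBBLE_ALL / CPU_CYCLES

# Effective CPI: CPU_CYCLES / (IA64_INST_RETIRED - NOPS_RETIRED)
effective_cpi = CPU_CYCLES / useful_instructions

# Effective CPI during unstalled execution:
# (CPU_CYCLES - BACK_END_BUBBLE.ALL) / (IA64_INST_RETIRED - NOPS_RETIRED)
unstalled_cpi = (CPU_CYCLES - BACK_END_BUBBLE_ALL) / useful_instructions

print(f"Cycles lost to stalls: {stall_pct:.2f}%")
print(f"Effective CPI:         {effective_cpi:.4f}")
print(f"Unstalled CPI:         {unstalled_cpi:.4f}")
```

The gap between the effective CPI and the unstalled CPI is the cost of the back-end bubbles: without them, the same instruction stream would need only about 29% of the cycles it actually used.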
06-28-2006 04:58 AM
Re: CPU STALLS
This is a bit strange, but I think I'd have the hardware checked.
First use cstm, mstm, or xstm and run the CPU exercisers. If anything fails, HP needs to replace the hardware.
Then call in the HP hardware team.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
06-28-2006 05:02 AM
Re: CPU STALLS
Thanks a lot for your reply.
Could you please let me know how to use these commands, and what output I should collect and analyze?
Thanks a lot
Anurag
06-28-2006 05:07 AM
Re: CPU STALLS
06-28-2006 05:15 AM
Re: CPU STALLS
It's an ELF-32 executable object file - IA64
06-28-2006 05:21 AM
Re: CPU STALLS
caliper icache -o reports/icachem.txt ./matmul
http://docs.hp.com/en/5991-5499/ch02s04.html
"...A hot spot is an instruction or set of instructions that has a higher execution count than most other instructions in a program. For example, code that is inside a loop inside a loop inside a loop will likely be executed more times than straight-line code. Usually the "hotness" is measured with CPU cycles, but it could also be measured with metrics such as cache misses...."
This document makes some patching suggestions and also recommends increasing the page size:
http://h21007.www2.hp.com/dspp/files/unprotected/hpux/Top_Ten_Perf_Tips.pdf
It also refers repeatedly to the latest pthread patch; see page 12, on the MxN vs. 1x1 thread models.
06-28-2006 05:27 AM
Re: CPU STALLS
06-28-2006 05:32 AM
Re: CPU STALLS
In fact, the L1 data cache miss rate is only 12%.
Here is the report. What do you suggest?
Please advise.
Anurag
L1 data cache miss percentage:
Sampling Specification
Sampling event: DATA_EAR_EVENTS
Sampling period: 10000 events
Sampling period variation: 500 (5.00% of sampling period)
Sampling counter privilege: user (user-space sampling)
Data granularity: 16 bytes
Number of samples: 452
Data sampled: Data cache miss
Data Cache Metrics Summed for Entire Run
-----------------------------------------------
                       PLM
Event Name             U..K  TH       Count
-----------------------------------------------
DATA_REFERENCES        x___   0   447196040
L1D_READS              x___   0   312349645
L1D_READ_MISSES.ALL    x___   0    37692146
-----------------------------------------------
PLM: Privilege Level Mask
U/K = user/kernel levels (U: level 3, K: level 0)
The intermediate levels (1, 2) are unused on HP-UX or Linux
x : the metric is measured at the given level (_ : not measured)
TH: event THreshold, determines the event counter behavior,
TH == 0 : counter += event_count_in_cycle
TH > 0 : counter += (event_count_in_cycle >= threshold ? 1 : 0)
-----------------------------------------------
L1 data cache miss percentage:
12.07 = 100 * (L1D_READ_MISSES.ALL / L1D_READS)
Percent of data references accessing L1 data cache:
69.85 = 100 * (L1D_READS / DATA_REFERENCES)
-----------------------------------------------
Load Module Summary
------------------------------------------------------------------
% Total                               Avg.
 Dcache  Cumulat  Sampled   Dcache  Dcache
Latency     % of   Dcache  Latency  Laten.
 Cycles    Total   Misses   Cycles  Cycles   Load Module
------------------------------------------------------------------
  95.97    95.97      365    13715    37.6   dld.so
   1.66    97.63       31      237     7.6   libunwind.so.1
   0.85    98.48       17      122     7.2   libpthread.so.1
   0.82    99.30       10      117    11.7   libncursesw.so
   0.38    99.69        6       55     9.2   librtc.sl
   0.13    99.82        3       19     6.3   liborb_r.so
   0.13    99.95        2       19     9.5   libc.so.1
   0.05   100.00        1        7     7.0   libCsup.so.1
------------------------------------------------------------------
 100.00   100.00      435    14291    32.9   Total
------------------------------------------------------------------
Function Summary
--------------------------------------------------------------------------------------------
% Total                               Avg.
 Dcache  Cumulat  Sampled   Dcache  Dcache
Latency     % of   Dcache  Latency  Laten.
 Cycles    Total   Misses   Cycles  Cycles   Function                                      File
--------------------------------------------------------------------------------------------
   0.76     0.76        9      108    12.0   libncursesw.so::__milli_memcmp
   0.40     1.15        7       57     8.1   libunwind.so.1::uwx_step                      uwx_step.c
   0.29     1.44        6       41     6.8   libpthread.so.1::*unnamed@0x4042(920-cc0)*    mutex.c
   0.25     1.69        4       36     9.0   libunwind.so.1::uwx_get_frame_info            uwx_step.c
   0.19     1.88        3       27     9.0   libpthread.so.1::pthread_setcancelstate       cancel.c
   0.19     2.07        3       27     9.0   libunwind.so.1::uwx_reclaim_scoreboards       uwx_scoreboard.c
   0.17     2.24        4       24     6.0   libunwind.so.1::uwx_decode_prologue           uwx_uinfo.c
   0.15     2.39        1       21    21.0   { STUB }->libunwind.so.1::uwx_reset_str_pool
   0.14     2.53        4       20     5.0   libpthread.so.1::pthread_mutex_lock           mutex.c
   0.13     2.65        2       18     9.0   libpthread.so.1::pthread_mutex_unlock         mutex.c
   0.13     2.78        2       18     9.0   librtc.sl::rtc_split_special_region           infrtc.c
   0.11     2.89        2       16     8.0   libpthread.so.1::ENTER_PTHREAD_LIBRARY_FUNC   pthread.c
   0.10     2.99        3       15     5.0   libunwind.so.1::uwx_search_utable32           uwx_utable.c
--------------------------------------------------------------------------------------------
[Minimum function entries: 5, percent cutoff: 0.10, cumulative percent cutoff: 100.00]
Function Details
----------------------------------------------------------------------
% Total Avg.
Dcache Sampled Dcache Dcache Line|
Latency Dcache Latency Laten. Slot| >Statement|
Cycles Misses Cycles Cycles Col,Offset Instruction
----------------------------------------------------------------------
[Cutoffs excluded all entries (minimum: 0; percent: 1.00; cumulative percent: 100.00;)]
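As a cross-check, the cache percentages and the per-module latency share can be recomputed from the numbers in this report. This is only a sketch in Python: the variable names are mine, not Caliper's, and the values are copied by hand from the tables above.

```python
# Event counts copied from "Data Cache Metrics Summed for Entire Run".
DATA_REFERENCES = 447_196_040
L1D_READS = 312_349_645
L1D_READ_MISSES_ALL = 37_692_146

# L1 data cache miss percentage: 100 * (L1D_READ_MISSES.ALL / L1D_READS)
l1d_miss_pct = 100.0 * L1D_READ_MISSES_ALL / L1D_READS

# Percent of data references accessing L1 data cache:
# 100 * (L1D_READS / DATA_REFERENCES)
l1d_ref_pct = 100.0 * L1D_READS / DATA_REFERENCES

# Sampled dcache latency cycles per load module, from "Load Module Summary".
latency_cycles = {
    "dld.so": 13715,
    "libunwind.so.1": 237,
    "libpthread.so.1": 122,
    "libncursesw.so": 117,
    "librtc.sl": 55,
    "liborb_r.so": 19,
    "libc.so.1": 19,
    "libCsup.so.1": 7,
}
total_latency = sum(latency_cycles.values())

# Share of all sampled dcache latency attributed to the dynamic loader.
dld_share = 100.0 * latency_cycles["dld.so"] / total_latency

print(f"L1D miss percentage:        {l1d_miss_pct:.2f}%")
print(f"References reaching L1D:    {l1d_ref_pct:.2f}%")
print(f"dld.so share of latency:    {dld_share:.2f}%")
```

The miss rate itself is modest, but nearly all of the sampled miss latency lands in dld.so rather than in the application's own code. For a run of about five seconds, that suggests (though it does not prove) that this measurement is dominated by dynamic loading at startup rather than by steady-state work.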
06-28-2006 07:07 AM
Re: CPU STALLS
It's done like this:
aCC +Oprofile=collect -O sample.C -o sample.exe   // Compile to instrumented executable.
sample.exe < input.file                           // Collect execution profile data.
aCC +Oprofile=use -O sample.C -o sample.exe       // Recompile with optimization.
Profile-based optimization makes much better decisions about code layout because it uses statistics gathered during an actual execution.
06-28-2006 07:10 AM