Profiling a program built with and without STLport
12-25-2009 02:12 AM
For profiling purposes I built my application on HP-UX 11.31 ia64 both with STLport 5.1.7 and without STLport. So there were two configurations:
1) gcc 4.3.1, STLport 5.1.7, optimization level -O2
2) gcc 4.3.1, gcc's own STL, optimization level -O2
It is probably important to mention that we have been developing the application with STLport for a long time. We have now decided to build it without STLport and compare performance; if it is the same or better, we will start using gcc's own STL.
I ran a simple test in which 500 000 requests are sent to the application, processed in one worker thread, and 500 000 responses are sent back.
What is interesting is that the version with STLport processes requests 25% faster. So for the time being it is not practical to give up STLport. However, I would like some help finding out why STLport gives such a significant boost and what can be done to reach the same speed in the no-STLport version.
I profiled both builds, and the problem seems to be that the no-STLport version calls pthread_mutex_unlock() too often.
Profiled with Caliper, the STLport build made 57308 calls to pthread_mutex_unlock(), while the no-STLport build made 124941.
I did some searching and found a recommendation to assign variables of type std::string in this way (add .c_str() to the right-hand side): string_var_1 = string_var_2.c_str(). I changed the code in frequently called functions and profiled again, which brought the count down to 70000 pthread_mutex_unlock() calls. That is a definite improvement, but it is still worse than the STLport build and, more importantly, it is not feasible to find every place where .c_str() would have to be added.
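To make the two assignment forms concrete, here is a minimal sketch. The helper names `assign_shared` and `assign_deep` are made up for illustration and are not taken from the application. The std::string shipped with gcc 4.3 is reference counted, so a plain assignment may only update a shared counter; whether that counter update is what ends up in pthread_mutex_lock()/pthread_mutex_unlock() on this platform is an assumption that the profile would have to confirm.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: plain assignment. With a reference-counted
// (copy-on-write) std::string this may share the source buffer and
// only touch the shared reference count.
inline void assign_shared(std::string& dst, const std::string& src) {
    dst = src;
}

// Hypothetical helper: assignment through c_str(). This selects the
// const char* overload, so dst receives its own copy of the characters.
// Caveat: the copy stops at the first embedded '\0' in src.
inline void assign_deep(std::string& dst, const std::string& src) {
    dst = src.c_str();
    assert(dst == src.c_str());  // same characters up to the first '\0'
}
```

The deep form trades the shared counter for a character copy (and, for longer strings, an allocation), so it is worth measuring both forms rather than assuming either one is cheaper.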
So what recommendations could you give for this situation?
By the way, I have attached both profile reports.
12-25-2009 08:50 AM
Solution
Any reason you haven't looked into using aC++, even with STLport? g++ can't compete with the code generated by aC++'s +O2 and above.
Of course you may lose more than you gain when using the aC++ runtime.
>What is interesting is that the version with STLport processes requests 25% faster.
Hard to argue with that.
>the fact that the no-STLport version calls too often pthread_mutex_unlock.
(Actually since lock/unlock come in pairs, it is calling pthread_mutex_lock too often.)
>found recommendation to assign variables of type std::string in this way (add .c_str() to a right-parameter): string_var_1 = string_var_2.c_str()
For aC++'s reference counted strings, this would hurt things.
>So what recommendations could you give for this situation?
Change the g++ STL to optimize your cases? Perhaps STLport is using its own allocator and bypassing malloc? See:
libstlport.so.5.1::stlp_std::__malloc_alloc::allocate
Another option is to use const char* instead of strings.
If you look closely at your caliper outputs you'll see that for STLport 56.37% of the calls to lock come from __thread_mutex_lock, which comes from other libc functions:
real_free
localtime_r
__gmtime_r_posix
real_malloc
So don't call these that often. ;-)
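One way to call localtime_r less often is to cache the broken-down time and convert again only when the second changes. The following is a minimal sketch under that assumption; `CachedLocalTime` and its members are hypothetical names, and the struct is meant to be owned by a single thread (for example the one worker thread), since it does no locking of its own.

```cpp
#include <cassert>
#include <time.h>

// Hypothetical per-thread cache for localtime_r() results.
struct CachedLocalTime {
    time_t        last_second;  // second the cached value belongs to
    struct tm     last_tm;      // cached broken-down local time
    bool          valid;        // false until the first conversion
    unsigned long calls;        // number of real localtime_r() calls

    CachedLocalTime() : last_second(0), last_tm(), valid(false), calls(0) {}

    // Returns the local time for `now`, converting only on a cache miss.
    const struct tm& get(time_t now) {
        if (!valid || now != last_second) {
            struct tm* r = localtime_r(&now, &last_tm);  // POSIX call
            assert(r != 0);
            (void)r;  // silence unused warning when NDEBUG is defined
            last_second = now;
            valid = true;
            ++calls;
        }
        return last_tm;
    }
};
```

With many requests per second, most calls to `get()` then return the cached value. The cache does not notice a time zone change made while the process is running.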
You might get help by using MallocNG.
For the no STLport case, with 86.91%, you are doing about 5 times as many calls to malloc.
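To see which code paths drive the allocation count in the no-STLport build, one option is to count calls to the global operator new around a suspect piece of code. This is only a sketch: `g_new_calls` and `new_calls_for_copies` are hypothetical names, the counter is not thread-safe, and it only sees allocations that go through operator new. An STLport build that calls malloc directly (as `__malloc_alloc::allocate` does) would not be counted, so this helps locate hot spots in the gcc STL build rather than compare the two builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Illustrative global counter (hypothetical name). Not thread-safe.
static unsigned long g_new_calls = 0;

// Replacement global operator new: count the call, then defer to malloc().
void* operator new(std::size_t size) {
    ++g_new_calls;
    void* p = std::malloc(size != 0 ? size : 1);
    if (p == 0) throw std::bad_alloc();
    return p;
}

// Matching replacement operator delete. Pre-C++11 compilers such as
// gcc 4.3 would spell the exception specification throw() instead.
void operator delete(void* p) noexcept { std::free(p); }

// Hypothetical helper: operator new calls caused by n deep copies of src.
// Assumes src contains no embedded '\0' characters.
unsigned long new_calls_for_copies(const std::string& src, int n) {
    const unsigned long before = g_new_calls;
    for (int i = 0; i < n; ++i) {
        std::string copy(src.c_str());  // forces a character copy
        assert(copy.size() == src.size());
    }
    return g_new_calls - before;
}
```

The same before/after pattern can be wrapped around the handling of one request to get a rough allocations-per-request figure.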
12-27-2009 11:36 PM
Re: Profiling a program built with and without STLport
Thanks for your answer.
>Any reason you haven't looked into using aC++, even with STLport?
It's a good idea. We'll give it a try. We don't have any compelling reasons to use only gcc.
>Perhaps STLport is using its own allocator and bypassing malloc?
STLport does offer that way of allocating memory, but we don't use it. We built STLport with #define _STLP_USE_MALLOC 1. As you can see in [79], stlp_std::__malloc_alloc::allocate spends 96% of its time in libc.so.1::malloc.
> If you look closely at your caliper outputs you'll see that for STLport 56.37% of the calls to lock come from __thread_mutex_lock, which comes from other libc functions:
Yes, that's true. I have already realized that. It can probably be optimized.
12-27-2009 11:44 PM
Re: Profiling a program built with and without STLport
1) with aCC and STLport
2) with aCC and its own STL
and measure the request processing time.
12-28-2009 12:16 AM
Re: Profiling a program built with and without STLport
Are you sure that MallocNG is available for download?
I went to this page: http://www.docs.hp.com/en/5992-4174/ch02s07.html, then to http://hp.com/go/softwaredepot, where I searched for SWPACKv3 and finally got (quote): "0 products shown below matched your search on "SWPACKv3"".
12-28-2009 06:47 AM
Re: Profiling a program built with and without STLport
> Are you sure that MallocNG is available for download?
Yes, see:
http://h20392.www2.hp.com/portal/swdepot/displayProductInfo.do?productNumber=MallocNextGen
Regards!
...JRF...
12-28-2009 06:58 AM
Re: Profiling a program built with and without STLport
Thanks, I will try to install it.
12-28-2009 11:39 AM
Re: Profiling a program built with and without STLport
Then the decision to call malloc is made higher in the call tree. Perhaps if you turn off inlining, it may be easier to find.
If you use MallocNG, your STLport version should also be faster.
05-12-2010 05:45 AM
Re: Profiling a program built with and without STLport
>It's a good idea. We'll give it a try. We don't have any compelling reasons to use only gcc.
I can build Boost 1.38 with aCC, but it seems I can't build recent versions of the Boost library with it. Unfortunately, this is a problem for my project.
05-12-2010 07:37 AM