04-05-2008 05:17 PM
IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
glacially slow exp(-z) evaluations on IA64?
The exponential function exp(-z) in dec$ForRTL on IA64 takes 2000 times as long to return 0.0 for z > 120 as it does to return 1.8e-35 for z = 80. This can be a real "killer" for particle simulations.
This extreme slowdown is observed for both /Float=IEEE_Float and /IEEE_Mode=Fast. Strangely, when /IEEE_Mode=DeNorm, the slowdown is "only" 40x. The slowdown is about the same for /Real_Size=64 and /Real_Size=32.
On an Alpha XP-1000, the slowdown is only 2x when returning 0.0 for large z, making the Alpha 300x as fast as the IA64 in this regime.
Details:
OpenVMS 8.3, Fortran V8.1-10492
FORTRAN Source, Listing, Map and Results at
http://sdphI0.ucsd.edu/Exp_Time_Test_Results.txt
C. Fred Driscoll
Physics, UCSD
04-06-2008 06:58 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
For giggles, I might turn on alignment fault monitoring and see if that's where all the time is going here. Such faults can be potentially subtle, as the code is working and is producing the correct answer, albeit glacially slow. (And if it is alignment within the RTL, it's HP's code.)
Regardless, this looks to involve a look at and an update to the Fortran RTL, and -- short of an RTL patch kit (which you've undoubtedly already looked for and tried*) -- there's not much folks out here can do.
--
*If you haven't already looked for kits, I'd start with VMS83I_FORRTL V3, VMS83I_UPDATE V5, and any other mandatory ECOs not already loaded. These may or may not cure this (and I'd lean toward "not"), but if you don't have these installed, you'll likely be asked to install them.
Preemptive installation of ECOs is a technique intended to reduce the time spent waiting for the front-line support spinlock to clear.
04-06-2008 09:16 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
Thanks for your thoughtful reply. I had missed the recent ForRTL_v3 update, but I tested it and found no improvement over the ForRTL_v2 initially tested.
I also checked $Monitor Alignment, and saw essentially NO faults.
I have no service contract, so my hope is that this post will attract the attention of the Fortran RTL engineering team. This bug is certainly worth attention, since it can completely "stall" typical particle simulations in physics research.
Best,
Fred
04-06-2008 09:48 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
> that this post will attract the attention
> of the Fortran RTL engineering team.
You might also try the official unofficial
complaint Web form:
http://h71000.www7.hp.com/fb_business.html
04-06-2008 10:59 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
>complaint Web form:
>
>http://h71000.www7.hp.com/fb_business.html
Thanks, I have now submitted the error description.
My (limited) experience has been that when something like this gets to the right person in DEC/Compaq/HP, it gets fixed. Sometimes they even distribute an informal patch that's waiting for an ECO-spinlock to clear :)
04-07-2008 11:04 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
The MATH$EXP_S routine that does all the work is hand-crafted Itanium assembly. We'll get this to the right folks.
I can believe that /IEEE_MODE=FAST is slower. In that mode, we ask the hardware to complain more often about IEEE special values. For /IEEE=DENORM, we just close our eyes and let whatever happens, happen.
04-12-2008 02:31 PM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
Sounds promising.
04-14-2008 03:06 PM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
04-15-2008 07:17 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
For the /IEEE=DENORM case, much of the slowdown is from playing (via system services) with the FPSR settings.
04-16-2008 02:57 PM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
My point is that there is a value that will cause underflow and there is a value that will not. I would expect those values to be constant. There would be a small constant cost for doing the check before starting the work for cases that won't underflow/overflow, but that cost is probably insignificant compared to the cost of the procedure call, and much less than the cost of an exception.
Jon
04-16-2008 11:00 PM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
There exists a value L for which exp(L) does not underflow, but for which all x<L, exp(x) will underflow.
There exists a value U for which exp(U) does not overflow, but for which all x>U, exp(x) will overflow.
Therefore if L<=x<=U then exp(x) will not cause an underflow or overflow exception.
04-17-2008 05:50 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
04-17-2008 06:41 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
The Math RTL routines need to check with their callers to see if they should raise an error or do something else.
On Alpha, that information is inside the procedure descriptor (PDSC$V_EXCEPTION_MODE) with the default being PDSC$K_EXC_MODE_SIGNAL : Raise exceptions for all error conditions except for underflows producing a 0 result.
So even when the RTL checks the input arguments, it has to walk back up the stack to find the calling routine to see if that routine wanted underflow checking or not (F90 has a /CHECK=UNDERFLOW).
I64, which doesn't have a single data structure like a procedure descriptor, has that information buried down in the unwind descriptors.
Part of the slowdown is the stack walk via the LIB$I64 calling standard routines to see if the caller wants the underflow raised as an error or just mapped to 0.
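The shape of that lookup can be pictured with a toy model. Everything below is hypothetical: `toy_frame`, `toy_exc_mode`, and the rule of skipping RTL-internal frames are invented for illustration and are not the OpenVMS data structures or the LIB$I64 routines. The sketch only shows why the cost is paid on every underflow, since the walk happens each time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exception modes; not the OpenVMS definitions. */
enum toy_exc_mode { TOY_UNDERFLOW_TO_ZERO, TOY_UNDERFLOW_SIGNAL };

/* Hypothetical frame record standing in for per-routine unwind data. */
struct toy_frame {
    const struct toy_frame *caller;  /* next frame up the stack, NULL at top */
    int is_rtl_internal;             /* nonzero for frames inside the math RTL */
    enum toy_exc_mode mode;          /* what this routine asked for */
};

/* Walk up from the math routine's frame to the first non-RTL caller and
   return its requested mode; *steps counts the frames stepped over. */
enum toy_exc_mode find_caller_mode(const struct toy_frame *f, int *steps)
{
    assert(f != NULL && steps != NULL);
    *steps = 0;
    while (f->caller != NULL && f->is_rtl_internal) {
        f = f->caller;
        (*steps)++;
    }
    return f->mode;
}
```

In the real system each step involves decoding unwind information rather than following a pointer, which is where the time goes.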
04-18-2008 04:57 AM
Re: IA64 dec$ForRTL exp(-z) 2000x too slow for large z?
$ time a.out 80
exp(-80.000000) = 1.80485e-35
real 0m02.64s user 0m02.63s sys 0m00.00s
$ time a.out 120
exp(-120.000000) = 0
real 1m44.51s user 0m28.71s sys 1m15.54s
The timing info is really interesting in that it clearly is blaming the kernel and the sloppy hardware that hates denorms.