Operating System - OpenVMS

Postmark benchmark on OpenVMS

 

Re: Postmark benchmark on OpenVMS

That benchmark opens files with the standard fopen/open calls, without any of the VMS-specific kludges needed to be competitive. Unless you have changed various other VMS and RMS system parameters, those open-time kludges are needed.

Some of the things you can test...
SET VOLUME/NOHIGHWATER
SET RMS/BLOCK_COUNT=124
SYSMAN RMS_SEQFILE_WBH 1
SYSMAN ACP_DATACHECK 0
ENABLE SCSI WRITE CACHE ;)

As for databases, I would argue that OpenVMS's file system is actually better suited than those of Windows and UNIX for real database systems. Windows has a journaling file system, as do most UNIXes today, which is not what a database system wants.

There are reasons why Windows based and most UNIX based TPC-C benchmarks have to use raw device and bypass the filesystems completely. That is not needed on OpenVMS.

/Daniel - Mimer SQL Development
Hein van den Heuvel
Honored Contributor

Re: Postmark benchmark on OpenVMS

Jose>> Using the filesystem as a database is a workaround for not having a proper database, or index-sequential files.

Right on.

Jose>> The librarian, didn't think of that yet.

On RSX-11M that was a solution even more worth considering, as you could 'open' blobs stored in a library as pseudo files.
So libraries were sort of a light file system within a robust (ODS-1) file system.


Jose>> What I would like is to simulate a hierarchical structure; that is the good thing about the file system. So I was thinking of an indexed-sequential file with a bogus filename as the key. Most files are shorter than 20k, so they would fit in a record.

Absolutely. Go for it!
That's fine usage of RMS indexed files.

And you could overflow large files to real files, like OpenVMS Mail does (except it overflows too soon, at 1500 bytes), or you could use extension records (part-1, part-2, ...), or just use duplicate keys indicating 'more' (RMS keeps duplicates in order of arrival).
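A minimal sketch of that continuation-record idea in portable C, assuming an illustrative 64-byte key and 2000-byte data chunks (the struct and function names here are invented for illustration; real code would hand each record to RMS with $PUT, letting the duplicate keys keep the chunks in arrival order):

```c
#include <string.h>

/* Hypothetical record layout: a pseudo-filename key plus a sequence
 * number, with the blob split into fixed-size chunks so every record
 * fits comfortably inside one RMS bucket.  Sizes are illustrative. */
#define KEY_LEN   64
#define CHUNK_LEN 2000   /* data bytes per record */

struct vrec {
    char key[KEY_LEN];   /* pseudo-filename, the primary key   */
    int  seq;            /* continuation number (0, 1, 2, ...) */
    int  len;            /* valid bytes in data                */
    char data[CHUNK_LEN];
};

/* Split a blob into continuation records; returns the record count. */
static int split_blob(const char *name, const char *blob, size_t n,
                      struct vrec *out, int max)
{
    int count = 0;
    size_t off = 0;
    while (off < n && count < max) {
        struct vrec *r = &out[count];
        memset(r, 0, sizeof *r);
        strncpy(r->key, name, KEY_LEN - 1);  /* NUL-padded by memset */
        r->seq = count;
        r->len = (n - off > CHUNK_LEN) ? CHUNK_LEN : (int)(n - off);
        memcpy(r->data, blob + off, (size_t)r->len);
        off += (size_t)r->len;
        count++;
    }
    return count;
}
```

Reading a 'file' back is then a keyed $GET on the name followed by sequential $GETs while the key still matches.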

RMS will provide the hierarchy, and a little bit of compression. It will also offer alternate keys to the file.

RMS will give you much better control over caching than a directory structure would: pick the optimal bucket sizes for data and index, select your own local buffers and global buffers and such.

>> I'm not sure if it's RMS per se that's causing the overhead in this. I still have the feeling that most time is spent updating the directory file when creating and deleting files, which is more a part of the file system than of RMS. (Isn't it?)

All the overhead in the Postmark test is in the file system. RMS really only gets involved with the actual data-record writing, which has to happen no matter how you turn it and is a minor component in the scheme. RMS can also get involved with wildcard file lookup (not for straight names), and it used to have a 128-block directory-size performance limitation, but that has long since been fixed.

>> I googled around a bit, but could not find any evidence to support this feeling, nor anything against it. I guess I'll have to test it myself. Or does anyone have a pointer?

What exactly do you want to test here?
If the benchmark is written in C and writes 'simple' (stream-lf) files then the C RTL actually does RMS Block IO, not record IO. Block IO is an unmeasurably thin layer around SYS$QIO. The C RTL does its own thing because too many C program pushed a byte at a time toward RMS and the trip into EXEC mode and back which RMS needs to implement protected file sharing proved too costly for that. A good sized record at a time warrants probing buffers and control structures, but to have to do that every few bytes gets expensive.

Anyhow... if you are worried about RMS overhead vs. system vs. user, then all you need to do is run MONITOR MODES during the run.
RMS is the EXEC mode component.

WAG: I suspect your benchmark will show 3% User mode, 1% Exec, 30% Kernel, 5% Interrupt, and the rest Idle = Wait.


Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting

Hein van den Heuvel
Honored Contributor

Re: Postmark benchmark on OpenVMS

Forgot to add something Jose probably knows... an RMS indexed file record MUST fit entirely in a bucket. The maximum bucket size is still 63 blocks and will not change (127 blocks, i.e. a full 16-bit byte count, would be readily conceivable). Thus RMS solutions are stuck with a maximum record size of about 32,000 bytes.
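That ~32,000-byte ceiling falls straight out of the block arithmetic: a VMS disk block is 512 bytes, so a 63-block bucket holds 32,256 bytes, and bucket plus per-record overhead eats the rest. A trivial check (constants only, no RMS calls):

```c
/* Bucket arithmetic behind the record-size limit quoted above. */
#define DISK_BLOCK_BYTES  512   /* one VMS disk block           */
#define MAX_BUCKET_BLOCKS 63    /* current RMS bucket-size limit */

/* Raw capacity of the largest possible bucket, before bucket and
 * per-record overhead is subtracted. */
static int max_bucket_bytes(void)
{
    return MAX_BUCKET_BLOCKS * DISK_BLOCK_BYTES;   /* 32256 */
}
```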

And you would want to take the key design, bucket size selection, pre-allocation, and such more seriously than ever.

Daniel>>
Unless one have other VMS and RMS system parameters changed those open kludges are needed.
Some of the things you can test...
SET VOLUME/NOHIGHWATER

Beg to differ! Bad rumor!
Yes, highwater marking can have a (performance) effect on certain usages, but just writing data sequentially into sequential files is NOT influenced by highwater marking.

>> SET RMS/BLOCK_COUNT=124

The C RTL indeed listens to this to some degree.

>> SYSMAN RMS_SEQFILE_WBH 1

I don't think the C RTL listens to that.

>> SYSMAN ACP_DATACHECK 0
>> ENABLE SCSI WRITE CACHE ;)

Yeah, the SCSI (disk-level) write cache always was a big, easy win for Unix workstations. They enabled it; VMS did not, and even refused (used to refuse?) to mount disks marked as such.

A real database implementation cannot accept a SCSI write cache, nor a unified IO buffer cache of any other sort.

>> As for databases, I would argue that OpenVMS's file system is actually better than Windows and UNIX for real database systems. Windows have a journaling file system and most UNIXes also today, which is not what a database system want.

Agreed from that perspective.
A good, quick FORK would help create sessions quickly, although multithreading and AST coding can stretch that nicely.

>> There are reasons why Windows based and most UNIX based TPC-C benchmarks have to use raw device and bypass the filesystems completely. That is not needed on OpenVMS.

Right on.

Hein.
SDIH1
Frequent Advisor

Re: Postmark benchmark on OpenVMS

After having been distracted by other work, I found the time to do a quick test on Itanium and an EVA4000, with an indexed-sequential file with 2064-byte records and a 64-byte key to reflect the virtual filename.
Writing 10000 records (even after the file already contains 120000 records) takes a stable 3.6 seconds, i.e. 2777 records per second.
This is without really optimizing bucket size, RMS buffers, or what have you.

Nothing to be ashamed of, and definitely within the range of the Postmark benchmark results on Solaris, with the added advantages of record locking, being sure the data is on disk, and so on.

Apparently a hardware limit is reached somewhere around the 3000 ops/second mark.

Thanks all for the inspiring comments.

Dean McGorrill
Valued Contributor

Re: Postmark benchmark on OpenVMS

>the overhead in this, I still have the feeling that most time is spent updating the directory file when creating and deleting files, which is more part of the file

Absolutely, if there are many files. Just ask an old-time VMS user who'd let thousands and thousands of his mail messages pile up in one directory :)
SDIH1
Frequent Advisor

Re: Postmark benchmark on OpenVMS

I'll close the thread; most relevant issues have been covered. It demonstrated (once again) that a benchmark is useful, but not always for the purpose it was designed for.