Operating System - OpenVMS

fseek and ftell on stdout (i.e. SYS$OUTPUT)

 
John McL
Trusted Contributor


We would like to use C functions ftell() and fseek() to extract records from a log file during execution of a detached process. (This is web access stuff and during development we'd like to post back, via the intranet, the records from the log file.)

Attempts to use C's ftell() on stdout (i.e. SYS$OUTPUT, the log file for the detached process) fail with status EOF when the log file is the one that was defined in the call to SYS$CREPRC. However, after I use 'freopen(logfile_name, "w+", stdout, "rop=wbh", "shr=get" , "fop=dfw")' to create a new log file, the functions ftell() and fseek() work fine on that new file.

Is it possible to do what we want to do with just the original log file or must we use two log files (original plus the one with certain attributes)?

John Gillings
Honored Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

John,

Figuring out exactly what "work fine" means in this context can be complex! You don't necessarily know when/if you're seeing the latest data in your file, as it can be in several buffers upstream from the disk.

It all depends on the sharing options you specify on the file for both reader and writer, how the file is flushed, and how the EOF is updated. For some combinations, the EOF is NEVER updated until the file is closed by the writer. Bottom line is, if the default SYS$OUTPUT doesn't give you the right sharing options to do what you want, you'll need to open it yourself.

I do something that sounds much like what you're doing. A detached process which tracks the log files of several other detached processes. I scan the contents for interesting looking strings, and, for some files, send every record to a remote system - kind of file/record level "shadowing". We can do monitoring, analysis and display on the remote system without interfering with production processing.

From the READER, I open the files with:

LogFAB: $FAB FAC=GET SHR= XAB=LogFHC NAM=LogNAM FOP=NAM

Note the "SHR=", I'm using RFAs to keep track of the high water mark, and I OPEN and CLOSE the file on each scan (I'm typically tracking several hundred files, so it's not reasonable to keep all the files open all the time). I'm not sure if it's even possible to have RMS update the EOF of a file open for read access, depending on updates from another process, so the OPEN and CLOSE in the READER may be the key to fix your issue?

Since I don't have source for any of my writer processes, I'm not entirely sure how the files are opened. Since you have control of the source of the writer, you're in better shape.

A few things to test... Does TYPE display the contents of the file? What does this do:

$ OPEN/APPEND/SHARE=WRITE LOG your-log-file
$ CLOSE LOG
$ TYPE your-log-file

(the request to append the file forces RMS to update the EOF).


FWIW, here are some stats from one process, monitoring 130 files, scan interval is 0.25 seconds, with a maximum block of 200 records from a single scan. That means the downstream side will see worst case lag of about 30 seconds for an update in a log file.

ELAPSED: 1 13:53:56.26 CPU: 0:06:10.86 BUFIO: 8346195 DIRIO: 761163 FAULTS: 228

RMS stats: $PARSE:32364, $SEARCH:32364, $OPEN:706328, $CONNECT:706328, $FIND:697914, $GET:11128952

So, don't be scared off by the cost of the OPEN. About 4 minutes of CPU per day (6 minutes 10 seconds over roughly 38 hours elapsed) is a tiny price to pay (and that includes a whole lot of pattern matching and message dispatches)

Of course, this would be significantly reduced if RMS had a (supported and documented) mechanism to tell me when a particular file has been updated, so I don't have to poll!

Since you have control of your writers... you could take a whole different approach to the general problem. Instead of writing your log files to disk, send SYS$OUTPUT to a "logfile daemon" process, possibly through a mailbox. Have your logging routine on the writer side add an ID (& timestamp?) to each message. What you do with it once the daemon has it is up to you.
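A minimal sketch of the "add an ID (& timestamp?) to each message" step on the writer side, in portable C. The function name format_log_record, the record layout and the sample ID are assumptions made for illustration only; delivering the record to the daemon (for example through a mailbox) is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: builds "<id> <UTC timestamp> <text>" in buf.
 * Returns the record length, or -1 if the time cannot be converted
 * or the record does not fit in the buffer. */
int format_log_record(char *buf, size_t size, const char *id,
                      time_t when, const char *text)
{
    char stamp[32];
    struct tm *utc;
    int n;

    assert(buf != NULL && id != NULL && text != NULL);

    utc = gmtime(&when);
    if (utc == NULL ||
        strftime(stamp, sizeof stamp, "%Y-%m-%dT%H:%M:%SZ", utc) == 0)
        return -1;

    /* snprintf reports the length it wanted, so truncation is detectable */
    n = snprintf(buf, size, "%s %s %s", id, stamp, text);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

A writer would call something like format_log_record(rec, sizeof rec, "WEB01", time(NULL), "request received") and hand rec to whatever transport feeds the daemon.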
A crucible of informative mistakes
John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

Thanks John, I'll do some investigations as well as wait for further comments.

Just one thing at the moment because it might be relevant to those comments ... we have the line "fsync(fileno(stdout));" just before the freopen() call, so I believe that all data is flushed to disk before we attempt to access it.
John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

I got a chance to test it sooner than expected.

"$ TYPE logfile" works fine.

"$ OPEN/APPEND/SHARE=WRITE LOG logfile" fails with message "RMS-E-FLK, file currently locked by another user" (as does any attempt to edit the file, even /READONLY)

When trying to open a new channel to the file with "r" (read) and "shr=get,put", the open works okay, but then the ftell on stdout fails.

I'm tempted to try opening the file "a+" to test if ftell will work on the new channel, but I'll wait for further responses in case they say "Won't work".
John Gillings
Honored Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

John,

>"$ TYPE logfile" works fine.

Aha! I think that's your clue. Don't be too concerned about the writer.

I'm fairly sure your reader process won't ever see an EOF beyond the point it was at when the file was opened, regardless of what you do in the writer. The only way to see an updated EOF is for the reader to close and reopen the file.

A crucible of informative mistakes
John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

John,

As I understand it, the ftell() function should return the byte offset from the start of the file. The value EOF (which I think is -1) is returned when any error occurs and is not an indication that the file is at EOF. Presumably if the file was positioned at the start the returned value would be zero, so any error flags do need to be negative.

Unfortunately it seems typical of C RTL functions that different types of errors are not separately flagged, so I'll just have to experiment with various alternatives.
WW304289
Frequent Advisor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

> Unfortunately it seems typical of C RTL function that different types of errors are not separately flagged [...]

The C RTL functions differentiate the errors using errno.

-Boris

x.c
---
#include <stdio.h>

int main() {
    /* ftell() returns -1 on failure and sets errno; perror() prints the reason */
    if ( ftell(stdout) == -1 )
        perror("ftell");
    return 0;
}

$ run x
ftell: illegal seek
$
John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

WW304289, thank you but it just demonstrates my point.

I'd like something like
- SS$_NORMAL (or equivalent)
- Warning - position is past EOF
- Warning - position reset to start of current record
- Error - attempt to access before start of file
- Error - operation not permitted on this file type

etc.



John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

Progress report. (Earlier attempt went blank when I hit preview so I guess it's lost.)

I've managed to extract the log file records by the following method.

(a) Set-up
1 - Open new channel (for Read) on log file, positioning at EOF
2 - use ftell() and save offset for EOF
3 - close channel

(b) to extract
4 - Open new channel (for Read) on Log file
5 - use fseek() to set position saved in step 2
6 - Read forward to EOF (i.e. get extract)

I'd rather open the second channel just once, but it seems the EOF is recorded when I open the file for reading and later I can't read any new records written past that old EOF point.
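The two-open method in steps 1 to 6 could look roughly like the following in portable C. This is a sketch under assumptions, not the code actually used: the names mark_eof and copy_since are invented here, and on OpenVMS the fopen() calls would probably also need the sharing argument mentioned earlier in the thread ("shr=get,put"), which is left out so the example compiles anywhere. On record-oriented files the value from ftell() may refer to a record boundary rather than a plain byte count.

```c
#include <assert.h>
#include <stdio.h>

/* Steps 1-3: open a separate read stream, position at EOF, save the offset.
 * Returns -1L on failure. (Assumption: a VMS build would add "shr=get,put".) */
long mark_eof(const char *path)
{
    FILE *fp;
    long pos = -1L;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) == 0)
        pos = ftell(fp);
    fclose(fp);
    return pos;
}

/* Steps 4-6: reopen (so the new EOF is seen), seek to the saved offset and
 * copy everything up to the current EOF to out.
 * Returns the number of bytes copied, or -1L on failure. */
long copy_since(const char *path, long start, FILE *out)
{
    char block[8192];
    size_t got;
    long total = 0;
    FILE *fp;

    assert(path != NULL && out != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1L;
    if (start < 0 || fseek(fp, start, SEEK_SET) != 0) {
        fclose(fp);
        return -1L;
    }
    while ((got = fread(block, 1, sizeof block, fp)) > 0) {
        if (fwrite(block, 1, got, out) != got) {
            total = -1L;
            break;
        }
        total += (long)got;
    }
    if (total >= 0 && ferror(fp))
        total = -1L;
    fclose(fp);
    return total;
}
```

Typical use would be to call mark_eof() when the extract starts and copy_since() with the saved value when the records are wanted.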

John G, does RMS give me the freedom to control (or work around) this? I'd prefer to use RFAs to position the file pointer anyway, but then how do I get the RFA of the initial EOF point?
John Gillings
Honored Contributor
Solution

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

John,

As much as we would like to reduce the number of opens, I don't think it's possible.

I remember going into this for a customer case some years ago. Ultimately the answer was, "you need to reopen the file to get the new EOF". I couldn't find anything, even undocumented/unsupported, to get around that. See source code for TYPE/TAIL/CONTINUOUS as an example.

From my stats, opens aren't the performance bogeyman that one might expect, at least for this type of application.

To find the RFA of the last record before EOF you either need to scan the file with $FIND, or open the file with ROP=EOF and try to work your way backwards (see source of TYPE/TAIL for some heuristics in locating records backwards... it ain't pretty!).

Given the position of the EOF, I believe you could fabricate an RFA for a record written after that position. A few experiments should reveal the secrets. Obviously not supported, but also not something engineering would dare change!

(My log file tracker always reads from the beginning of the file. Even when recovering, I just read to EOF and keep track of the RFA. Even though I'm following numerous files, some of which are quite large, timing is fast enough that I didn't think it worth the effort to try to do anything more heroic)
A crucible of informative mistakes
H.Becker
Honored Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

>>>
I'd like something like
- SS$_NORMAL (or equivalent)
...
<<<

Boris pointed out that there is additional information in errno.

You probably know that ftell() is not a VMS system service or a VMS RTL function. The behavior is defined by the ISO C standard, and POSIX lists the values that errno can hold after a failure (extracted from errno.h):

#define EBADF 9 /* Bad file number */
#define ESPIPE 29 /* Illegal seek */
#define EOVERFLOW 88 /* Value to large for datatype */

What Boris showed with perror() was an ESPIPE.

There may be extensions and depending on the function and the value in errno you may find VMS specific information in vaxc$errno. The CRTL documentation should have the details. But there is no way to have this function behave like a VMS system service.
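As a rough illustration of how a caller can get closer to the itemised statuses asked for earlier, the sketch below saves errno straight after a failed ftell() and maps the three values listed above to longer text. The names tell_with_errno and ftell_error_text and the wording of the messages are assumptions made for this example; VMS-specific detail from vaxc$errno is not covered.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Calls ftell() and captures errno before any other call can change it.
 * Returns 0 and stores the offset in *pos on success; otherwise returns
 * the errno value (or -1 if ftell() failed without setting errno). */
int tell_with_errno(FILE *fp, long *pos)
{
    long where;

    assert(fp != NULL && pos != NULL);
    errno = 0;
    where = ftell(fp);
    if (where == -1L)
        return errno != 0 ? errno : -1;
    *pos = where;
    return 0;
}

/* Hypothetical mapping from the errno values named above to longer text. */
const char *ftell_error_text(int err)
{
    switch (err) {
    case 0:
        return "success";
    case EBADF:
        return "stream is not attached to a valid open file";
    case ESPIPE:
        return "stream does not support positioning";
#ifdef EOVERFLOW
    case EOVERFLOW:
        return "current offset does not fit in a long";
#endif
    default:
        return "other error, see strerror()";
    }
}
```

For the stdout case shown earlier in the thread, tell_with_errno(stdout, &pos) would be expected to return ESPIPE, matching the "illegal seek" text printed by perror().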

From what I read (and understand), you just want to position in a log file: one writer and one or more readers. Except for sys$output and if I correctly recollect, this seems doable in C, on VMS. What's the usual discloser, here?
John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

Thanks for the explanation about perror, H. (That was Boris, was it, who posted the message with a cryptic name?) I still don't regard "illegal seek" as a particularly informative message. Okay, my expectations are high because I've been on VMS for so long. But that's all beside the point for the real question I had.

As I mentioned above, we have a workable solution. It's something that we'll use to extract diagnostic information on an ad hoc basis. Good performance would be nice but isn't critical; medium performance is fine because it will have little impact on other processes.

That said,... I mentioned how the fopen() seems to retrieve an EOF marker (Max allocated block?) that prevented me reading beyond that point even when more data had been written there.

(Reminder: we're accessing SYS$OUTPUT which is being written by other functions in, or called by, the same image. This extracting is an occasional thing and requires a start point and a later read of SYS$OUTPUT from that start point)

Is there any way to remove or reset this EOF marker so that we can read forward from our initial point? This would mean just one OPEN and one later CLOSE of the "extract" channel into the file SYS$OUTPUT.
Hoff
Honored Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

I'm getting the distinct impression you're familiar with Unix programming models and norms, and that you're looking to apply that to OpenVMS.

I'd probably punt on using the C RTL for this stuff - I usually punt on the C RTL whenever I'm trying to do more than what is the norm on a Unix box - and would go directly to RMS.

Yes, using RMS directly for the first time is daunting; the API is massive and complex. But then I have available a set of wrappers for calling RMS from C at the HoffmanLabs web site. The stuff deals with sequential and indexed formats, sharing and other such. Full source code. BSD-style license. Search for NEWUSER over there.
John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

No Hoff, I've been on VMS since 1979 and over the years have moved from technical programming (Fortran) to system management and recently back to technical programming (C).

I had to read your email twice. "Take a punt" over here means "to bet on" and is an expression of endorsement of an action but I think where you are it means "give it the boot" (i.e. reject it).

As I said we have a workable solution but it would be better if what appears to be an EOF limit (is it max allocated block?) that seems to be recorded during the fopen() (or RMS SYS$OPEN) can be reset. Do you know if this is possible and if so what value means "unknown"?
Craig A Berry
Honored Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

Note that "a successful call to fseek() shall clear the end-of-file indicator for the stream" according to the docs on the standard at:

http://www.opengroup.org/onlinepubs/000095399/functions/fseek.html

I haven't tried to prove that the CRTL follows the standard, but you may want to before giving up -- something like fseek(fp, 0, SEEK_CUR) before calling ftell(fp).
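In portable C the suggestion amounts to something like the sketch below: read until EOF, clear the end-of-file indicator with a zero-distance fseek(), and try again later. The helper name next_line_or_retry is made up for this example. It shows what the standard promises; whether the OpenVMS CRTL and RMS then let the reader see records written past the EOF recorded at open time is exactly what is in question here.

```c
#include <assert.h>
#include <stdio.h>

/* Reads the next line into buf.
 * Returns 1 if a line was read, 0 if the stream is at EOF for now (the
 * end-of-file indicator is cleared so a later call can retry), -1 on error.
 * Note: a line that the writer has only partly flushed may be returned
 * without its trailing newline. */
int next_line_or_retry(FILE *fp, char *buf, int size)
{
    assert(fp != NULL && buf != NULL && size > 1);

    if (fgets(buf, size, fp) != NULL)
        return 1;
    if (feof(fp)) {
        /* Per the standard, a successful fseek() clears the EOF indicator */
        if (fseek(fp, 0L, SEEK_CUR) != 0)
            return -1;
        return 0;
    }
    return -1;
}
```

A polling reader would call this in a loop and sleep for an interval whenever it returns 0.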

Also, it appears from the discussion that this log file may be record-oriented, so be sure to read the CRTL docs on caveats associated with positioning on a record boundary rather than a byte offset when using ftell().
Hoff
Honored Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

This file-positioning stuff is a Unix-style solution. Add that the Unix emulation inherent in the CRTL I/O path often adds an extra layer of "fun" (eg: complexity, hairiness, slowness, weird behavior) to this approach.

I'd walk forward on the records until I hit the EOF using RMS, and would then wait and retry the reads at intervals. IIRC, some of the older OpenVMS stuff required you to close and reopen the file when you hit EOF, but I think that may have gotten fixed. (And if it didn't, then the RFA will get you back to the location quickly.)

Or I'd punt^H^H^H^Herr, is "scrap it" OK? entirely, and use something like syslog or analogous, or some locally-built analog, and send the activity data over to a remote server for evaluation and processing.
John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

Craig, I'm pretty confident that I tried that and it still failed to read past that point. I'll try to re-test that and get back to you.

I started out trying to use ftell and fseek because from the returned values I could get the total number of bytes that had been written to the file and use that to malloc() before an fread() of all the data. Now I've moved to fread()s into blocks of 8192 bytes and realloc() (and modify the ptr to the input buffer) if I need to read more.
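The block-at-a-time approach described above might be sketched as follows. This is an illustration only, not the code in use: the name read_to_eof and the CHUNK macro are assumptions, with the 8192-byte block size taken from the description.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK 8192  /* block size mentioned in the post */

/* Reads from the current position of fp to EOF into a malloc'd buffer that
 * grows by CHUNK bytes whenever less than one block of space is left.
 * Stores the byte count in *len and returns the buffer (caller frees it),
 * or returns NULL on a read or allocation failure. */
char *read_to_eof(FILE *fp, size_t *len)
{
    char *buf = NULL;
    size_t used = 0, cap = 0, got;

    assert(fp != NULL && len != NULL);
    for (;;) {
        if (cap - used < CHUNK) {
            char *bigger = realloc(buf, cap + CHUNK);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;      /* the input pointer moves with the buffer */
            cap += CHUNK;
        }
        got = fread(buf + used, 1, CHUNK, fp);
        used += got;
        if (got < CHUNK)       /* short read: EOF or error */
            break;
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

Because the buffer grows as needed, the total size does not have to be known in advance from ftell().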

However, if a fseek() clears the EOF marker and I can open and close the "extract channel" once rather than twice it would be nice.

Hoff - what you describe sounds similar to what John Gillings is using. I'm running out of time on what I'm doing and my use is from within what's writing the logfile, which makes the sequencing easier. I'll put your idea aside for a future generic "file record extractor".

John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

Craig,

I tested the sequence open - fseek, then later read - close, but it failed to extract any records from the log file, which probably meant that the channel hit EOF immediately.

This suggests that fseek(fp,0,SEEK_END) and the fseek(fp,0,SEEK_CUR) I tried in a previous image did not clear the EOF flag.

John McL
Trusted Contributor

Re: fseek and ftell on stdout (i.e. SYS$OUTPUT)

Thanks to all. I have a workable solution, even if it does mean two file OPENs - but that shouldn't be an issue because the file should still be in a cache somewhere down the line to the disk and the penalty should not be onerous.

At least I've moved away from the old solution, which was the C function freopen using a new filename. That of course meant immediate overheads for processing the directory and, because of the extra files, ongoing issues.