Operating System - OpenVMS
1748011 Members
4756 Online
108757 Solutions
New Discussion

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write

 
BenAArmstrong
Frequent Advisor

Incorrect file size returned in JRuby (Java in general?) for file open for write

In JRuby when appending to a file where sync=true, I'm getting strange behaviour that makes me believe the file size isn't being read correctly on the currently open file. And indeed, in the Ant release notes at http://h30266.www3.hp.com/odl/axpos/progtool/hpant171/ANT_IGUIDE_RN.PDF, I find this:


Known Issues


• Failures with several org.apache.tools.ant.types.selectors that rely on obtaining the current file size from “C” functions fstat() or stat(). These functions fail to return the correct file size under the following condition:

 

    o Files still open for write and have been modified

 

This makes it sound like it's not a problem peculiar to JRuby, but that applies to Java on OpenVMS in general. So I re-read the JDK User Guide to see if anything is mentioned there about the problem. It isn't, but I (re-)discovered that if you define java$file_open_mode 2, every write is synced to disk. Well, that's what "sync=true" is supposed to be doing anyway, so maybe this points towards a solution ...

 

Any other workaround I may be missing?

 

Ben

 

 

15 REPLIES 15
John Gillings
Honored Contributor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write

Ben,

 

   An example would help understand your issue. What is being used to determine the size? What size is being returned? Are you reading it from the process that has the file open?

 

Remember that RMS is supremely lazy. It doesn't update anything unless it's absolutely necessary.

 

Please give more detail about what you're doing, what results you're getting and what results you want. Cut and paste exact text of commands issued and messages returned would be preferred.

A crucible of informative mistakes
Ph Vouters
Valued Contributor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write

Ben,

 

C runtime based languages such as Ruby or JRuby via Java use the lstat, stat or fstat C functions. The struct stat field being used in this case is size. Even Unix compatible ls bases its code upon stat and lstat. Have a look on your OpenVMS system at $ help  crtl {stat | lstat | fstat}. Query Google about these C functions examples. Querying Google, this may lead you to some of my articles on my Web site.

 

Philippe

Ph Vouters
Valued Contributor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write

Ben,

 

As a complement to your question.

 

In Ruby or JRuby you might have to indirectly use the C runtime fflush + fsync functions to force the underlying file system to a disk physical write. fflush + fsync are Ruby's IO clacc methods. Refer to http://www.ruby-doc.org/core/classes/IO.html Using fflush + fsync on an opened file may then return you a correct file size using {stat | lstat | fstat ] stat struct size field.

 

Just a guess about what you're doing and observing.

 

Philippe

BenAArmstrong
Frequent Advisor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write


@John Gillings wrote:

Ben,

 

   An example would help understand your issue. What is being used to determine the size? What size is being returned? Are you reading it from the process that has the file open?

 

Remember that RMS is supremely lazy. It doesn't update anything unless it's absolutely necessary.

 

Please give more detail about what you're doing, what results you're getting and what results you want. Cut and paste exact text of commands issued and messages returned would be preferred.


I have been hard at work in parallel with this query trying to distil from the JRuby code a pure Java example that would demonstrate the issue. I have also enlisted the help of the JRuby developers to try to refine my Java example (which currently fails to reproduce the problem) into one that actually reproduces the problem. I could provide you with that code, but since it doesn't reproduce the issue, I'm sure it won't be much help. I will therefore provide you with the Ruby code, but I have my doubts about whether that will make things any clearer, as I don't expect many reading this to actually run JRuby on OpenVMS (well, with the exception of Philippe):

 

File.open('output.tmp','w'){|_|_.puts 'Line one'}
File.open('output.tmp','a'){|_|_.sync=true ; _.puts 'two'; _.puts '3'}
File.read('output.tmp')
=>  "Line one\n\n"
# expected output: => "Line one\ntwo\n3\n"

It appears that after every write, the file position is reset to the old end of file, not the new end of file, and that file writes are happening before and after each newline is emitted.

 

When I find the actual Java construct that does this I will assemble it into a working example which I will post into this thread. But since it has proven difficult to find so far, I had hoped to gain a bit more insight by posting here, drawing on the collective experiences of those who know OpenVMS Java better than I do.

 

Ben

p.s. One theory I have explored with the JRuby developers is that since JRuby does different things if the file is opened read/write/append vs. write/append, changing the mode from 'a' to 'a+' might change the behaviour. Unfortunately, this theory didn't pan out. The behaviour is the same regardless of which mode is used.

BenAArmstrong
Frequent Advisor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write


@Ph Vouters wrote:

Ben,

 

As a complement to your question.

 

In Ruby or JRuby you might have to indirectly use the C runtime fflush + fsync functions to force the underlying file system to a disk physical write. fflush + fsync are Ruby's IO clacc methods. Refer to http://www.ruby-doc.org/core/classes/IO.html Using fflush + fsync on an opened file may then return you a correct file size using {stat | lstat | fstat ] stat struct size field.

 

Just a guess about what you're doing and observing.

 

Philippe


Yes, I did read this already:

 

         Also be aware that for st_size to report a correct value, you
         need to flush both the C RTL and RMS buffers.

 

But I did not know what to do with this information, as so far as I know, Java does not provide me with the means to directly flush RMS buffers. Also, since I have not yet found the exact line in the JRuby source responsible for misreading the file size, but have only inferred this is what is happening from the test results, I don't even know if the code may already be doing an fflush + fsync.

 

By the way, I did not decide to explicitly set sync=true in any of our application code. That is done for me by the standard Ruby library Logger class. Here is a simple failing test case using Logger:

 

require 'logger'

log=Logger.new 'output.tmp'
log.info 'Test'
log.info '2'
log.close
puts File.read('output.tmp')

failing output:

I, [2011-09-28T07:58:11.724000 #227180285]  INFO -- : 2
st
er.rb/99999

expected output:

# Logfile created on Wed Sep 28 07:55:37 ADT 2011 by /
I, [2011-09-28T07:55:41.610708 #539703601]  INFO -- : Test
I, [2011-09-28T07:55:44.973787 #539703601]  INFO -- : 2

 

When I examined the logger.rb code, I found that the line of code that makes the difference is "logdev.sync = true" in the following excerpt:

 

    def create_logfile(filename)
      logdev = open(filename, (File::WRONLY | File::APPEND | File::CREAT))
      logdev.sync = true
      add_log_header(logdev)
      logdev
    end

 

Since Rails uses Logger, the result was that every line of output Rails wrote to the log "piled up" at the beginning of the file, each line overwriting the previous one, as they all were apparently written with the file position reset to 0 each time.

 

Ben

 

BenAArmstrong
Frequent Advisor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write

By the way, I do have a workaround for this issue now, but one I'm not happy with. I have noticed that if I force synchronous writes with define java$file_open_mode 2, the bad file positioning bug vanishes. However, since this is a global control that not only applies to the log file but to all files I might write from Rails, and there's a performance hit to making all writes synchronous, it is a really ugly workaround. It will do for now, but I am seeking a better solution that I can target at just the failing case without such unpleasant side-effects.

 

Also, as I have mentioned in another thread, I have some code which needs to do file sharing. As a quick hack, I would have used java$file_open_mode = 3, except now that I need to put this java$file_open_mode = 2 workaround in place for the sake of correctly writing the log file, I need to use java$filename_match_list instead (though as discussed in that thread, it is a better solution anyway, since it is focused on just the files that need to be shared, and therefore has less of a chance of causing bad side-effects). OK, so now maybe I'm splitting hairs because I already have a workaround for this, but isn't it odd that the two settings have been made mutually exclusive, since it might have been useful to use both together?

 

Ben

 

H.Becker
Honored Contributor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write

In Ruby I would try an fsync. In your initial example:

 

File.open('output.tmp','w'){|_|_.puts 'Line one'}
File.open('output.tmp','a'){|_|_.puts 'two'; _.puts '3'; _.fsync}
File.read('output.tmp')
In the logger example, because the Logger object has no fsync method you may be able to create a logger from a file, like
file = File.open('output.tmp','a')
log = Logger.new(file)
...
file.fsync
log.close
BenAArmstrong
Frequent Advisor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write


@H.Becker wrote:

In Ruby I would try an fsync. In your initial example:

 

File.open('output.tmp','w'){|_|_.puts 'Line one'}
File.open('output.tmp','a'){|_|_.puts 'two'; _.puts '3'; _.fsync}
File.read('output.tmp')
In the logger example, because the Logger object has no fsync method you may be able to create a logger from a file, like
file = File.open('output.tmp','a')
log = Logger.new(file)
...
file.fsync
log.close


Yes, I realize that by not making all writes to the file synchronous with sync=true and then explicitly fsyncing where needed I can get equivalent behaviour without the bug, but this is far too intrusive a change. I need something that makes 'sync=true' work properly on VMS, so that I don't need to hack the core ruby library & any third-party code that uses the construct with VMS-specific exceptions, but something that is not as drastic as forcing all Java writes to be synchronous.

 

Ben

 

H.Becker
Honored Contributor

Re: Incorrect file size returned in JRuby (Java in general?) for file open for write

As far as I understand the documentation, sync=true flushes the Ruby buffers, not the CRTL and/or RMS buffers. fsync calls the CRTL (or OS) supplied fsync, which ensures the data is written to the disk. 

 

Is there a VMS system on the net with JRuby to do some tests?