Operating System - OpenVMS
VMS Unzip doesn't adjust for GMT

 
Hoff
Honored Contributor

Re: VMS Unzip doesn't adjust for GMT

This project involves incrementally building your own distributed version control system (DVCS; distributed source code control).

Part of this homegrown DVCS is balanced entirely atop zip compression.

Why "balanced"? Any change within zip that touches the format or the archive headers within the zip archives (which is nearly any change) will cause the checksum calculations to diverge; it locks the scheme into specific zip versions. On both ends. An OS upgrade or a patch on the Unix box will derail it, for instance.

(Or are you checksumming the zip listings?)

One potential approach here (toward building your own zip-based DVCS) would be to checksum the individual files, and then to zip both the files and the per-file checksums on each platform, so that the checksums are associated with, and stored within, each zip archive. This frees you from dependencies on zip itself, and any recipient of the transfer can verify the checksums against the files. (I'd also look to embed a shell script in the zip that performs this comparison, and would invoke it under bash on Unix and under GNV bash on OpenVMS.)
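
As a rough illustration of the per-file checksum idea (a sketch only, not anything taken from the original poster's setup): the function names `build_manifest` and `verify_manifest` and the choice of SHA-256 are assumptions made for this example. The manifest text could be stored as one more member of the zip, so the check no longer depends on how zip wrote its headers. It does assume the file bytes themselves survive the transfer unchanged, which is worth confirming for text files on OpenVMS.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Return manifest text: one '<digest>  <relative/path>' line per file.

    Paths are written POSIX-style and sorted, so the same tree gives the
    same manifest regardless of platform or directory walk order.
    """
    root = Path(root)
    rels = sorted(p.relative_to(root).as_posix()
                  for p in root.rglob("*") if p.is_file())
    return "".join(f"{file_sha256(root / rel)}  {rel}\n" for rel in rels)


def verify_manifest(root, manifest_text):
    """Return the list of relative paths that are missing or do not match."""
    root = Path(root)
    bad = []
    for line in manifest_text.splitlines():
        digest, rel = line.split("  ", 1)
        target = root / rel
        if not target.is_file() or file_sha256(target) != digest:
            bad.append(rel)
    return bad
```

The sender would call `build_manifest` on the source tree before zipping; the recipient would unzip and call `verify_manifest` on the extracted tree. Time stamps, extra fields and zip versions play no part in the result.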

Regardless, a longer-term approach here is to install or migrate to Mercurial (Hg), Subversion (SVN), or another source code control environment, and the associated clients. There are various solid packages around; good, efficient, complete and robust solutions. And these solutions let you get back to working on your product.

I specifically mention Hg and SVN as there are OpenVMS clients for both. There are other good DVCS packages around, and those too may have clients for OpenVMS.
Steven Schweda
Honored Contributor

Re: VMS Unzip doesn't adjust for GMT

> Zip archives contain a local (DOS) date-time
> value for each member, so I gather that
> you're fooling Zip with the same TZ-setting
> trick (as described above) if you're able to
> create identical archives from files which
> have different actual local date-time values.

And, it occurs to me, if you're creating all
these Zip archives on UNIX(-like) systems,
and if you're fiddling with the time-zone
settings there to get the local date-time
values to match, then wouldn't that still
solve the problem even without tossing out
the UTC date-time info?

There's a conflict between a VMS system
(which is based on local time) and a
UNIX(-like) system (which is based on UTC),
but I don't immediately see why two
UNIX(-like) systems with the same time-zone
settings (to equalize the local date-time
values) would do anything different with the
UTC date-time extra fields. If I'm missing
something, please feel free to enlighten me.
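
To make the local-versus-UTC distinction concrete, here is a small sketch of how the DOS date and time words in a Zip member header are packed from a local wall-clock time. The function name `dos_datetime_fields` and the fixed UTC offsets are assumptions for illustration; real systems apply their own time-zone and DST rules.

```python
from datetime import datetime, timedelta, timezone


def dos_datetime_fields(epoch_seconds, utc_offset_hours):
    """Pack the local time at the given UTC offset into DOS date/time words.

    DOS date: bits 15-9 year since 1980, bits 8-5 month, bits 4-0 day.
    DOS time: bits 15-11 hour, bits 10-5 minute, bits 4-0 seconds / 2.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    t = datetime.fromtimestamp(epoch_seconds, tz)
    dos_date = ((t.year - 1980) << 9) | (t.month << 5) | t.day
    dos_time = (t.hour << 11) | (t.minute << 5) | (t.second // 2)
    return dos_date, dos_time


# One instant, expressed as UTC seconds since the Unix epoch.
instant = datetime(2009, 11, 21, 13, 45, 32, tzinfo=timezone.utc).timestamp()

europe = dos_datetime_fields(instant, +1)   # assumed offset, e.g. CET
us_east = dos_datetime_fields(instant, -5)  # assumed offset, e.g. EST

print(europe, us_east, europe == us_east)
```

The UTC value is the same number on both systems, but the packed local fields differ unless both systems use the same time-zone setting when the archive is created.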
Douglas Rupp
Advisor

Re: VMS Unzip doesn't adjust for GMT

Here's an edited (see square brackets) excerpt from a discussion that might help clarify. I'm ">>" by the way.

You may gather that we are going to have to agree to disagree on this issue, and find a way to work around it (we have one in mind).

-------------------------------------------

> > The content is the bits in the source files which are identical. It
> > seems like this is putting an artificial requirement on the container
> > file.

[This seems like a conceptual inconsistency in zip]. If the contents are the
same, creating a container with those contents should be an idempotent
operation; the result should not depend on the time of day [....]

> > I'd like to understand the technical reason for that

We're working on tracing content from SVN all the way to distributed
binaries. We checksum source files to make sure the same file was used by
all machines on the same build. When we distribute a binary ... we need to identify the source files that go with it, and a checksum is the best way.

We used to create packages on one side [of the Atlantic] and transfer them to the other, which is a big waste of time (and a [single point of failure]). We've upgraded our infrastructure so we can build packages on both sides, so if the trans-Atlantic network connection is down both offices can still do nightly builds with the same sources -- easily verified with checksums.

Having two different checksums for the same contents adds confusion and does not fit in well with [this] new infrastructure [scheme].
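
For comparison, one way to get a container whose bytes depend only on its contents is to pin every piece of header metadata when writing it. The sketch below uses Python's `zipfile` module rather than the Info-ZIP tools discussed in this thread, so it is a different technique from anything the posters describe; `deterministic_zip` and `FIXED_TIME` are names invented for this example. It stores members uncompressed so that the result does not depend on the compression library, and byte-for-byte equality is only expected when the same `zipfile` implementation is used on both sides.

```python
import hashlib
import io
import zipfile

# Arbitrary fixed member time stamp (the earliest the DOS format can hold).
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def deterministic_zip(files):
    """Build a zip from a dict of {archive_name: bytes} and return its bytes.

    Member order, time stamps, host-system code and permissions are all
    pinned, so identical contents give identical archive bytes.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            info.compress_type = zipfile.ZIP_STORED
            info.create_system = 3           # report "Unix" on every host
            info.external_attr = 0o644 << 16  # fixed permission bits
            zf.writestr(info, files[name])
    return buf.getvalue()


first = deterministic_zip({"b.c": b"int b;\n", "a.c": b"int a;\n"})
second = deterministic_zip({"a.c": b"int a;\n", "b.c": b"int b;\n"})
print(hashlib.md5(first).hexdigest() == hashlib.md5(second).hexdigest())
```

The cost is that the archive no longer carries the real modification times, which is the same trade-off as stripping the time-stamp data with Zip itself.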
Steven Schweda
Honored Contributor

Re: VMS Unzip doesn't adjust for GMT

> You may gather that we are going to have to
> agree to disagree on this issue,

Perhaps, but as I said in my posting of
Nov 21, 2009 13:45:32 GMT, I don't see why,
with suitable time-zone fiddling, two
UNIX(-like) systems couldn't create identical
Zip archives which _include_ UTC date-time
data, and those data should solve the bad
date-time values on a VMS system.

> and find a
> way to work around it (we have one in mind).

Now you're scaring me, but that may be caused
by my overactive imagination combined with a
lack of actual information.
Douglas Rupp
Advisor

Re: VMS Unzip doesn't adjust for GMT

Could you please elaborate on the "time zone fiddling" suggestion?
Hoff
Honored Contributor

Re: VMS Unzip doesn't adjust for GMT

>Could you please elaborate on the "time zone fiddling" suggestion?

I read that as a comment indicating that two Unix boxes could likely be configured to encounter the same md5 mismatch; that the issue being chased here between the Unix box and the OpenVMS box appears possible between two Unix boxes, without OpenVMS involved at all. The md5 mismatches can arise between almost any two systems, due to differences in their timekeeping settings.

And it could well be possible to see this md5 stuff get stomped on OpenVMS due to a DST switch-over, for instance.
Steven Schweda
Honored Contributor

Re: VMS Unzip doesn't adjust for GMT

> Could you please elaborate on the "time
> zone fiddling" suggestion?

Hey. It wasn't _my_ idea. You started it:

Nov 20, 2009 19:26:34 GMT
> [...]
> On Unix, it is possible to get the proper
> timestamp back by setting TZ
> variable before calling unzip. [...]

What I don't understand is this:

Nov 21, 2009 02:49:26 GMT
> The zip files don't match unless the extra
> timestamp info is stripped.

My claim (subject to contrary evidence) is
that if you're on a UNIX(-like) system, then
the UTC date-time data stored in a Zip
archive should always be the same, so if,
when you _create_ the archive, you fiddle
with the time zone to get the local times to
match, then all the date-time data in the
archive should match. I don't see what
stripping out the UTC data buys you (other
than trouble, when you unpack the stuff on a
VMS system).

I think that you should be able to:

1. Create a (tiny test) archive in Europe,
on a UNIX(-like) system, without using "-X".

2. Ship it to the US, and extract the files
on a UNIX(-like) system.

3. Re-create the archive in the US, on a
UNIX(-like) system, using the same Zip
version, from those extracted files, but with
an artificially European time-zone setting.

I don't see why the date-time data in this
re-created archive should differ from those
in the original archive. Show me actual
data to persuade me otherwise.
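
One way to collect the "actual data" for the experiment above is to dump the date-time values from both archives and compare them member by member. This is only a sketch: `ut_mtime`, `member_times` and `diff_times` are names made up here, and the parsing assumes the Info-ZIP "UT" extended time-stamp extra field (header ID 0x5455) with its modification time present, read as a signed 32-bit value.

```python
import struct
import zipfile


def ut_mtime(extra):
    """Return the UTC modification time from a 'UT' extra field, or None."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        body = extra[pos + 4:pos + 4 + size]
        # Body layout: one flags byte, then mtime if flag bit 0 is set.
        if header_id == 0x5455 and len(body) >= 5 and body[0] & 1:
            return struct.unpack_from("<i", body, 1)[0]
        pos += 4 + size
    return None


def member_times(archive):
    """Map member name -> (local DOS date-time tuple, UTC mtime or None)."""
    with zipfile.ZipFile(archive) as zf:
        return {info.filename: (info.date_time, ut_mtime(info.extra))
                for info in zf.infolist()}


def diff_times(archive_a, archive_b):
    """Return {name: (times_in_a, times_in_b)} for members that differ."""
    ta, tb = member_times(archive_a), member_times(archive_b)
    return {name: (ta.get(name), tb.get(name))
            for name in sorted(set(ta) | set(tb))
            if ta.get(name) != tb.get(name)}
```

Running `diff_times` on the European original and the US re-creation would show directly whether the local values, the UTC values, or neither differ. An empty result means the date-time data match and any remaining checksum mismatch lies elsewhere in the archive.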

An interesting problem arises involving the
date-time data for directories. Currently,
UnZip will restore the date-time for a
directory, even though the new directory is
actually created anew, not restored from a
directory which was stored in the archive.
Consider, for example, a case where only some
of the files in a directory are stored in an
archive. When this archive is expanded, the
re-created directory will have contents which
differ from those of the original directory,
so it makes little sense for the date-time of
the re-created directory to match that of the
original directory. (I'm trying to persuade
the UnZip maintainer to add an option to fix
this, but it's not in there yet.) Note that
VMS BACKUP (not /IMAGE) gives you new-looking
directories, not date-time-restored
directories, which makes more sense than the
current behavior of UnZip (or, I believe,
"tar").


Now, there are many other data in a Zip
archive which might cause trouble in a
situation like this. The Zip archive format
is described here:

http://www.pkware.com/documents/casestudies/APPNOTE.TXT

and it includes many things related to the
host system type, the file system type, the
Zip program version, file ownership and
permissions, and so on, but these could be
controlled, probably adequately.

That said, basing such a data integrity check
on the potentially volatile fine print in a
Zip archive would probably not be my choice.
Nothing particularly superior is leaping to
mind, however.
Willem Grooters
Honored Contributor

Re: VMS Unzip doesn't adjust for GMT

Stupid idea, perhaps:

If you zip/unzip on either side of the big pond on systems that have the SAME time zone (UTC, GMT, CET, WCT - whatever), just for distribution, would that not bypass the whole problem?
Willem Grooters
OpenVMS Developer & System Manager
Steven Schweda
Honored Contributor

Re: VMS Unzip doesn't adjust for GMT

> [...] would that not bypass the whole
> problem?

I claim that it would solve it, assuming
UNIX(-like) systems, which are based on UTC.
I'm waiting for some evidence to the
contrary.

If you have a VMS system where local time
equals UTC, then I'd expect it to match a
UNIX(-like) system with a UTC time-zone, too,
but that's a more restrictive set of
requirements.