Operating System - HP-UX

Re: binary files differ by time

 
Jun Zhang_4
Regular Advisor

binary files differ by time

The following sequence shows the dilemma on HP-UX 11.00 and 11i, with both cc and gcc. One of my application procedures requires that cksum give the same result for each build. Is there a way to make the builds conform?

Jun

cc hello.c
mv a.out b.out
cc hello.c
diff a.out b.out
Binary files a.out and b.out differ
# cksum a.out
1115578088 20732 a.out
# cksum b.out
2899230371 20732 b.out

Food lover
11 REPLIES 11
A. Clay Stephenson
Acclaimed Contributor

Re: binary files differ by time

This is not easily "doable". Cksum and sum don't care about the file's metadata (ownership, group, mode, and timestamps); these commands simply don't even look at the metadata in the inode. However, your problem does have to do with timestamps --- just not the ones that you are able to see with ls.
You can convince yourself of this by doing this:
cp a.out b.out
cksum a.out b.out

Even though the files have different timestamps (and the owner/group could even be different, if you like), the cksums are identical. Again, cksum, sum, and diff don't care about a file's metadata.


The timestamp that is giving you problems is embedded in the executable (and indeed in the object files that built the executable).
There is a struct (file_time) within the file header of every object file that carries the time of assembly or linking. Because to cksum this is DATA rather than METADATA, your task is hopeless. You could write a C program to modify this file_time data, but that is the only way to accomplish what you are trying to do.

Man a.out for details.
If it ain't broke, I can fix that.
Jun Zhang_4
Regular Advisor

Re: binary files differ by time

Thank you for your reply.
So far I have seen this problem only on HP-UX.
Do you think compiling cc or gcc from source would make a difference? Some insight into what HP did to make cksum see different things would also be helpful.
Also, what is the difference between an ELF file and a PA-RISC executable, and can one generate an ELF file instead of a PA-RISC one on HP-UX?
With the same procedure on a Linux machine, I got perfect agreement for diff and cksum.

Jun
Food lover
A. Clay Stephenson
Acclaimed Contributor

Re: binary files differ by time

This is not really a "problem". Cksum is just doing what it is supposed to do. The files are different. It is common for many implementations to leave the two-word sys_clock struct zeroed, indicating that the data are unused, and that is why "cksum" works for other implementations. I don't know about ELF formats, nor have I tried gcc. The problem is that you are using a tool that looks for any differences when you really want it to look only for significant differences. Cksum, sum, diff, et al. have no way of doing that.
Moreover, even if the executables are identical, that still does not mean that the run-times are identical because different versions of shared libraries might be employed.

If it ain't broke, I can fix that.
Laurent Menase
Honored Contributor

Re: binary files differ by time

just try a

cmp -l a.out b.out

you will see the 4 words of modification.
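
If you want the same information from inside a program, here is a rough sketch of what cmp -l reports: it counts the byte positions where two files differ. The function name count_diff_bytes is made up for illustration, and the report format only approximates cmp's.

```c
#include <stdio.h>

/* Count the byte positions at which two files differ. For positions present
 * in both files, print the 1-based offset and both bytes in octal, roughly
 * like `cmp -l`. Extra bytes in the longer file count as differences too.
 * Returns the number of differing bytes, or -1 if a file cannot be opened. */
long count_diff_bytes(const char *path_a, const char *path_b, FILE *report)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    long offset = 0, ndiff = 0;
    int ca, cb;

    if (fa == NULL || fb == NULL) {
        if (fa) fclose(fa);
        if (fb) fclose(fb);
        return -1;
    }
    for (;;) {
        ca = getc(fa);
        cb = getc(fb);
        if (ca == EOF && cb == EOF)
            break;
        offset++;
        if (ca != cb) {
            ndiff++;
            if (report != NULL && ca != EOF && cb != EOF)
                fprintf(report, "%6ld %3o %3o\n", offset,
                        (unsigned)ca, (unsigned)cb);
        }
    }
    fclose(fa);
    fclose(fb);
    return ndiff;
}
```

Calling count_diff_bytes("a.out", "b.out", stdout) on the two builds should list only the handful of bytes that carry the embedded times.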
Laurent Menase
Honored Contributor

Re: binary files differ by time

you can do a
odump -auxheader a.out |grep when_linked
when_linked = Wed Mar 10 2004 06:15:31.000000000 MET
when_linked = Wed Jun 05 2002 19:39:30.000000000 METDST

odump -compunit a.out |grep Time
odump -compunit b.out |grep Time

The best way to differentiate executables is to use a what string:
char ident[] = "@(#)identification information";
then you can differentiate with
what a.out
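
To make the mechanism concrete, here is a minimal sketch of the kind of scan what(1) performs: find the "@(#)" marker and print what follows up to the first ", >, newline, backslash, or NUL. The function name what_scan and the version text in ident are invented for illustration; this is not the real what source.

```c
#include <stddef.h>
#include <string.h>

/* The marker string compiled into the executable (hypothetical text). */
char ident[] = "@(#)hello.c 1.1 (hypothetical version text)";

/* Find the first "@(#)" marker in buf and copy the text after it into out,
 * stopping at the first '"', '>', newline, backslash, or NUL.
 * Returns 1 if a marker was found, 0 otherwise. */
int what_scan(const unsigned char *buf, size_t len, char *out, size_t outsz)
{
    static const char marker[] = "@(#)";
    size_t i, j, n = 0;

    if (outsz == 0)
        return 0;
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, marker, 4) != 0)
            continue;
        for (j = i + 4; j < len && n + 1 < outsz; j++) {
            unsigned char c = buf[j];
            if (c == '"' || c == '>' || c == '\n' || c == '\\' || c == '\0')
                break;
            out[n++] = (char)c;
        }
        out[n] = '\0';
        return 1;
    }
    return 0;
}
```

Because the string is ordinary data in the binary, two builds with the same ident text report the same thing, whatever the embedded link times say.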

Jun Zhang_4
Regular Advisor

Re: binary files differ by time

Now that we can differentiate executables compiled at different times from the same source, command-line options, and user environment (I appreciate it!),
is there a way to eliminate the difference, either during or after compilation? That was actually my purpose in raising the question.

Jun

Food lover
A. Clay Stephenson
Acclaimed Contributor

Re: binary files differ by time

In principle, you could rewrite the filehdr, which begins at offset 0, then examine the aux headers, which begin at the file offset specified by the aux_header_location field in the header, and "fix up" each aux_header.
Note that aux_headers come in many different "flavors", so this is a bit tricky. I suspect, though I haven't bothered to test this, that the real differences are in the aux headers, because using od (and assuming I can count bytes) it appears that the file_time values are zeroed in the header.

Is this potentially dangerous? You bet. Is it in any sense useful? Beats me. Do the risks outweigh the benefits? I doubt it.

You need to man a.out and read it this time and then look at /usr/include/filehdr.h. That will get you started.

If this were me, I would put in the ident[] global string in one of the objects and trust it.

If you are determined to do this then I would rethink the whole deal. You need to come up with a "checkexe" command that is platform specific. On most platforms it might just be a cksum wrapper. On HP-UX, you might open the exe, read the header and from that learn how many bytes in aux_headers you have and skip to that offset. Read the file from the end of the aux_headers to EOF and pipe that output to cksum. Now you ought to have two cksums that do compare. This still sounds like far more trouble than it is worth and you are still not addressing shared libraries.
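
The "skip the headers, checksum the rest" idea could be sketched as below. This is only the copying half: the skip offset is passed in by the caller, and working out where the aux_headers end from the file header is deliberately not shown. The name copy_tail is an assumption for illustration.

```c
#include <stdio.h>

/* Copy everything from byte offset `skip` to end of file from `in` to `out`.
 * `skip` must be supplied by the caller; deriving it from the executable's
 * header is not attempted here.
 * Returns the number of bytes copied, or -1 on a seek, read, or write error. */
long copy_tail(FILE *in, long skip, FILE *out)
{
    unsigned char buf[4096];
    size_t n;
    long total = 0;

    if (fseek(in, skip, SEEK_SET) != 0)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    return ferror(in) ? -1 : total;
}
```

Wrapped in a small main that opens the executable and writes to stdout, the output can be piped straight to cksum, so both builds are summed over the same region. It still says nothing about the shared libraries picked up at run time.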

If it ain't broke, I can fix that.
Jun Zhang_4
Regular Advisor

Re: binary files differ by time

I don't see the point of adding the line
char ident[] = "@(#)identification information";
to hello.c. After adding it,
what a.out
just gives me one more line, and the same for b.out. Where is the further differentiation between the two?

Jun
Food lover
Laurent Menase
Honored Contributor

Re: binary files differ by time

You change the ident string each time you modify the source.
A. Clay Stephenson
Acclaimed Contributor

Re: binary files differ by time

ident serves as the "version of the file". It does require discipline on the part of the programmer to maintain it properly. Another method is to adopt the convention,

myexe -V

When -V is supplied, print the version on stdout and exit. You have to supply the code to process the -V and adopt a convention for setting and maintaining a version string.

The -V convention also has the advantage that you can simply ask the user to execute a filename -V and read to you what it says.
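
A minimal sketch of the -V handling, assuming a version string that also carries the "@(#)" marker so what can find it. The string contents and the function name handle_version_flag are invented for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical version string; the "@(#)" prefix keeps it visible to what(1). */
static const char version[] = "@(#)myexe 1.0";

/* If any argument is exactly "-V", print the version (without the marker)
 * to `out` and return 1 so the caller can exit; otherwise return 0. */
int handle_version_flag(int argc, char **argv, FILE *out)
{
    int i;

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-V") == 0) {
            fprintf(out, "%s\n", version + 4);  /* skip the "@(#)" marker */
            return 1;
        }
    }
    return 0;
}
```

In main, a line such as "if (handle_version_flag(argc, argv, stdout)) return 0;" placed before any other argument processing is enough.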

If it ain't broke, I can fix that.
Laurent Menase
Honored Contributor

Re: binary files differ by time

I just made a script that copies the compunit and aux_header from one file into a copy of the second one,
then runs cmp.
This may help for some files, but with no guarantee.

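#!/usr/bin/ksh
# Usage: <script> file1 file2
F1=$1
F2=$2
F2TMP=${2}tmp$$
# Work on a copy of the second file so the original is left untouched.
cp "$F2" "$F2TMP"
# For each aux_header/compiler line of the header dump: a is used as a
# scratch file name, b as the byte offset, c as the byte count.
odump -header "$F1" |grep -e aux_header -e compiler |while read a b c d
do
# Extract that region of F1 into the scratch file named $a ...
eval $(printf "dd bs=1 count=%d skip=%d of=%s if=%s\n" $c $b $a $F1)
# ... and write it at the same offset in the copy (1<> opens without truncating).
eval $(printf "dd bs=1 seek=%d if=%s 1<>%s\n" $b $a $F2TMP)
done

# With those regions made identical, any remaining difference is real.
if cmp "$F1" "$F2TMP"
then
echo same
else
cmp -l "$F1" "$F2TMP"
fi
rm "$F2TMP"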

Laurent