cat vs. dd, gzip file size

 
Gilbert Standen_1
Frequent Advisor

cat vs. dd, gzip file size

I have a script which does a hot backup of an oracle database. I currently use:
cat dbfile | gzip > dbfile.gz
Recently saw that an alternate way is to use:
dd if=dbfile | gzip > dbfile.gz
It was said that using cat can cause corruption of the file. Is "dd" a better way to do this? What is the file size limit when using "dd"? Does gzip 1.2.4 (18 Aug 93) have a 2GB file size limit?

Finally, my last question: when unzipping, e.g. cat dbfile.gz | gunzip > dbfile, is there a way to prompt before overwriting if dbfile is already there in the destination? And similarly, to use dd going the other way (unzip), would I use:

dd if=dbfile.gz | gunzip > dbfile

Are there yet other ways to do this that are better? Also, will one way be faster? And if dd is better, are there switches on dd that can be used with the Veritas file system that will make it faster?
If I could take one thing with me into the next world it would be my valid login to HP ITRC Forums
Patrick Wallek
Honored Contributor

Re: cat vs. dd, gzip file size

I would bypass cat or dd altogether.

You could do:

# gzip < /path/to/dbfile > /path/to/compressed/dbfile.gz

This will take dbfile as input and output it to dbfile.gz.

Yes, gzip 1.2.4 has a 2GB limit. Gzip 1.2.4a and later overcome this, if I recall correctly.

To do the reverse and gunzip the file:

# gunzip < /path/to/compressed/dbfile.gz > /path/to/dbfile

Note that this will NOT prompt you if the output file already exists; the shell redirection simply overwrites it.
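
One way to at least avoid a silent overwrite is the shell's noclobber option (set -C), which makes a plain ">" redirection fail when the target already exists. A minimal sketch, assuming a POSIX shell; gunzip_noclobber is a made-up helper name, not part of gzip:

```shell
#!/bin/sh
# Sketch only: gunzip_noclobber is a hypothetical helper, not a gzip feature.
# It decompresses $1 into $2 but refuses to overwrite an existing $2.
gunzip_noclobber() {
    src=$1
    dest=$2
    (
        # noclobber: "> file" fails if the file already exists.
        # Set inside a subshell so the caller's shell options are untouched.
        set -C
        gunzip < "$src" > "$dest"
    )
}
```

Called as gunzip_noclobber /path/to/compressed/dbfile.gz /path/to/dbfile, it returns non-zero and leaves the existing file alone if /path/to/dbfile is already there. It does not prompt; it only refuses.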
Steven E. Protter
Exalted Contributor

Re: cat vs. dd, gzip file size

gzip STILL has a 2 GB filesize limit and your methodology will not change that.

if [ -f "$filename" ]
then
    echo "file exists, remove it? (Y/N)"
    read aa
    if [ "$aa" = "Y" ]
    then
        rm "$filename"
    fi
fi
# only proceed if the file is not (or no longer) there
if [ ! -f "$filename" ]
then
    : # do the gzip thing
fi

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Sridhar Bhaskarla
Honored Contributor

Re: cat vs. dd, gzip file size

Hi,

I don't know why you would want to use dd or cat unless you want to preserve the original file.

gzip dbfile

If you want to preserve the file then you would do

gzip -c dbfile > dbfile.gz

The same applies to gunzip.

Look at the man page of gzip for more options.

Yes, your gzip has a 2 GB file limit. You can get a better one from the porting center.

http://hpux.cs.utah.edu/hppd/hpux/Gnu/gzip-1.3.5/

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: cat vs. dd, gzip file size

Cat or dd should NOT corrupt the file, but there is really no need for either; simply redirect stdin.

gzip < dbfile > dbfile.gz
BUT there is a downside to this in that the output and input files are on the same device, indeed the same filesystem. A better answer would be to read stdin on one device and write it on another device so that the I/O subsystem is not saturated.

e.g.

cd /u01/mydata
gzip < dbfile > /u02/mydata/dbfile.gz
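
If you want to convince yourself that cat, dd and plain redirection all give gzip the same bytes, a rough check is to compress a file all three ways and compare checksums of the decompressed output against the original. This is only a sketch: compare_methods is a made-up helper name, and it assumes a mktemp that accepts -d, which your platform's mktemp may not.

```shell
#!/bin/sh
# Sketch only: compare_methods is a hypothetical helper.
# Assumes "mktemp -d" is available for a scratch directory.
compare_methods() {
    src=$1
    work=$(mktemp -d) || return 1

    # The three ways of feeding gzip discussed in this thread.
    cat "$src" | gzip > "$work/cat.gz"
    dd if="$src" 2>/dev/null | gzip > "$work/dd.gz"
    gzip < "$src" > "$work/redir.gz"

    # Checksum of the original, read from stdin so no filename is printed.
    want=$(cksum < "$src")
    status=0
    for f in "$work/cat.gz" "$work/dd.gz" "$work/redir.gz"
    do
        got=$(gunzip < "$f" | cksum)
        if [ "$got" != "$want" ]
        then
            echo "mismatch: $f" >&2
            status=1
        fi
    done

    rm -rf "$work"
    return $status
}
```

compare_methods dbfile returns 0 when all three round-trips match the original. Try it on a small test file first, since it writes three compressed copies.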
If it ain't broke, I can fix that.
Gilbert Standen_1
Frequent Advisor

Re: cat vs. dd, gzip file size

All replies very much appreciated!

cd /u01/mydata
gzip < dbfile > /u02/mydata/dbfile.gz

Does the 2GB file limit still apply? It seems from the responses that it does!

Assuming an EMC Symmetrix/DMX hooked up to my HP rp7410 or L2000, does this sort of consideration still apply? The manufacturers of products like EMC will say that the hardware/software of their product handles all I/O bottlenecks auto-magically, but I still have the strong instinct to do as suggested above and offload some of the I/O to another VG.

The if/then solution is clear; I of course should have thought of that. Nevertheless, being sometimes of small brain, I always deeply appreciate those who point out the obvious, which I often miss while clearly seeing the minutiae...

Gil
Patrick Wallek
Honored Contributor

Re: cat vs. dd, gzip file size

Yes, the 2GB limit will still apply. If your gzip file hits 2GB your version of gzip will generate an error.
Bill Hassell
Honored Contributor

Re: cat vs. dd, gzip file size

If the files will ever be larger than 2GB, you'll have to upgrade your version of gzip. You'll need version 1.2.4 with patches or 1.3. NOTE: a gzip archive of a large file may not be readable on another system unless gunzip is upgraded there as well.
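
Until gzip is upgraded, a backup script could at least check the file size before compressing. A minimal sketch, assuming the shell's arithmetic can hold values of 2^31 and above (some older shells are limited to 32-bit integers); exceeds_limit is a made-up helper name:

```shell
#!/bin/sh
# Sketch only: exceeds_limit is a hypothetical helper.
# Returns 0 if the file is at least LIMIT bytes; LIMIT defaults to 2 GB.
# Assumption: the shell's integer arithmetic is wider than 32 bits.
exceeds_limit() {
    file=$1
    limit=${2:-2147483648}
    # Arithmetic expansion strips the padding some wc versions print.
    size=$(( $(wc -c < "$file") ))
    [ "$size" -ge "$limit" ]
}

# Example use in a backup script:
#   if exceeds_limit dbfile
#   then
#       echo "dbfile is 2 GB or larger; check the gzip version first" >&2
#   fi
```

Note that wc -c may read the whole file on some systems, so on a multi-gigabyte datafile the check itself can take a while.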


Bill Hassell, sysadmin
Gilbert Standen_1
Frequent Advisor

Re: cat vs. dd, gzip file size

Ok, I will check into what is the most recent version of gunzip. It's appreciated !
Gil
Gilbert Standen_1
Frequent Advisor

Re: cat vs. dd, gzip file size

Note: we upgraded to gzip 1.3.5 and the time required for the backup was cut in half (!), with all other load on the box the same as before. It seems that improvements to the code have sped gzip up!