Operating System - HP-UX

Time taken to do gunzip | tar -xvf

 
Rgomes
Valued Contributor


Dear All,

I have an rx6600 running HP-UX 11i v2 with internal SAS RAID1 across 2x 146GB SAS HDDs. On this server I also have a mount point created on FC SAN storage.
Running the test below (gunzip | tar of a 650MB file) on the two mount points gives very different results:

Test#1: Internal Disk
command used: gunzip -c file1.tar.gz | tar -xvf -
Mount point Local : /internal/test1
Time taken to finish the Job: 28 Min

Test#2: External SAN Disk
command used: gunzip -c file1.tar.gz | tar -xvf -
Mount point Local : /external/test2
Time taken to finish the Job: 6 Min


Info on mount points:

/dev/vg00/lvol9 20971520 14536648 6032724 71% /internal
/dev/vgtest/extest 208896000 29403697 168274092 15% /external

/dev/vg00/lvol9 /internal vxfs ioerror=mwdisable,delaylog,nodatainlog,dev=4000000a 0 0
/dev/vgtest/extest /external vxfs ioerror=mwdisable,largefiles,delaylog,nodatainlog,dev=40020001 0 0

I want to know whether this difference in time for the same gunzip | tar operation on different mount points is normal, or whether the internal disk is taking too much time?

TIA.

Regards,
Richard

Re: Time taken to do gunzip | tar -xvf

28 minutes does seem a while for what is effectively just a 650MB sequential write...
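
For scale, here is a minimal sketch of the throughput those timings imply. It assumes the 650MB figure is the size of file1.tar.gz being read, which the post does not state explicitly; `rate` is a hypothetical helper name.

```shell
#!/bin/sh
# Effective throughput implied by the reported timings.
# Assumption: 650 MB is the archive size; the post does not say whether
# that is the compressed or the extracted size.
rate() {   # rate SIZE_MB MINUTES -> MB/s, two decimals
    awk -v mb="$1" -v min="$2" 'BEGIN { printf "%.2f\n", mb / (min * 60) }'
}

echo "internal: $(rate 650 28) MB/s"
echo "external: $(rate 650 6) MB/s"
```

That works out to roughly 0.39 MB/s on the internal disk and 1.81 MB/s on the SAN, measured against the archive size. Both are far below what a pure sequential write to a healthy disk should manage, so per-file overhead or contention is worth suspecting on both targets.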

But before jumping on that I'd consider:

i) What else was happening on the server when you ran it? Remember that the internal disks are the OS disks so any other OS activity happening at the time could conflict with this... presumably the external SAN disk had nothing else on it.

ii) gzip requires some CPU resource as well to uncompress data - any other heavy CPU activity at the time could cause this.

iii) What's behind the SAN disk? How many spindles does it have to do its work compared to the 2 you have for the internal disks? If you are reading and writing to the same disk that will make a BIG difference.

iv) It's very bad practice to do large sequential I/Os on your system disks anyway - you'll find all your standard OS commands slow down *a lot*
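
One way to separate the CPU, read and write factors above is to time each stage of the pipeline on its own. This is only a sketch: `timed` and `stage_times` are hypothetical helper names, the archive path is whatever you pass in, and it assumes your `date` supports `+%s` (where it does not, the shell's own `time` does the same job with finer resolution).

```shell
#!/bin/sh
# Sketch: time each stage of the extract separately to see where the minutes go.
# Assumes `date +%s` is supported; otherwise substitute the shell's `time`.

timed() {   # timed LABEL COMMAND [ARGS...]: run COMMAND, report whole seconds on stderr
    label=$1; shift
    t0=$(date +%s)
    "$@"
    rc=$?
    t1=$(date +%s)
    echo "$label: $((t1 - t0))s" >&2
    return $rc
}

stage_times() {   # stage_times ARCHIVE DESTDIR (ARCHIVE must be an absolute path)
    archive=$1 dest=$2
    # 1. read + decompress only: no tar, no file creation
    timed "gunzip only" sh -c 'gunzip -c "$1" > /dev/null' sh "$archive" || return 1
    # 2. add tar parsing, still no writes to the target file system
    timed "gunzip + tar -t" sh -c 'gunzip -c "$1" | tar -tf - > /dev/null' sh "$archive" || return 1
    # 3. full extract without -v, so terminal output is not part of the measurement
    timed "full extract" sh -c 'cd "$2" && gunzip -c "$1" | tar -xf -' sh "$archive" "$dest"
}
```

Run it once per mount point, for example `stage_times /internal/test1/file1.tar.gz /internal/test1` (the archive location is an assumption; the post does not say where file1.tar.gz lives). If steps 1 and 2 are quick and step 3 is slow, the time is going into writes on the target file system rather than into decompression.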

HTH

Duncan

I am an HPE Employee
TTr
Honored Contributor

Re: Time taken to do gunzip | tar -xvf

The output of the above command is buffered, so you get bursts of I/O interleaved with CPU work for uncompressing and untarring. And your SAN storage system's cache plays a big role here as well. Most storage systems have 1-2GB cache minimum, so all the writes to the external disk were absorbed by the cache very quickly.

How is your internal disk hardware set up? The rx6600 has a couple of RAID options, or you could have an LVM mirror. As mentioned before, you should not put extra I/O on the system disks. And if you use the internal disk cage, get extra disks for non-OS storage and upgrade the internal RAID controller to the more powerful P600.
Rgomes
Valued Contributor

Re: Time taken to do gunzip | tar -xvf

I was also considering those points; however, I wanted to hear other people's experience and thoughts on the observation.

Thanks for your replies.

Rgds,
Richard

James R. Ferguson
Acclaimed Contributor

Re: Time taken to do gunzip | tar -xvf

Hi Richard:

> I was also considering those points; however, I wanted to hear other people's experience and thoughts on the observation.

I think that Duncan and TTr have certainly covered all of the most important factors.

Duncan's first point is extremely germane; viz. what else was running at the time of your measurement? You should take your measurements under identical (ideally quiescent) conditions.

Regards!

...JRF...
James R. Ferguson
Acclaimed Contributor

Re: Time taken to do gunzip | tar -xvf

Hi (again) Richard:

By the way, you did an excellent job of describing your environment, something all too often lacking in postings. I just re-read this thinking that perhaps your SAN filesystem might have been configured with the Unix buffer cache bypassed using the VxFS mount options 'convosync=direct' and 'mincache=direct' whereas the internal disk mountpoint might not have been. This clearly isn't the case from your description.
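
For anyone wanting to run the same check on their own mount table, here is a minimal sketch. `direct_io_mounts` is a hypothetical helper; it assumes input lines in the same layout as the two quoted in the first post (device, mount point, type, options, two numeric fields).

```shell
#!/bin/sh
# Sketch: print the mount point of every line whose option field contains
# mincache=direct or convosync=direct (buffer cache bypassed).
# Assumed input layout: device mountpoint fstype options freq pass
direct_io_mounts() {
    awk '("," $4 ",") ~ /,(mincache|convosync)=direct,/ { print $2 }'
}

# The two lines from the first post: neither carries a direct option,
# so this prints nothing.
direct_io_mounts <<'EOF'
/dev/vg00/lvol9 /internal vxfs ioerror=mwdisable,delaylog,nodatainlog,dev=4000000a 0 0
/dev/vgtest/extest /external vxfs ioerror=mwdisable,largefiles,delaylog,nodatainlog,dev=40020001 0 0
EOF
```

On a live system, piping the mount table into the function in that same layout (on HP-UX, `mount -p` should produce it) would list any file systems mounted with direct I/O.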

Regards!

...JRF...
Rgomes
Valued Contributor

Re: Time taken to do gunzip | tar -xvf

Thank you JRF. Actually I wanted to express my observation as clearly as possible.

And I am almost (:-)) convinced about the performance difference between the internal and external disk after all of your input, although I did not expect the tests to show that much of a time difference.

Regards,
Richard
Steven E. Protter
Exalted Contributor

Re: Time taken to do gunzip | tar -xvf

Shalom,

This is going to take a lot of time and it strains disks, even RAID 10 SAN disks.

gunzip makes a copy of the tar file. For a while both the tar and .gz exist

It is also fairly CPU intensive and can slow systems down.

Doing this on big files can lead to slowdowns due to a lack of temporary space.

Merely removing the gunzip stage can actually save a lot of time.

Based on your message, I think you are running out of space during the gunzip stage. That's the cause of the long run time, and you may not get good results.

This will be a bit faster on a local disk, preferably not one you use for databases or swap.

SEP
Dennis Handly
Acclaimed Contributor

Re: Time taken to do gunzip | tar -xvf

>SEP: gunzip makes a copy of the tar file. For a while both the tar and .gz exist

Are you sure? I would think it only gunzips on the fly and only the bytes in the pipe are "duplicated".
OldSchool
Honored Contributor

Re: Time taken to do gunzip | tar -xvf

If you need to continue analyzing this issue, I'd recommend "iozone" for additional tests.

see:
http://www.iozone.org/
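
If iozone is not installed, a much cruder stand-in is to time a plain sequential write with `dd` on each mount point. This is a swapped-in technique, not an iozone equivalent: it measures one large write only, whereas a tar extract creates many files. `seq_write_test` is a hypothetical helper, and the sketch assumes `/dev/zero` exists and `date +%s` is supported.

```shell
#!/bin/sh
# Crude sequential-write check: write SIZE_MB of zeros into DIR, flush, report seconds.
# Not a substitute for iozone; it only exercises one large sequential write.
seq_write_test() {   # seq_write_test DIR SIZE_MB
    dir=$1 mb=$2
    f="$dir/ddtest.$$"
    t0=$(date +%s)
    dd if=/dev/zero of="$f" bs=1024k count="$mb" 2>/dev/null || return 1
    sync                      # ask the OS to flush buffered writes before stopping the clock
    t1=$(date +%s)
    rm -f "$f"
    echo "wrote $mb MB to $dir in $((t1 - t0))s"
}
```

For example, `seq_write_test /internal/test1 650` followed by `seq_write_test /external/test2 650`, run while the system is otherwise idle, gives a rough like-for-like comparison. Storage-array cache will still flatter the SAN figure.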
Rgomes
Valued Contributor

Re: Time taken to do gunzip | tar -xvf

Hi SEP,

"Doing this on big files can lead to slowdowns due to a lack of temporary space."

2 questions from my end:

Q1) What is the definition of "big files"? Relative to what: number of CPUs, amount of physical memory, or disk space? Or is there a general rule, e.g. that >500MB is a big file?

Q2) Temporary Space: is it within gunzip & tar's pwd?

Regards,
Richard

Re: Time taken to do gunzip | tar -xvf

Richard,

I don't think SEP is correct on this one - a quick play with tusc to trace system calls soon shows that none of tar, gunzip, or the underlying shell uses temporary files.

The shell handles all the movement of data through a normal interprocess FIFO pipe ("man pipe" for more details)
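
A small self-contained sketch of the same point, using a throwaway archive in a scratch directory (all paths here are made up for the demo). It only shows the end state on disk; a system call trace is what confirms nothing temporary is created along the way. Note that plain `gunzip file1.tar.gz` without `-c` does write a .tar file to disk, which may be where the confusion came from.

```shell
#!/bin/sh
# Sketch: `gunzip -c archive | tar -xf -` leaves no intermediate .tar on disk.
work=${TMPDIR:-/tmp}/pipe-demo.$$
mkdir -p "$work/src" "$work/out"
echo "payload" > "$work/src/data.txt"

# Build a tiny .tar.gz to play with.
( cd "$work/src" && tar -cf - data.txt | gzip -c > "$work/demo.tar.gz" )

# Same pipeline as in the original post, minus -v.
( cd "$work/out" && gunzip -c "$work/demo.tar.gz" | tar -xf - )

# List what is on disk afterwards: the untouched .gz plus the extracted file,
# and no *.tar anywhere.
( cd "$work" && find . -type f | sort )
echo "stray .tar files: $(find "$work" -name '*.tar' | wc -l)"
# rm -rf "$work"   # clean up when done
```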

Have a look at the attachment for details

HTH

Duncan

I am an HPE Employee
TwoProc
Honored Contributor

Re: Time taken to do gunzip | tar -xvf

Where does file1.tar.gz live? Is it in a single location, or did you have a copy on each file system before you ran the gunzip? If you unzipped both from a single file, then the second test came from cache.

And even if you did have the two files on the different file systems, when did you copy the file? After the first test? If so, then it certainly came from cache, even though you copied it over to the new file system. In fact, even if you copied it over to the other file system before the test, it was in cache from some place, if not multiple places, if the storage system wasn't used for anything else.

When you copied the file over to the storage system, the file was placed in the OS buffer cache, in the cache for the storage system (which can be huge, mine are 64 GB in size), and can also be cached all the way down to the hard drive mechanisms themselves, if nothing else was running.

Naturally, a file system mounted on a storage server that isn't doing anything else would have the highest caching around for your test. Much more than a disk which is running your OS, certainly. And even if your file system, OS, file server, etc. were doing some things, you still could have some or all of your file coming from cache.

The answer, like many things out there, is "it depends."

Lastly, I'll add that in most cases, a SAN environment is faster than direct attached. It's supposed to be, unless you're buying a little bitty small one.
We are the people our parents warned us about --Jimmy Buffett