Danny Fang
Frequent Advisor

UNIX tar limitation

Hi,

I encountered a scenario in which about 10,000 files need to be tarred. However, the tar utility simply fails to archive them.

I'm running the tar utility from a bash shell.

Could anyone tell me whether there is a limit on the number of files the tar utility can handle? Also, what shell tweaking is needed for tar to handle this many files?

Thanks in advance.
Danny
Stefan Schulz
Honored Contributor

Re: UNIX tar limitation

Hi,

I have already used tar for more than 10,000 files.

Did you try to tar more than 2 GB of data? There is a 2 GB limit.

Or did you use wildcards in your tar command? With that many files, wildcards will get you an error from the shell, which tries to expand the wildcard into a list of files. In that case, try running tar on a directory instead of a file list:

instead of tar cf mytar.tar /your/dir/* use tar cf mytar.tar /your/dir
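To check whether the shell expansion itself is the culprit, you can force an external command to receive the expanded list (a minimal illustration; "Arg list too long" is the usual wording of the kernel's E2BIG error, but the exact text varies by shell and OS):

/usr/bin/echo /your/dir/* > /dev/null

If this prints "Arg list too long" instead of succeeding, tar is failing for the same reason.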

Regards Stefan
No Mouse found. System halted. Press Mousebutton to continue.
Arunvijai_4
Honored Contributor

Re: UNIX tar limitation

Hi Danny,

The default tar has limitations. You can install PHCO_28992 and check whether it works. Otherwise, get GNU tar:

http://hpux.connect.org.uk/hppd/hpux/Gnu/tar-1.15.1/
http://www2.itrc.hp.com/service/patch/patchDetail.do?patchid=PHCO_28992&sel={hpux:11.11,}&BC=main|search|

-Arun

"A ship in the harbor is safe, but that is not what ships are built for"
Kenan Erdey
Honored Contributor

Re: UNIX tar limitation

hi;

with a tar patch you can tar up to 8 GB. But I don't know whether there is a limit on the number of files.
Computers have lots of memory but no imagination
inventsekar_1
Respected Contributor

Re: UNIX tar limitation

HI Danny,

"Because of industry standards and interoperability goals, tar does not
support the archival of files larger than 8GB or files that have
user/group IDs greater than 2048k. Files with user/group IDs greater
than 2048k are archived and restored under the user/group ID of the
current process, unless the uname/gname exists (see tar(4))."

man tar
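
Given that limit, it can be worth checking in advance whether any candidate file exceeds 8 GB (a hedged one-liner; the c suffix makes find's -size count bytes, per POSIX):

find /your/dir -type f -size +8589934592c -print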
Be Tomorrow, Today.
Raj D.
Honored Contributor

Re: UNIX tar limitation

Danny,

You can use the GNU version of tar to overcome the 2 GB file-size limitation.

Here it is:
http://hpux.cs.utah.edu/hppd/hpux/Gnu/tar-1.15.1/


[ Although tar is supplied with HP-UX, gtar has more options, handles absolute paths as relative paths, restores the original dates of directories when extracting, and also supports large files. The gettext and libiconv packages should be installed prior to installing tar. ]
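
Once gettext, libiconv, and the tar depot are installed with swinstall, you can verify the GNU version and use it in place of the stock tar (a sketch; the path /usr/local/bin/tar is an assumption based on typical porting-centre packaging):

/usr/local/bin/tar --version
/usr/local/bin/tar -cf mytar.tar /your/dir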


Cheers,
Raj.
" If u think u can , If u think u cannot , - You are always Right . "
Danny Fang
Frequent Advisor

Re: UNIX tar limitation

Hi,

Thanks to all who have shared their knowledge in this thread.

However, the greater issue is the number of files that tar can handle, not just their size.

The tar operation is run as:
/usr/bin/tar -cEf "$GROUPID.tar" *"$GROUPID"*"statsfile.GPEH.xml"
so tar is given a list of matching files, not an entire directory.

Any ideas anyone?

Thanks
Danny
Stefan Schulz
Honored Contributor

Re: UNIX tar limitation

Hi again,

the problem is the use of wildcards. Your shell tries to expand those wildcards and creates an awfully long command line.

So you have to work around this. Use find to identify your files and hand the list over to tar using xargs. Or move the files to a separate directory and then tar the whole directory, as sketched below.
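
For your pattern, the separate-directory approach could look something like this (a hedged sketch; the staging directory /tmp/gpeh_stage is illustrative, and the while-read loop assumes the filenames contain no embedded newlines):

mkdir /tmp/gpeh_stage
find . -name "*${GROUPID}*statsfile.GPEH.xml" | while read f; do mv "$f" /tmp/gpeh_stage/; done
tar -cEf "$GROUPID.tar" /tmp/gpeh_stage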

Regards Stefan
No Mouse found. System halted. Press Mousebutton to continue.
Danny Fang
Frequent Advisor

Re: UNIX tar limitation

Hi Stefan,

When you mentioned that the shell attempts to expand those wildcards and creates an awfully long command line, may I know what error message you got when you experimented with this?

Also, may I know how using "find" coupled with "xargs", or moving the files to a separate directory and then tar-ing the whole directory, would help with this issue?

Greatly appreciate it if you could shed some light in these questions.

Thanks
Danny
Bill Hassell
Honored Contributor

Re: UNIX tar limitation

The * character means nothing to tar. It is a special character for the shell and it is automatically replaced with all the filenames that match. If you type this command (after setting the variable GROUPID):

echo *"$GROUPID"*"statsfile.GPEH.xml"

the result is what tar will see on the command line. The command line cannot be arbitrarily long, which is why long lists of filenames must be passed to tar with utilities like find and xargs.
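
You can see the actual limit on your system (a quick check; getconf ARG_MAX is POSIX and reports the kernel's argument-space limit in bytes):

getconf ARG_MAX

If the expanded filenames plus the environment exceed that many bytes, the kernel refuses to run the command with E2BIG, which the shell reports as "Arg list too long".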


Bill Hassell, sysadmin
Stefan Schulz
Honored Contributor

Re: UNIX tar limitation

Hi Danny,

a wildcard in a command is not interpreted by the command itself but by the shell, which expands it and passes the result to the command.

So if you do something like tar cf xxx.tar *.txt and you have 1.txt, 2.txt .... 100.txt, the shell will interpret this wildcard and actually run tar cf xxx.tar 1.txt 2.txt 3.txt 4.txt ...... 100.txt

Unfortunately, a command line cannot be infinitely long.

If you move all the files to another directory, you can simply tar that directory without using wildcards. Once you have found and moved all the desired files, there is no need to call tar with wildcards. (e.g. tar cf xxx.tar ./my_tar_directory)

Or use find with xargs to pass this long list of files to tar. One caveat: with that many files, xargs will invoke tar more than once, so use the r (append) key instead of c, otherwise each invocation overwrites the archive written by the previous one. Something like:

find . -name "yourexpression" | xargs tar rf yourtarfile.tar

See man xargs for more information.
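
Once the archive is built, a quick sanity check (a minimal example, assuming one archive entry per file) is to compare the number of files find reports with the number of entries in the archive:

find . -name "yourexpression" | wc -l
tar tf yourtarfile.tar | wc -l

The two counts should match.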

Regards Stefan
No Mouse found. System halted. Press Mousebutton to continue.