Operating System - HP-UX

grep string from compressed file

 
lawrenzo_1
Super Advisor

Hi,

I have created a 500 MB file by concatenating lots of small files together, each wrapped in marker lines like this:

file:abc.txt
data
data
data
:EOF

I have then compressed the file. What I would like to know is the command that can extract all the data between file:abc.txt and :EOF without uncompressing the file.

I have installed gnu.grep (zgrep) which can find the string in that file.

Thanks in advance.

Chris.
hello
TwoProc
Honored Contributor

Re: grep string from compressed file

How did you concatenate the files together?
"cat"?

If so, I doubt that there are any file header indicators in the file, so the only thing you could key on for the end of abc.txt would be whatever the last piece of text in abc.txt is.

Re: the second requirement of not uncompressing. Well, you're going to have to uncompress, either via your own code or via a program that understands the compression used for that file.

UNLESS, you mean you don't want to uncompress to a file - you could uncompress to a pipe. Is that the question here, how to retrieve the 3 data files from a pipe which has uncompressed the file ?

We are the people our parents warned us about --Jimmy Buffett
Patrick Wallek
Honored Contributor
Solution

Re: grep string from compressed file

What did you use to compress the file? gzip or compress?

If you used gzip, try the following:

# gzcat file.gz | grep whatever

If you used compress, try the following:

# zcat file.Z | grep whatever
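
A minimal sketch of the pipe approach, assuming gzip is on the PATH. The scratch directory, the file name sample.txt and its contents are made up for illustration. It uses `gzip -dc`, which does the same job as gzcat and is also available on systems that do not ship gzcat:

```shell
#!/bin/sh
# Build a tiny sample in a scratch directory (names and contents are hypothetical).
tmp=${TMPDIR:-/tmp}/zgrep_demo.$$
mkdir -p "$tmp"
printf 'file:abc.txt\ndata\nBGM001025176958\ndata\n:EOR\n' > "$tmp/sample.txt"
gzip "$tmp/sample.txt"            # leaves only sample.txt.gz on disk

# Decompress to stdout and search the stream; no uncompressed copy is written.
match=$(gzip -dc "$tmp/sample.txt.gz" | grep 'BGM001025176958')
echo "$match"
```

The data is still decompressed, but only in memory as it flows through the pipe, so no extra disk space is needed.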
James R. Ferguson
Acclaimed Contributor

Re: grep string from compressed file

Hi Chris:

# gzip -d -c file|perl -nle 'print if /abc.txt/../:EOF/'

...JRF...
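
The same range idea can be sketched with sed instead of Perl. This is only an illustration on made-up sample data; the anchored patterns and the escaped dot are my additions to avoid matching a longer name such as xabc.txt:

```shell
#!/bin/sh
# Hypothetical sample: two small "files" wrapped in file:/:EOF markers.
tmp=${TMPDIR:-/tmp}/range_demo.$$
mkdir -p "$tmp"
printf 'file:abc.txt\ndata1\ndata2\n:EOF\nfile:def.txt\nother\n:EOF\n' > "$tmp/all.txt"
gzip "$tmp/all.txt"

# Print from the start marker through the first end marker that follows it.
block=$(gzip -dc "$tmp/all.txt.gz" | sed -n '/^file:abc\.txt$/,/^:EOF$/p')
echo "$block"
```

Note that both marker lines are part of the output, as they are with the Perl one-liner.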
Peter Godron
Honored Contributor

Re: grep string from compressed file

Chris,
can't you use something like:
zgrep -v -e'^file:' -e'^:EOF' input > output
lawrenzo_1
Super Advisor

Re: grep string from compressed file

ok thanks guys.

to create the files I ran a small script

for aa in `ls dir`
do
echo "file:$aa" >> $outfile
cat $aa >> $outfile
echo ":EOR" >> $outfile
done

I will try gzcat.
hello
lawrenzo_1
Super Advisor

Re: grep string from compressed file

ok lots more ideas to try here ....

I am playing with the gzip utility, which works well to extract the string I require. However, the string is a unique number and will only appear once, so the search should stop as soon as the string has been found.

here is what I require:

string to search is BGM001025176958

I need to recreate the file that was originally on the server:

file:abc.txt
data
data
BGM001025176958
data
:EOR

so once the string has been identified then I want to create the file abc.txt containing

data
data
BGM001025176958
data

I have an idea, however my awk syntax ain't quite correct.

Thanks
hello
Hein van den Heuvel
Honored Contributor

Re: grep string from compressed file


Here is some awk that would do what you desire:

awk -F: '/BGM001025176958/{p=1} /^:EOR/ && (p) {while (i<n) {print a[i++] > name} exit} {a[n++]=$0} /^file /{name=$2; n=0}'

C:\Temp>type abc.txt
file aaa.txt
data
BGM001025176958
data

With comments....

/BGM001025176958/{p=1} ## Desired pattern? Set print flag!

/^:EOR/ && (p) ## Line starting with end mark? Also have print flag set? then...
{while (i<n) {print a[i++] > name} ## Print onto fresh file with remembered name
exit} ## and done

{a[n++]=$0} ## Remember every line in its own place.

/^file:/ {name=$2; n=0} ## Line starting with new file? remember name part and reset number of lines.

Enjoy!
Hein.
lawrenzo_1
Super Advisor

Re: grep string from compressed file

The command isn't returning anything ....

any other ideas?

Thanks
hello
Dennis Handly
Acclaimed Contributor

Re: grep string from compressed file

>The command isn't returning anything

I got an error message showing the problem:
awk: A print or getline function must have a file name.

I then initialized name to "bad_news" and got the output.

The problem is an extra space after the "e" in file. I fixed it here:
/^file/{name=$2; n=0}

It might be easier to read as:
awk -F: '
BEGIN { name="bad_news" }
/BGM001025176958/{p=1}
/^:EOR/ && (p) {while (i<n) {print a[i++] > name } exit}
{a[n++]=$0}
/^file/{name=$2; n=0}'
Hein van den Heuvel
Honored Contributor

Re: grep string from compressed file

> >The command isn't returning anything

Ooops, like Dennis indicates, I made a cut & paste error. I tried it on a Windoze box and the awk there had trouble with specifying ":" as the field separator. So for the test I changed the data to use spaces, and then changed the spaces back to colons in the posted string... except in the /^file:/ match. Sorry.

And you need to feed (pipe) it the input stream, of course.
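
Put together, the whole pipeline might look like the sketch below. The sample archive, its contents and the scratch directory are invented for the demonstration; the awk program is the one from this thread with the marker match written as /^file:/ and the while loop spelled as a for loop:

```shell
#!/bin/sh
# Work in a scratch directory, because awk creates the output file in the
# current directory (all names and contents here are hypothetical).
tmp=${TMPDIR:-/tmp}/awk_demo.$$
mkdir -p "$tmp" && cd "$tmp" || exit 1
printf 'file:def.txt\nother\n:EOR\nfile:abc.txt\ndata\ndata\nBGM001025176958\ndata\n:EOR\n' > archive.txt
gzip archive.txt

gzip -dc archive.txt.gz | awk -F: '
/BGM001025176958/ { p = 1 }                  # wanted string seen: set the flag
/^:EOR/ && p      { for (i = 0; i < n; i++)  # end marker with flag set:
                        print a[i] > name    #   write remembered lines to the file
                    exit }                   #   and stop reading
                  { a[n++] = $0 }            # remember every line
/^file:/          { name = $2; n = 0 }       # new section: keep name, reset buffer
'

cat abc.txt
```

Because of the exit, awk stops reading as soon as the matching section has been written, which is the "stop once found" behaviour Chris asked for. Only the section containing the string is recreated; def.txt is never written.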

Cheers,
Hein.
lawrenzo_1
Super Advisor

Re: grep string from compressed file

that now works a treat!

Thanks guys.


Chris
hello
lawrenzo_1
Super Advisor

Re: grep string from compressed file

ty ty ty
hello