
faster grep

 
Hunki
Super Advisor

faster grep


I have a few big log files that I need to grep for a certain string, but grep takes a long time to find it. Can this be done faster with another tool?

I need a faster equivalent of :

grep "log message" *

thanks,
hunki
8 REPLIES
Peter Godron
Honored Contributor

Re: faster grep

Hunki,
most of the system utils (such as grep) are already pretty fast!

But try an exact (whole-word) match.
If you are on 11.11 onwards:
grep -E "(^|[^[:alnum:]_])(log message)([^[:alnum:]_]|$)" *
or use GNU grep's -w option.

Another way would be to split the file and then run parallel greps against the segments, but I suspect the split would actually take longer than your grep.
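If you do want to try the split-and-parallel approach, a rough sketch looks like this (the file name, the "seg." prefix and the chunk size are only illustrative):

split -l 500000 big.log seg.
for f in seg.*
do
   grep "log message" "$f" > "$f.out" &
done
wait
cat seg.*.out

On a single disk the parallel greps will mostly just compete for the same I/O, so this usually only pays off when the segments live on separate spindles.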

A. Clay Stephenson
Acclaimed Contributor

Re: faster grep

You can force grep to use a less powerful but faster string matching algorithm by using the -F grep option. This emulates the behavior of the old fgrep (fast grep) command. Man grep for details.
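For example, with the string from the original post:

$ grep -F "log message" *

-F treats the pattern as a fixed string rather than a regular expression, which is fine here since the search text contains no metacharacters.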

Now, the smarter way to do what you are trying to do is to keep track of the last position searched in the file and run your searches from one character past the old end of file to the current end of file. You then record the current end of file and it becomes the starting position for the next search. If the current end of file is smaller than the stored position, the file has been truncated and rewritten, and you need to start at the beginning again. This would be rather straightforward in Perl.
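Clay suggests Perl, but even a small shell sketch shows the idea (the log path and state file below are made-up examples):

LOG=/var/adm/syslog/syslog.log
STATE=/var/tmp/searched.offset
old=$(cat "$STATE" 2>/dev/null || echo 0)
new=$(wc -c < "$LOG" | tr -d ' ')
[ "$new" -lt "$old" ] && old=0      # log was truncated/rewritten: start over
tail -c +$((old + 1)) "$LOG" | grep "log message"
echo "$new" > "$STATE"

Each run greps only the bytes appended since the previous run and then remembers the new end of file.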
If it ain't broke, I can fix that.
Bill Hassell
Honored Contributor

Re: faster grep

And an even smarter way to grep through logfiles is to keep them small and archived. If your logfile is more than a few megs, make a copy of the logfile, then cat /dev/null into the logfile and compress your copy. Most sysadmins will keep multiple archives of logfiles.
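In command form, Bill's suggestion looks roughly like this (the path is only an example):

cp /var/adm/syslog/syslog.log /var/adm/syslog/syslog.log.$(date +%Y%m%d)
cat /dev/null > /var/adm/syslog/syslog.log
gzip /var/adm/syslog/syslog.log.$(date +%Y%m%d)

Truncating with cat /dev/null, rather than removing the file, lets the writing process keep using its open file descriptor.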


Bill Hassell, sysadmin
Matti_Kurkela
Honored Contributor

Re: faster grep

When you use grep -F, you are searching as fast as your system allows.

Most likely the bottleneck is I/O capacity, and more specifically the sustained read speed of your disk(s): if you want it done faster, you'll need to have the logs stored on a faster storage system.

Another possible bottleneck would be CPU, if you have an old server with fast FibreChannel storage.
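A quick way to see which one is the bottleneck (file name is illustrative):

$ time grep -F "log message" big.log

If real is much larger than user + sys, the run is mostly waiting on disk; if user dominates, the matching itself is the expensive part.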

MK
gstonian
Trusted Contributor

Re: faster grep

Grep is pretty fast, but you could use either of two options:

Split the log files into smaller pieces so that you are only searching through the specific information you are after,
or
keep track of the size of the log file and only search the new lines. See the attached file for an example of this.

Ralph Grothe
Honored Contributor

Re: faster grep

Sticking to Bill's suggestion, I would employ logrotate for that job, which keeps as many rotated generations as you specify and compresses them or does whatever other post-processing you deem necessary.
You are also free to define any pre- or post-processing commands that may be appropriate to make the log-writing process release the old log and continue with the new one.
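A minimal logrotate stanza along those lines might look like this (path, schedule and signal are only illustrative; adjust to your daemon):

/var/log/app/app.log {
   weekly
   rotate 8
   compress
   postrotate
      kill -HUP `cat /var/run/app.pid`
   endscript
}

The rotate count keeps eight old generations, compress gzips them, and the postrotate command tells the writer to reopen its log.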
Madness, thy name is system administration
Hunki
Super Advisor

Re: faster grep

Thanks all. I was searching through some archived log files as well.
Dennis Handly
Acclaimed Contributor

Re: faster grep

>Peter: most of the system utils (such as grep) are already pretty fast!

One user was complaining that grep on HP-UX was slower than on OpenVMS. Years ago, someone from the commands group told us how they had made grep about as fast as fgrep with some new algorithms (matching from right to left), so I don't know why HP-UX would be slower.

>But, try an exact match: grep -E

This will only make things slower, unless you are worried about the time to output the matches. As Clay says, fgrep should be faster.

>I was searching through some archived logs files as well.

Well, you could do:
$ find . -mtime -2 -exec grep "log message" {} +
(Search for all files modified in the last 2 days.)