Declan Mc Kay
Occasional Advisor

How to capture end of a file by size & by no. of lines

Hi,
OS HP-UX10.20, D-Class Server

If I have a 100 MB log file, what command can I use to capture the last, say, 1 MB of that file into a separate file?
OR
How can I capture the last, say, 100,000,000 lines of a 100 MB log file into a separate file?
I'd appreciate any replies.
Pete Randall
Outstanding Contributor

Re: How to capture end of a file by size & by no. of lines

Number of lines is easy: tail -n, where n is the number of lines you want.

Pete
Pete Randall
Outstanding Contributor

Re: How to capture end of a file by size & by no. of lines

And tail -c for number of bytes.

Pete
Sridhar Bhaskarla
Honored Contributor

Re: How to capture end of a file by size & by no. of lines

Hi,

Pete has the answer.

tail -c 1000000 file > /tmp/file1

This will capture the last 1 MB of file into /tmp/file1. Note that the byte count must be written without commas.

See man tail for more details.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Jean-Luc Oudart
Honored Contributor

Re: How to capture end of a file by size & by no. of lines

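To get the last 1% of the lines in your file:

#!/bin/sh
typeset -i NB
# Read from stdin so wc prints only the count, not the file name
NB=`wc -l < yourfile`
# 1% of the line count (integer division)
NB2=`expr $NB / 100`

tail -n ${NB2} yourfile > yournewfile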

Jean-Luc
fiat lux
Declan Mc Kay
Occasional Advisor

Re: How to capture end of a file by size & by no. of lines

I have already tried tail, and for some strange reason I can only extract the last 324 lines of the file. I get exactly the same result whether I tail by number of lines or by size; I checked the output of each with wc -l. Any other ideas?
Robin Wakefield
Honored Contributor

Re: How to capture end of a file by size & by no. of lines

Hi

tail does have a buffer limit. If it's not capturing everything, try the long-winded approach for a line-based capture:

# n=$(wc -l < filename)
# (( m=n-9999 ))

# awk "NR>=$m" filename

or

# sed -n "$m,\$p" filename
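
The line-based approach above can be wrapped in a small function so the start line is worked out for you. This is only a sketch: last_lines is a made-up helper name, and it assumes a POSIX-style shell with $(( )) arithmetic. It reads the file twice (once for wc, once for sed), so it is slower than tail, but it does not depend on tail's buffer.

```shell
#!/bin/sh
# last_lines FILE COUNT: print the last COUNT lines of FILE without using tail.
# (last_lines is a hypothetical helper name, not a standard command.)
last_lines() {
    file=$1
    count=$2
    # Redirecting into wc keeps the file name out of its output;
    # $(( )) also strips any padding spaces wc may print.
    total=$(( $(wc -l < "$file") ))
    start=$(( total - count + 1 ))
    # If the file is shorter than COUNT lines, print all of it.
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    # Print from line START through the last line ($).
    sed -n "${start},\$p" "$file"
}
```

Usage would be something like: last_lines big_log_file 10000 > new_log_file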

Rgds, Robin
Martin Johnson
Honored Contributor

Re: How to capture end of a file by size & by no. of lines

Are you up to date with your patches?

HTH
Marty
S.K. Chan
Honored Contributor

Re: How to capture end of a file by size & by no. of lines

Robin is correct. tail has a 20 KB buffer limit.
Declan Mc Kay
Occasional Advisor

Re: How to capture end of a file by size & by no. of lines

Jean-Luc,

I have tried your script, but still only 324 lines are written to the output file. That seems to suggest that my tail command will only read 324 lines...?
Pete Randall
Outstanding Contributor

Re: How to capture end of a file by size & by no. of lines

Wild Guess:

How about something with dd?
dd if=george of=ralph skip=nn count=nn

Maybe?

Pete
Jean-Luc Oudart
Honored Contributor

Re: How to capture end of a file by size & by no. of lines

I'm afraid you will have to go with the awk solution for the *BIG* files.

the buffer is limited :-(

check Robin's solution.

Jean-Luc
fiat lux
doug hosking
Esteemed Contributor

Re: How to capture end of a file by size & by no. of lines

Pete's dd idea should work, but has the
unfortunate limitation that you might
get partial lines if the file doesn't
have fixed length records. If that's
an acceptable restriction, just add
a reasonable 'bs=' value to the dd
line for performance reasons and it
should work fine.
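
One way to live with the partial-line limitation is to throw away the first line of whatever dd returns, since that is the only line that can be cut in half. A minimal sketch, assuming a 1 KB block size; dd_tail_lines is a hypothetical helper name. Be aware that if the cut happens to land exactly on a line boundary, this discards one complete line.

```shell
#!/bin/sh
# dd_tail_lines FILE SKIP_KB: print FILE from SKIP_KB kilobytes onward,
# minus the first line, which is probably a fragment.
# (dd_tail_lines is a hypothetical helper name, not a standard command.)
dd_tail_lines() {
    src=$1
    skip_kb=$2
    # dd seeks past SKIP_KB blocks of 1024 bytes and copies the rest to stdout;
    # its record counts go to stderr, so discard them.
    # sed 1d deletes the first line of that output.
    dd if="$src" bs=1024 skip="$skip_kb" 2>/dev/null | sed 1d
}
```

For example: dd_tail_lines big_log_file 101000 > new_log_file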

Yes, tail has buffer limits that are a
pain.

echo '$-20000,$p' | ex big_log_file > new_log_file

or the 'split' or 'sed' commands might be helpful depending on exactly what you want
to do.

Be careful not to fall into the next
trap:
If a process is writing to that log file
at the time you split it, you might
have to send a SIGHUP or similar to that
process to make it close and reopen the
log file. Otherwise you might delete
the big log file, only to find out
that the space it consumed hasn't really
gone away when expected. Details here
vary based on the application writing
to the log file, but just beware that
UNIX doesn't actually free disk space
associated with a deleted file until the
last user of that file has closed it. Not
realizing this is a common cause of
head scratching about why the disk is
full but du, etc. say there's only a small
amount of disk space in use.
Tore_1
Regular Advisor

Re: How to capture end of a file by size & by no. of lines

Perhaps the GNU version of tail doesn't have this buffer limit? http://hpux.connect.org.uk/hppd/hpux/Gnu/textutils-2.0.20/
Carlos Fernandez Riera
Honored Contributor

Re: How to capture end of a file by size & by no. of lines

tail is restricted to 2k chars (see man tail).

You don't need to read or process the whole file with awk, wc, or other commands like that.


The best and fastest solution is to use dd:

dd if=file.orig of=cuted_file bs=1024k skip=99

This will skip the first 99 MB and read the remainder.


unsupported
doug hosking
Esteemed Contributor

Re: How to capture end of a file by size & by no. of lines

> The best and faster solution is use of dd
> dd if=file.orig of=cuted_file bs=1024k
> skip=99
>
> This will skip 99MB !! and read the
> remaining.

This works great if you really know the file
is 100 MB. More often, all you probably know
is that the file is 'large.' So you would
have to do either manual or automatic
calculation of the right values for dd,
possibly dealing with the race condition of the
file growing while you're calculating the
offsets to use.
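
The automatic calculation could look roughly like the sketch below. It takes the size from wc -c, rounds down to whole kilobytes, and lets dd skip everything except the last few KB. tail_kb is a hypothetical helper name, and the 1 KB block size is just an assumption to keep the arithmetic simple. Because the size is rounded down, the output can be up to 1023 bytes longer than requested, and if the file grows after the size is read you simply get the new data as well.

```shell
#!/bin/sh
# tail_kb SRC WANT_KB DST: copy roughly the last WANT_KB kilobytes of SRC to DST.
# (tail_kb is a hypothetical helper name, not a standard command.)
tail_kb() {
    src=$1
    want_kb=$2
    dst=$3
    # Current size in bytes; $(( )) strips any padding wc may print.
    size=$(( $(wc -c < "$src") ))
    # Whole 1 KB blocks to skip, rounded down so we never return too little.
    skip_kb=$(( size / 1024 - want_kb ))
    # A file smaller than WANT_KB is copied in full.
    if [ "$skip_kb" -lt 0 ]; then
        skip_kb=0
    fi
    dd if="$src" of="$dst" bs=1024 skip="$skip_kb" 2>/dev/null
}
```

For example, to keep about the last 1 MB: tail_kb big_log_file 1024 new_log_file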

The benefit of the 'ex' method above is
that you don't need to know in advance the
exact size of the file. Whether it's
more important to keep things simple or make
them fast would depend on the specific
application. In most cases I would guess
that it would take quite a while to collect
a 100 MB log file, and that a few extra seconds
of compute/IO time wouldn't be a big deal.

In any case, it looks like there are several
simple, valid suggestions. At least one of them should be an appropriate solution for the
problem.


Bill Thorsteinson
Honored Contributor

Re: How to capture end of a file by size & by no. of lines

I use a small perl script to tail the file since the last
tail. This is a variation of
the logtail command included
with logcheck. I added the
ability to deal with rotated
logs, and a list of logs.

The first parameter is a list
of log files. The optional
second parameter is the extension
of the rotated log files.

You can pass the output to a filter or redirect to a file.