
Long text file truncate

 
Nicolau Roca
Advisor


I'm interested in only the first 100000 lines of a 15 GB text file.

Is there an easy way to copy just these 100000 lines to another file?

(Editing the file with TPU and trying to delete lines from 100000 onwards displays the message "Too many records".)

Thank you
9 REPLIES
Steven Schweda
Honored Contributor

Re: Long text file truncate

Choose a language, then write a program to open the input file and the output file, read a line and write it, N times, and stop? I assume that this could be done in C, DCL, Fortran, or practically anything else you have.
Hakan Zanderau ( Anders
Trusted Contributor
Solution

Re: Long text file truncate

$! Run this from a command procedure; labels and GOTO do not work interactively.
$ open infile input-file
$ open/write outfile output-file
$ num = 0
$10:
$ read/end_of_file=eof infile str
$ write outfile str
$ num = num + 1
$ if num .lt. 100000 then goto 10
$eof:
$ close infile
$ close outfile
Don't make it worse by guessing.........
Volker Halle
Honored Contributor

Re: Long text file truncate

Nicolau,

try this:

$ EDIT/EDT/NOCOMMAND big-file
*WRITE filename 1:100000
*QUIT

This should write the first 100000 lines from big-file to filename.

Volker.
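
The same steps can also be scripted so nothing has to be typed interactively. A minimal, untested sketch, assuming hypothetical file names and using EDT's /COMMAND qualifier, which points EDT at a file of line-mode commands to execute at startup:

$! Put the EDT line-mode commands in a file. Inside a command
$! procedure, CREATE copies the following lines up to the next $-line.
$ create extract.edt
WRITE small-file.txt 1:100000
QUIT
$! Run EDT with that startup command file; QUIT leaves big-file untouched.
$ edit/edt/command=extract.edt big-file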

Jess Goodman
Esteemed Contributor

Re: Long text file truncate

Another option is the freeware EXTRACT program.
http://vms.process.com/scripts/fileserv/fileserv.com?EXTRACT
I have one, but it's personal.
Bill Hall
Honored Contributor

Re: Long text file truncate

You should be able to use SEARCH with a combination of the /MATCH=EQV, /LIMIT=100000 and /OUTPUT qualifiers to get what you want. With a single search string, /MATCH=EQV ends up selecting every record (it selects records that contain either all of the strings or none of them), so /LIMIT simply stops the copy after 100000 lines.

$ search my_big_file.txt "something"/match=eqv/limit=100000/output=my_small_file.txt

Bill
Bill Hall
Hein van den Heuvel
Honored Contributor

Re: Long text file truncate


$ perl -pe "last if $. > 100000" big.dat > small.dat

$ gawk /out=small.dat "(NR > 100000){exit} {print $0}" big.dat

Hein.
Joseph Huber_1
Honored Contributor

Re: Long text file truncate

If the purpose is to get rid of the space occupied by the 14 GB garbage, I would be brave and do:

set file/attr=EBK:15625 file
set file/truncate file

if the file has lines averaging 80 characters
(15625 blocks = 100000 lines * 80 bytes / 512 bytes per block).

Delay the /TRUNCATE until you have verified the new size is sufficient.
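
A minimal sketch of that arithmetic in DCL, assuming a hypothetical file name and the 80-byte average from above; check the result before issuing the /TRUNCATE:

$! Approximate end-of-file block for n_lines records of avg_len bytes each.
$ n_lines = 100000
$ avg_len = 80
$ ebk = (n_lines * avg_len + 511) / 512   ! round up to a whole block
$ show symbol ebk                         ! 15625 for these values
$ set file/attr=(ebk:'ebk') big-file.log
$! Only after checking that the new EOF really lies past line 100000:
$ set file/truncate big-file.log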
http://www.mpp.mpg.de/~huber
Hein van den Heuvel
Honored Contributor

Re: Long text file truncate

Hein> $ gawk /out=small.dat "(NR > 100000){exit} {print $0}" big.dat

Or even...

$ gawk /out=small.dat "(NR > 100000){exit} 1" big.dat

Joseph> If the purpose is to get rid of the space occupied by the 14 GB garbage, I would be brave and do:
>> set file/attr=EBK:15625 file
>> set file/truncate file

Right. But you _should_ also try to set the FFB, although it will work regardless.

>> if the file has lines averaging 80 characters
>> (15625 = 100000 * 80 / 512)

That would be a great, I/O-free first cut, and likely good enough.
If you want to do it exactly, you can use DUMP/RECORD=(COUNT=1,START=100001) and take the exact byte offset from the RFA: the record file address of record 100001 is precisely where the file should now end.

$ pipe dum/re=(co=1,st=10) tmp.tmp | perl -ne "if (/^Re.*A\((\d+),(\d+),(\d+)/) {printf qq(set file/att=(ebk:%d,ffb:%d)\n),$1+0x1000*$2,$3}"

This will output something like:

set file/att=(ebk:1,ffb:18)

You can then apply that to the file, after first saving the current values of f$file("tmp.tmp","EOF") and f$file("tmp.tmp","FFB").
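
A small sketch of that save-then-apply step in DCL, assuming TMP.TMP and the example values printed above:

$! Save the current end-of-file position in case this must be undone.
$ old_eof = f$file("tmp.tmp","EOF")   ! end-of-file block
$ old_ffb = f$file("tmp.tmp","FFB")   ! first free byte in that block
$ write sys$output "old EOF=''old_eof' FFB=''old_ffb'"
$! Apply the values computed from the RFA, then release the space.
$ set file/att=(ebk:1,ffb:18) tmp.tmp
$ set file/truncate tmp.tmp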

fwiw,
Hein.
Joseph Huber_1
Honored Contributor

Re: Long text file truncate

Wow Hein, true ...
On the other hand, if a runaway process is filling my disk with a 15 GB logfile, I probably won't care exactly where after the 100000th record the cut lands :-)
http://www.mpp.mpg.de/~huber