Operating System - HP-UX

Fetch Based on Line Numbers

 
uform
Frequent Advisor

Hi ,

I have a list of line numbers in one file and would like to fetch exactly those lines from another file and write them to a new file.

Thanks

16 REPLIES
James R. Ferguson
Acclaimed Contributor

Re: Fetch Based on Line Numbers

Hi:

If you mean that you want to extract a range of lines from a file, then you can do, for example:

# perl -ne 'print if 3..5' /etc/hosts

Regards!

...JRF...
spex
Honored Contributor

Re: Fetch Based on Line Numbers

Hi,

# cat f1
one
two
three
four
five
six
seven
eight
nine
ten

# cat lines
2p;5,8p

# sed -n "$(cat lines)" f1
two
five
six
seven
eight

PCS
James R. Ferguson
Acclaimed Contributor

Re: Fetch Based on Line Numbers

Hi:

A mixture of line numbers and/or ranges can also be extracted thusly:

# perl -ne 'print if 3..5 or 1..1 or 9..9 or 17..eof' /etc/hosts

...thus lines 1, 3, 4, 5, 9, and 17 until the end-of-file are printed.

Regards!

...JRF...
Sandman!
Honored Contributor

Re: Fetch Based on Line Numbers

If the numbers are numeric then:

# grep '[0-9]' infile > outfile
uform
Frequent Advisor

Re: Fetch Based on Line Numbers

# cat File1.txt    [line numbers of records in File2.dat]
298038
458038
564678
984738
573937
238480
123483

Now I want to fetch the records at the line numbers listed above from File2.dat and write them to File3.dat.

uform
Frequent Advisor

Re: Fetch Based on Line Numbers

# cat File1.txt    [line numbers of records in File2.dat]
298038
458038
564678
984738
573937
238480
123483

Now I want to fetch the records at the line numbers listed above from File2.dat and write them to File3.dat (File3.dat should contain only these 7 records).



James R. Ferguson
Acclaimed Contributor

Re: Fetch Based on Line Numbers

Hi:

# grep -f ./File1.txt ./File2.dat > ./File3.dat

See the manpages for 'grep(1)'.

Regards!

...JRF...
spex
Honored Contributor

Re: Fetch Based on Line Numbers

Hello again,

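# cat extract_lines.sh
#!/usr/bin/sh
# Usage: extract_lines.sh <line-number-file> <data-file>
f1=$1
f2=$2
# Print each requested line, in the order given in the number file.
for line in $(cat "${f1}")
do
    sed -n "${line}p" < "${f2}"
done
exit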

# ./extract_lines.sh ./File1.txt ./File2.dat > ./File3.dat

PCS
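
The loop above rereads the data file once for every line number, which adds up for a long list. A minimal sketch of a single-pass variant, assuming the number file holds one positive line number per line: turn each number N into the sed command "Np" and pass the whole script to a single sed -n call, much like the "2p;5,8p" example earlier in the thread. The function name extract_with_sed is made up for illustration.

```shell
#!/usr/bin/sh
# Sketch only: extract_with_sed is a hypothetical helper, not from the thread.
# Assumes one positive line number per line in the number file.
extract_with_sed() {
    numbers_file=$1   # list of line numbers
    data_file=$2      # file to extract lines from
    # Keep only purely numeric lines and append "p" to each: "5" becomes "5p".
    script=$(sed -n 's/^\([0-9][0-9]*\)$/\1p/p' "$numbers_file")
    # -n suppresses the default output, so only the selected lines are printed.
    sed -n "$script" "$data_file"
}
```

Usage would be: extract_with_sed ./File1.txt ./File2.dat > ./File3.dat. The output comes out in data-file order, not in list order, and a number listed twice is printed twice. A very long list may exceed the shell's argument-length limit; in that case, write the generated commands to a file and run sed -n -f on that file.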
Hein van den Heuvel
Honored Contributor

Re: Fetch Based on Line Numbers

Should the output in File3 appear in any particular order?

In file2 order, using awk and a generated helper script:


> awk '{print "(NR==" $1 ")"}' file1.tmp > helper.awk
> awk -f helper.awk file2.tmp
three
five
eight
hrtherh

Using perl, reading the first file into an array, then grepping that array while reading the second file:

------------ extract.pl ----------------
$file = shift @ARGV;
open FILE,"<$file" or die "Cannot open $file: $!";
@lines = <FILE>;
close FILE; # reset line number
$file = shift @ARGV;
open FILE,"<$file" or die "Cannot open $file: $!";
while (<FILE>) {
    print if grep /^$.$/,@lines;
}

Using perl, in the order of file1:
------------ extract_ordered.pl --------
$file = shift @ARGV;
open FILE,"<$file" or die "Cannot open $file: $!";
@lines = <FILE>;
chomp @lines;
close FILE; # reset line number

$file = shift @ARGV;
open FILE,"<$file" or die "Cannot open $file: $!";
while (<FILE>) {
    $data{"$."}=$_ if grep /^$.$/,@lines;
}
foreach (@lines) {
    print $data{$_} if defined $data{$_};
}


Using perl, but with an associative array to remember the desired lines:
------------- extract.pl ----------
$file = shift @ARGV;
open FILE,"<$file" or die "Cannot open $file: $!";
while (<FILE>) {chomp; $lines{$_}++}
close FILE; # reset line number
$file = shift @ARGV;
open FILE,"<$file" or die "Cannot open $file: $!";
while (<FILE>) { print if $lines{$.} }


Enjoy!
Hein.
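
The associative-array idea carries over to awk as well. What follows is only a sketch, assuming the first file holds one line number per line and is not empty (with an empty list, the NR == FNR test would wrongly treat the data file as the list). The function name extract_in_data_order is hypothetical.

```shell
#!/usr/bin/sh
# Sketch only: extract_in_data_order is a hypothetical helper name.
# $1 = file with one line number per line, $2 = data file.
extract_in_data_order() {
    awk '
        # First file (NR equals FNR): remember each wanted line number.
        # Adding 0 normalises the text, so "05" and "5" mean the same line.
        NR == FNR { want[$1 + 0] = 1; next }
        # Second file: print the line when its number was remembered.
        FNR in want
    ' "$1" "$2"
}
```

Usage would be: extract_in_data_order file1.tmp file2.tmp. Each file is read once, and the lines come out in file2 order.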
Hein van den Heuvel
Honored Contributor

Re: Fetch Based on Line Numbers

Oops, I meant to show my sample data; see below.
The '14' is there to make sure that neither line 1 nor line 4 matches it 'by accident'.

> cat file1.tmp
5
3
14
8
> cat file2.tmp
one
two
three
four
five
six
seven
eight
nine
ten
asfawer
wergwergtwr
rtgherther
hrtherh
erher
her
hj
etyjtrj

Re: Fetch Based on Line Numbers

Hello,

Try this :

cat -n File2.dat | egrep -f File1.txt > File3.dat

Regards,

JPH
Hein van den Heuvel
Honored Contributor

Re: Fetch Based on Line Numbers

JRF,

Your suggested solution seems to assume that the line number is part of the line in file2, and does not otherwise randomly appear in lines.

JPH,

Nice tweak with "cat -n". It looks much like the solution JRF proposed, but it adds line numbers to match on.
- One must hope that the -n number formatting matches the way the numbers are written in file1.
- One would need to feed the output to cut or awk to remove the added line numbers from file3.
- One might want to use a more complete (anchored) search expression, to avoid triggering on numbers that appear anywhere in a data line and happen to match a line number.


Cheers,
Hein.
Peter Nikitka
Honored Contributor

Re: Fetch Based on Line Numbers

Hi,

since the file in which you look up the lines may be large (your line numbers imply this), I would prefer to scan it only once.
My solution generates two awk program fragments. Put together, they give you the lines in the same order as in your line-number file.

l=`wc -l </tmp/num`
awk '{i++;print "NR=="$1 " {a["i"]=$0}"}' /tmp/num | sort -n -k 1.5 -k 2 >/tmp/prog.awk
print "END {for(j=1;j<=$l;j++) print a[j]}" >>/tmp/prog.awk

awk -f /tmp/prog.awk file_to_lookup

mfG Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"
Peter Nikitka
Honored Contributor

Re: Fetch Based on Line Numbers

Sorry for the followup post,

but I wanted to add an example of the generated awk program:

cat /tmp/num
2
5
79
589
234
56
8
444

cat /tmp/prog.awk
NR==2 {a[1]=$0}
NR==5 {a[2]=$0}
NR==8 {a[7]=$0}
NR==56 {a[6]=$0}
NR==79 {a[3]=$0}
NR==234 {a[5]=$0}
NR==444 {a[8]=$0}
NR==589 {a[4]=$0}
END {for(j=1;j<= 8;j++) print a[j]}

mfG Peter
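
The same result (one pass over the data, output in list order) can also be sketched without generating a program file. This is an illustration under assumptions, not the method shown above: the function name extract_in_list_order is hypothetical, the number file is assumed to be non-empty with one number per line, and the selected lines are assumed to fit in memory.

```shell
#!/usr/bin/sh
# Sketch only: extract_in_list_order is a hypothetical helper name.
# $1 = file with one line number per line, $2 = data file.
extract_in_list_order() {
    awk '
        # First file: record the numbers in the order they are listed.
        NR == FNR { order[++n] = $1 + 0; want[$1 + 0] = 1; next }
        # Second file: keep only the wanted lines, keyed by line number.
        FNR in want { line[FNR] = $0 }
        # Finally print the kept lines in list order, skipping any number
        # that lies beyond the end of the data file.
        END { for (i = 1; i <= n; i++) if (order[i] in line) print line[order[i]] }
    ' "$1" "$2"
}
```

Usage would be: extract_in_list_order /tmp/num file_to_lookup. Only the requested lines are held in memory, so the size of the data file itself should not matter much.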

Re: Fetch Based on Line Numbers

Hello,

Hein :

To take your remarks into account, I suggest:

1) converting the strings in File1.txt into anchored regular expressions:

awk '{ print "^ *" $0 "\t" }' File1.txt > File1.regexp

Notice there is a SPACE character (ASCII 0x20) between ^ and * because line numbers generated by cat -n are padded with spaces to a width of 6 digits. Once the number of lines exceeds 999,999, the line numbers are no longer padded. In either case, there is always exactly one tab between the line number and the beginning of the line.

2) removing the line numbers from File3.dat with cut, as you proposed.

Hence, the final command would be:

cat -n File2.dat | egrep -f File1.regexp | cut -f2- > File3.dat

The field separator needed here is a tab, which is cut's default delimiter, so no -d option is required.

Uform :

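If you're convinced that File2.dat does not contain any sequence of digits that looks like a line number, you can even use the fgrep command, which is at least twice as fast as egrep but does not support regular expressions. Because the match is then unanchored, a listed number also matches any longer line number that contains it (298038 would also select line 1298038), so this is only safe if File2.dat has no such line.

cat -n File2.dat | fgrep -f File1.txt | cut -f2- > File3.dat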

Regards,

JPH
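
A variation on the same idea, offered as a sketch rather than a tested HP-UX recipe: grep -n '^' prefixes every line with its number and a colon, without any padding, so the anchored pattern for line N is simply "^N:" and the number formatting of cat -n no longer matters. The function name extract_with_grep_n is hypothetical, and the number file is assumed to hold one number per line.

```shell
#!/usr/bin/sh
# Sketch only: extract_with_grep_n is a hypothetical helper name.
# $1 = file with one line number per line, $2 = data file.
extract_with_grep_n() {
    numbers_file=$1
    data_file=$2
    # "5" becomes the anchored pattern "^5:", which can only match the
    # "5:" prefix that grep -n puts in front of line 5.
    patterns=$(sed -n 's/^\([0-9][0-9]*\)$/^\1:/p' "$numbers_file")
    # An empty pattern list would match every line, so stop early.
    [ -n "$patterns" ] || return 0
    # Number all lines, keep the wanted ones, then strip the "N:" prefix.
    # cut keeps any further colons that belong to the data itself.
    grep -n '^' "$data_file" | grep -e "$patterns" | cut -d: -f2-
}
```

Usage would be: extract_with_grep_n File1.txt File2.dat > File3.dat. The lines come out in File2.dat order.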



john korterman
Honored Contributor

Re: Fetch Based on Line Numbers

Hi,

maybe this script can be used:

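#!/usr/bin/sh
# For each requested line number (in ascending numeric order), number all
# lines of File2.dat and print the one whose number matches exactly.
sort -n File1.txt | while read line
do
    nl -ba File2.dat | awk -v first="$line" '$1 == first' | cut -f2-
done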


regards,
John K.
it would be nice if you always got a second chance