multiple variable search

 
Edgar_8
Regular Advisor

Hi,

We have a file with an arbitrary number of lines, and each line contains numbers (e.g. 1234 7896). We need to find every record
in which two given numbers both appear, so both numbers MUST be present in the record. Our script can only search for one
variable/number at a time, not both. Does anyone know how we could use 2 variables in a file search?

cat numbers.list | while read num1 num2

do

gzcat filename*.gz | grep $num1 > data_extract.log

done

Thanks!
12 REPLIES 12
john korterman
Honored Contributor

Re: multiple variable search

Hi,
a bit primitive, change the line:
gzcat filename*.gz | grep $num1 > data_extract.log
to
gzcat filename*.gz | grep "$num1" | grep "$num2" > data_extract.log
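
A minimal sketch of the same chained-grep idea as a complete loop, assuming numbers.list holds one whitespace-separated pair per line and the data files are gzip-compressed. The function name search_pairs is made up for illustration, `gzip -dc` stands in for gzcat, and `grep -F` is my choice so that the numbers are treated as fixed strings rather than patterns.

```shell
#!/bin/sh
# search_pairs LIST FILE...: print every record that contains both numbers
# of at least one pair in LIST. (Hypothetical helper name.)
search_pairs() {
    list=$1
    shift
    while read -r num1 num2
    do
        # Skip blank or incomplete lines in the list
        [ -n "$num1" ] && [ -n "$num2" ] || continue
        # The first grep keeps records with num1, the second keeps those that also have num2
        gzip -dc "$@" | grep -F -- "$num1" | grep -F -- "$num2"
    done < "$list"
}

# Usage: redirect once, outside the loop, so earlier matches are not overwritten
# search_pairs numbers.list filename*.gz > data_extract.log
```

Redirecting the output of the whole function, rather than of each grep, keeps the matches from every pair instead of only the last one.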

regards,
John K.

it would be nice if you always got a second chance
RAC_1
Honored Contributor

Re: multiple variable search


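grep -i "1234" input_file > /tmp/file.txt
grep -i "7896" input_file >> /tmp/file.txt

# A line with both numbers is now in /tmp/file.txt twice, so de-duplicate first
sort -u /tmp/file.txt | while IFS= read -r i
do
echo "$i" | grep "1234" | grep -q "7896"
if [ $? -eq 0 ]
then
echo "$i"
fi
done > /tmp/final.txt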
There is no substitute to HARDWORK
RAC_1
Honored Contributor

Re: multiple variable search

Or the one-liner:

cat input_file|grep -i "1234"|grep -i "7896"
There is no substitute to HARDWORK
john korterman
Honored Contributor

Re: multiple variable search

Perhaps you can be a little more specific as to why it did not work out. Are you looking for exact matches with spaces on each side of the numbers, or something else?

regards,
John K.
it would be nice if you always got a second chance
John Palmer
Honored Contributor

Re: multiple variable search

For efficiency, I suggest that you gunzip the file that you are searching once only to a temporary file (provided space is not an issue).

Something like:

TMPF=/tmp/whatever.${$}
rm -f data_extract.log

gzcat filename > ${TMPF}

cat numbers.list | {
while read num1 num2
do
grep "${num1}.*${num2}" ${TMPF} >> data_extract.log
done
}
rm ${TMPF}
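
A hedged sketch of the same decompress-once approach wrapped in a function, assuming mktemp is available on your system and that numbers.list holds whitespace-separated pairs. The name extract_pairs is hypothetical, `gzip -dc` stands in for gzcat, and the pattern is widened to accept the two numbers in either order.

```shell
#!/bin/sh
# extract_pairs LIST FILE...: decompress the data once into a temporary file,
# then search that copy for each pair in LIST, in either order.
# (Hypothetical helper name; mktemp is assumed to be available.)
extract_pairs() {
    list=$1
    shift
    tmpf=$(mktemp) || return 1
    # Decompress all data files a single time
    gzip -dc "$@" > "$tmpf"
    while read -r num1 num2
    do
        # Skip blank or incomplete lines in the list
        [ -n "$num1" ] && [ -n "$num2" ] || continue
        grep -E "${num1}.*${num2}|${num2}.*${num1}" "$tmpf"
    done < "$list"
    rm -f "$tmpf"
}

# Usage:
# extract_pairs numbers.list filename*.gz > data_extract.log
```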

Regards,
John
Edgar_8
Regular Advisor

Re: multiple variable search

Hi John,


We have an input file (input.lst) containing numbers, and we have approx. 200 comma separated files. What we want to do
is search the comma separated files for each occurrence of the numbers in the input file and extract the entire record.
The input file looks like:

12345,6789
43212,09878
...
6543218,4253759

etc. Any ideas how this extract could be achieved?
John Palmer
Honored Contributor

Re: multiple variable search

If the numbers you want to read are separated by a comma, then do this:

IFS=,
cat input.lst | {
while read num1 num2
...

Need some more info and examples of what you want to do though.
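
A small sketch of how the comma splitting could look on its own, assuming input.lst holds one pair per line as in the example above. The function name read_pairs is made up, and it only prints the two fields so the splitting can be checked before any search is added.

```shell
#!/bin/sh
# read_pairs FILE: print each comma-separated pair as "num1 num2".
# (Hypothetical helper; it only shows how the fields are split.)
read_pairs() {
    # IFS=, applies to this read only, so the rest of the script keeps the default IFS
    while IFS=, read -r num1 num2
    do
        # Skip blank or incomplete lines
        [ -n "$num1" ] && [ -n "$num2" ] || continue
        printf '%s %s\n' "$num1" "$num2"
    done < "$1"
}

# Usage:
# read_pairs input.lst
```

Setting IFS on the read command itself avoids changing word splitting for the rest of the script.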
Elmar P. Kolkman
Honored Contributor

Re: multiple variable search

The greps can be consolidated into one egrep:
egrep "${num1}.*${num2}|${num2}.*${num1}"

Or if the file is comma separated:
egrep "${num1},${num2}|${num2},${num1}"
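
Note that these patterns also match when a number is only part of a longer one (1234 inside 12345). If whole-field matches are what is wanted, a sketch along these lines might help. It assumes comma separated records and numbers made of digits only; match_fields is a made-up name, and it uses two anchored greps rather than a single alternation.

```shell
#!/bin/sh
# match_fields NUM1 NUM2 [FILE...]: print comma-separated records in which
# both numbers occur as complete fields, in either order.
# (Hypothetical helper; assumes the numbers contain digits only.)
match_fields() {
    num1=$1
    num2=$2
    shift 2
    # (^|,)N(,|$) matches N only when it fills a whole field.
    # cat reads the named files, or standard input when none are given.
    cat "$@" | grep -E "(^|,)${num1}(,|\$)" | grep -E "(^|,)${num2}(,|\$)"
}

# Usage:
# match_fields 1234 7896 file1.csv file2.csv
```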
Every problem has at least one solution. Only some solutions are harder to find.
Michael Schulte zur Sur
Honored Contributor

Re: multiple variable search

Hi,

try this

cat numbers.list | while read num1 num2
do
gzcat filename*.gz | grep -E ".*$num1.*$num2|.*$num2.*$num1"
done > data_extract.log

greetings,

Michael

Michael Schulte zur Sur
Honored Contributor

Re: multiple variable search

Hi,

in case the numbers are as shown, you can use

gzcat filename*.gz | grep -E ".*$num1,$num2|.*$num2,$num1" > data_extract.log

greetings,

Michael
Graham Cameron_1
Honored Contributor

Re: multiple variable search


Edgar

Not sure if I've understood.
My solution looks nothing like any other, so probably not.

In your initial post you are looking for pairs of numbers; later you talk about them coming from input.lst.

I would try awk, using the BEGIN clause to read in your numbers file, then processing all other files, comparing against the lists.

It is horribly inefficient.

Create a file, say edgar.awk, containing
--
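BEGIN {
    # Load the number pairs from input.lst before any data is read
    pairs = 0
    while ((getline line < "input.lst") > 0) {
        n = split(line, pair, ",")
        if (n == 2) {
            pairs++
            number1[pairs] = pair[1]
            number2[pairs] = pair[2]
        }
    }
    close("input.lst")
}

{
    # Print the record as soon as one pair matches, then move to the next record
    for (i = 1; i <= pairs; i++) {
        if (($0 ~ number1[i]) && ($0 ~ number2[i])) {
            print
            next
        }
    }
}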
--

and invoke with awk -f edgar.awk *.csv (where *.csv represents your comma separated files).

I think from your earlier example that this might be
gzcat filename*.gz | awk -f edgar.awk
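
For anyone who prefers to keep everything in one script, here is a rough sketch of the same lookup as a shell function with the awk program inline. The name match_pairs is made up, the list file is passed as an argument instead of being hard-coded, and index() is used for a plain substring test where the program above uses a regular expression match.

```shell
#!/bin/sh
# match_pairs LIST: read records on standard input and print each one that
# contains both numbers of some pair in LIST.
# (Hypothetical helper name; assumes LIST is not empty.)
match_pairs() {
    awk -F, '
        # First input (the list): remember each well-formed pair
        NR == FNR {
            if (NF == 2) { n++; a[n] = $1; b[n] = $2 }
            next
        }
        # Remaining input: print the record on the first pair that matches
        {
            for (i = 1; i <= n; i++) {
                if (index($0, a[i]) && index($0, b[i])) {
                    print
                    next
                }
            }
        }
    ' "$1" -
}

# Usage:
# gzcat filename*.gz | match_pairs input.lst > data_extract.log
```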

-- Graham
Computers make it easier to do a lot of things, but most of the things they make it easier to do don't need to be done.
Michael Schulte zur Sur
Honored Contributor

Re: multiple variable search

Hi,

if the problem is solved, could you spare some points from the endless supply of points, HP has, for those, who could help you? ;-)

greetings,

Michael