Aakaash
Occasional Contributor

To get script to Grep particular Pattern

Hi Experts,

I am trying to select records from a file (inputfile) based on a second file that contains string patterns. The UNIX script must look for all of these patterns at a particular location in each record.

e.g.

Sample Input File

AAAAAA8301010707010703015
C1$$$1107019999990403266F706F70C 4398 C 5220 C 5227 C 5401 C 5423 C 5515 C 5771 C 578402C 5789 C 5796 C 5807 C 5821 C 582501C 5859 C 6003 C 6225 C 7292 C 9770 C 990903C 993009
C1$$$2107019999990403266E476E47C 408703C 4398 C 5220 C 5227 C 5401 C 5423 C 5515 C 5771 C 578402C 5789 C 5796 C 5807 C 5821 C 582501C 5859 C 6003 C 6225 C 6502 C 7292 C 9770 C 990903C 993009
C1$$$3107019999990403266E486E48C 408703C 6502
C1$$$7107019999990403266F706F70C 4398 C 5220 C 5227 C 5401 C 5423 C 5515 C 5769 C 577501C 5815 C 5821 C 5857 C 6003 C 6225 C 7292 C 9770 C 990903C 993009
C1$$$8107019999990403266E476E47C 408703C 4398 C 5220 C 5227 C 5401 C 5423 C 5515 C 5769 C 577501C 5815 C 5821 C 5857 C 6003 C 6225 C 6502 C 7292 C 9770 C 990903C 993009
C1$$$9107019999990403266E486E48C 408703C 6502
C1***49707019999999809156F706F70C 3315 C 4084 C 4398 C 496109C 511303C 5220 C 5227 C 5423 C 5821 C 6003 C 7021 C 729201C 9770 C 990903C 993009
C1***59707019999999809156E476E47C 3315 C 4084 C 408703C 4398 C 496109C 511303C 5220 C 5227 C 5423 C 5821 C 6003 C 6502 C 7021 C 729201C 9770 C 990903C 993009
C1***69707019999999809156F706F70C 3315 C 4084 C 4398 C 496207C 511304C 5220 C 5227 C 5423 C 5821 C 6003 C 7021 C 729201C 9770 C 990903C 993009
C1***79707019999999809156E476E47C 3315 C 4084 C 408703C 4398 C 496207C 511304C 5220 C 5227 C 5423 C 5821 C 6003 C 6502 C 7021 C 729201C 9770 C 990903C 993009
C1*TT19907019999990206036F706F70C 3315 C 4084 C 4397 C 4398 C 496001C 515106C 5220 C 5227 C 5401 C 5423 C 5821 C 6003 C 7021 C 729201C 9770 C 990903C 993009
C1*TT29907019999990206036E476E47C 3315 C 4084 C 408703C 4397 C 4398 C 496001C 515106C 5220 C 5227 C 5401 C 5423 C 5821 C 6003 C 6502 C 7021 C 729201C 9770 C 990903C 993009
C1*TT39907019999990206036E486E48C 408703C 6502
C1*1AL9807019999990608176F706F70C 1220 C 2503 C 2851 C 3920 C 4398 C 5401 C 5537 C 993009
C1*1CS9707019999990608156E476E47C 408703C 4398 C 5401 C 5515 C 5769 C 5859 C 6225 C 6502 C 993009
C1*1DS9807019999990608216E476E47C 1700 C 3315 C 4084 C 408703C 4398 C 496109C 5401 C 551808C 6502 C 993009
C1*1JG9507019999990608156E476E47C 408703C 4398 C 5401 C 5515 C 6225 C 6502 C 993009


Sample Pattern File:

528H
528J
528K
528L
528M

I need to search for each pattern (e.g. 528H) in locations 3 through 7 only and write the whole record to another file if there is a match.

What I have so far:

egrep -f pattern bi3_forms.txt > inbi3_forms.txt

Problem: This command finds the patterns anywhere in the record, not just in locations 3 through 7.

awk 'substr($0,3,4) == "528H"' bi3_forms.txt > inbi3_forms.txt

Problem: This command finds the pattern in locations 3 through 7 but does not take a pattern file as input.

Any help is greatly appreciated.
7 REPLIES
blah2blah
Frequent Advisor

Re: To get script to Grep particular Pattern

awk 'BEGIN{
while ((getline < "pattFile") > 0)
{a[$0]=1;}
}
{
if ( a[substr($0,3,4)]) print;
}' inputFile > outFile
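
A variant of the same idea, as a minimal sketch: the NR==FNR idiom loads the pattern file in the main rule instead of a getline loop. The sample records below are made up for illustration (the posted sample data has no matching record), and the sketch assumes the pattern file is not empty.

```shell
# Hypothetical sample files, for illustration only.
printf '%s\n' 528H 528J 528K 528L 528M > pattFile
printf '%s\n' 'C1528H0701' 'C1$$$1528H' 'C1528Z0701' 'C1528M9999' > inputFile

# First file (NR == FNR): remember each pattern as an array key.
# Second file: print records whose columns 3-6 are a known key.
awk 'NR == FNR { want[$0]; next }
     substr($0, 3, 4) in want' pattFile inputFile > outFile

cat outFile
```

Using "in" for the lookup avoids creating an empty array entry for every record that is tested, which the a[...] form does.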
Hein van den Heuvel
Honored Contributor

Re: To get script to Grep particular Pattern

That's rather lousy sample data!
It contains neither a good nor a bad sample match.

Is the list always as nice as the example?
If so, you could capture it in a regexp.

awk 'match(substr($0,3,4),/528[HJKLM]/)' data


The prior solution as a one-liner, and a little shorter:

$ awk 'BEGIN{while ((getline < "list") > 0){a[$0]=1;} } a[substr($0,3,4)]' data

fwiw,
Hein.
Dennis Handly
Acclaimed Contributor

Re: To get script to Grep particular Pattern

You might be able to simplify your request by first grepping the whole record for your keys, then using awk to make sure. That's assuming you want 528 as a prefix:
grep -f pattern bi3_forms.txt | awk 'substr($0,3,3) == "528"'
Sandman!
Honored Contributor

Re: To get script to Grep particular Pattern

If the pattern consists of only the data provided then use:

# grep '^..528[HJKLM].*$' infile > outfile

...above will locate the pattern at cols 3-7 only OR to be somewhat generic use:

# grep '^..[0-9]\{3\}[HJKLM].*$' infile > outfile

...above will locate the pattern that has 3 digits followed by either H/J/K/L/M in cols 3-7. Or you could use the one below to locate the pattern that has 3 digits followed by an uppercase letter:

# grep '^..[0-9]\{3\}[A-Z].*$' infile > outfile

Or to locate the pattern that has 3 digits followed by an uppercase or lowercase letter use:

# grep '^..[0-9]\{3\}[A-Za-z].*$' infile > outfile

~hope it helps
Arturo Galbiati
Esteemed Contributor

Re: To get script to Grep particular Pattern

Hi,
use this sample pattern file:
..528H
..528J
..528K
..528L
..528M

grep -f pattern bi3_forms.txt

and grep will look for the pattern starting in position 3.

HTH,
Art
Dennis Handly
Acclaimed Contributor

Re: To get script to Grep particular Pattern

>Art: use this sample pattern file: ..528H
>grep will look for the pattern starting in position 3.

You need to anchor each pattern, otherwise grep will also find it in other columns:
^..528H
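
Rather than editing the pattern file by hand, the anchor could be added on the fly. This is only a sketch: the file name anchored_pattern and the sample records are made up for illustration, and it assumes the patterns contain no regex metacharacters.

```shell
# Hypothetical sample files, for illustration only.
printf '%s\n' 528H 528J 528K 528L 528M > pattern
printf '%s\n' 'C1528H0701' 'XXXX528H01' 'C1528Z0701' 'C1528K9999' > bi3_forms.txt

# Prefix every pattern with "^.." so it can only match starting at column 3.
sed 's/^/^../' pattern > anchored_pattern

grep -f anchored_pattern bi3_forms.txt > inbi3_forms.txt
cat inbi3_forms.txt
```

The second sample record contains 528H, but not starting at column 3, so the anchored patterns skip it.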
Nitin Kumar Gupta
Trusted Contributor

Re: To get script to Grep particular Pattern

If you want to collect 528[HJKLM] starting at the 3rd, 4th, 5th, 6th or 7th character, you can use


grep -e '^..528[HJKLM]' -e '^...528[HJKLM]' -e '^....528[HJKLM]' -e '^.....528[HJKLM]' -e '^......528[HJKLM]' bi3_forms.txt
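
The five -e patterns could probably be folded into one extended regex with an interval expression. A minimal sketch, with made-up sample records:

```shell
# Hypothetical sample records, for illustration only.
printf '%s\n' 'C1528H0701' 'C1XXXX528J' 'C1XXXXX528K' 'C528L00000' > bi3_forms.txt

# ^.{2,6} allows 2 to 6 leading characters, so 528[HJKLM] may start
# in any of columns 3 through 7, the same range as the five -e patterns.
grep -E '^.{2,6}528[HJKLM]' bi3_forms.txt > inbi3_forms.txt
cat inbi3_forms.txt
```

Here the third record has the pattern starting at column 8 and the fourth at column 2, so neither is selected.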


Rgds
-NKG-