extracting data from string

 
SOLVED
Chartier Jerome
Frequent Advisor

Hi All,

I need to extract some tagged lines from a file formatted like this:

flag_start
line1
line2
...
...
Serial_number1
linea
lineb
...
...
flag_stop
flag_start
line1
line2
...
...
Serial_number2
linea
lineb
...
...
flag_stop

In the file, there are several flagged patterns containing Serial_number1, and I would like to extract each whole pattern and write it to a new file ...

Can someone help me with this?

Thanks in advance for your help

Best Regards

Jérôme

J@Y
15 REPLIES
Doug O'Leary
Honored Contributor

Re: extracting data from string

Hey;

My understanding of your request is that you want everything from "Serial_number1" to flag_stop in a different file. Assuming that's the case, it's easy:

sed -n -e '/^Serial_number1/,/^flag_stop/p' ${in_file} > ${new_out_file}

HTH;

Doug

------
Senior UNIX Admin
O'Leary Computers Inc
linkedin: http://www.linkedin.com/dkoleary
Resume: http://www.olearycomputers.com/resume.html
Leif Halvarsson_2
Honored Contributor

Re: extracting data from string

Hi,
Not sure exactly what you want to do, but awk is a very useful tool for processing text files. Search the forum for "awk" and you will find a lot of examples of how to use it.
Chartier Jerome
Frequent Advisor

Re: extracting data from string

Hi all,

Thanks for your answers. In fact, in each file the pattern is flagged between flag_start and flag_stop.
In each file, I recognize the pattern by its Serial_number, and there are several per file.
What I would like is to extract all the patterns (between flag_start and flag_stop) that have the same Serial_number.


Thanks in advance for your help

Jérôme C
J@Y
Simon Hargrave
Honored Contributor

Re: extracting data from string

Hmm, so you want your logic to say: -

If the "serial number" pattern within this particular flagged block matches X, then extract the whole block?

If so then try something like:

awk -F"\n" 'BEGIN { matched=0 ; block="" } \
$1 !~ /^flag_(start|stop)$/ { block = block$1"\n" } \
$1 == "Serial_number1" { matched = 1 } \
$1 == "flag_stop" { if ( matched == 1 ) { print block } ; block="" ; matched=0 }' infile

Basically it sets a flag called "matched" to zero. It then reads each line in turn and appends it to a variable. If it reaches a matching serial number, it sets the "matched" flag to 1. When it reaches "flag_stop" it knows it has finished a block; if "matched" is 1 at that point, it prints the block, otherwise it doesn't. Either way, the buffer and the flag are then reset for the next block.
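
As a quick check of the logic described above, here is a minimal sketch run against a made-up sample file. The file name sample.txt, its contents, and the variable names (serial, block, matched, result) are assumptions for illustration only. Like the command above, it leaves the flag_start and flag_stop lines out of the output.

```shell
#!/bin/sh
# Build a small, made-up input file (contents are illustrative only).
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
flag_start
line1
Serial_number1
linea
flag_stop
flag_start
line2
Serial_number2
lineb
flag_stop
EOF

# Buffer every non-flag line of a block; print the buffer at flag_stop
# only if the wanted serial number was seen inside that block.
result=$(awk -v serial="Serial_number1" '
    $0 == "flag_start" { block = ""; matched = 0; next }
    $0 == "flag_stop"  { if (matched) printf "%s", block; next }
    { block = block $0 "\n" }
    $0 == serial       { matched = 1 }
' "$workdir/sample.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

With this sample, only the first block contains Serial_number1, so only its three inner lines should be printed.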

This suit?
Kent Ostby
Honored Contributor

Re: extracting data from string

Jerome --

Can you give us 10 to 20 lines of a real file?

The problem I'm having is figuring out if Serial_numberN appears multiple times or what.

The following script will print ALL of the data between flag_start and flag_stop. I suspect that's not what you want, but I thought I would post that much while I was asking my questions.

Create a file called useme.awk with the following in it:

BEGIN{daflag=0;}
/flag_start/ {daflag=1;next}
/flag_stop/ {daflag=0;next}
daflag==1 {print $0}

Run:

awk -f useme.awk < inputfile > outputfile


"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Simon Hargrave
Honored Contributor

Re: extracting data from string

Oh, and my example assumed you wanted to strip the flag_start and flag_stop entries. If you want to keep them, simply change the second line to:

{ block = block$1"\n" } \

thus removing the check that excludes the flag lines.
Rajesh SB
Esteemed Contributor

Re: extracting data from string

Hi,

I have an idea for extracting the flagged pattern to share with you, using a combination of commands.

grep -n flag_start filename

where the -n option gives you the line number of each start flag.
Then extract the pattern with the head or tail command, using the -N / +N line-count options.

Hope this hint may give some more idea.
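
One possible reading of this hint, as a rough sketch: use grep -n to find the line numbers of the flag_start and flag_stop lines, then cut that range out with head and tail. The sample file and the variable names (start, stop, block) are assumptions. This only pulls out the first block and does not look at the serial number at all.

```shell
#!/bin/sh
# Made-up sample input; contents are illustrative only.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
header
flag_start
line1
Serial_number1
flag_stop
trailer
EOF

# grep -n prefixes each match with "N:"; keep only the first match's number.
start=$(grep -n '^flag_start$' "$workdir/sample.txt" | head -n 1 | cut -d: -f1)
stop=$(grep -n '^flag_stop$' "$workdir/sample.txt" | head -n 1 | cut -d: -f1)

# Take everything up to the stop line, then drop what precedes the start line.
block=$(head -n "$stop" "$workdir/sample.txt" | tail -n +"$start")

printf '%s\n' "$block"
rm -rf "$workdir"
```

For files with many blocks, this would have to be repeated per pair of line numbers, which is why a single awk or perl pass is usually simpler.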

Regards,
Rajesh

Hein van den Heuvel
Honored Contributor

Re: extracting data from string


How do you recognize a serial number?
Does it always start with "Serial" at the beginning of a line?
You wanted the lines BETWEEN start and stop, not inclusive, right?
If so, then I think the Perl below is what you need.
You'll need to adapt the 'Serial' match to the actual pattern, and you probably want a different algorithm for generating the file name ($file = ...).

hth,
Hein.

while (<>) {
    if (/^flag_start/) {
        # New block: start saving and forget any previous block.
        $save = 1;
        undef $file;
        undef @lines;
        next;
    }
    if (/^flag_stop/) {
        $save = 0;
        if ($file) {
            open(FILE, ">>", $file) or die "could not append to $file: $!";
            print FILE $line while ($line = shift @lines);
            close FILE;
        }
    }
    next unless $save;
    push @lines, $_;
    if (/^Serial/) {
        # Name the output file after the serial number line.
        chomp;
        $file = $_ . ".tmp";
    }
}
Chartier Jerome
Frequent Advisor

Re: extracting data from string

Thanks everyone ...

I would like to include the flag_start and flag_stop lines.
I am sorry, I am very bad at scripting, so ...
Where do I put the input filename in the script?

Thanks in advance


Best Regards


Jérôme
J@Y
Simon Hargrave
Honored Contributor

Re: extracting data from string

Whose script are you referring to? If you mean mine (the one using awk), the input filename goes after the code: the "infile" at the end of my post is the name of the input file. If you want the output in a file, append >outfile to the end of the last line.
Hein van den Heuvel
Honored Contributor

Re: extracting data from string


Still not too clear.
But here is another Perl variation.
This one takes the desired serial number as an input argument.

If you cut & paste the Perl below into a file called, for example, 'extract.p', then you execute it as:

#perl extract.p Serial_number1 < all.data > extracted.data

Between my first example and this tweak you should have all possible needs covered (the second one outputs the start and stop lines... I'm sure you can figure out how).

My first example is simply executed as:

#perl extract.p < all.data

it will create: Serial_number1.tmp and Serial_number2.tmp


Good luck!
Hein

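# First argument is the serial number to select; the rest is read via <>.
$serial = shift @ARGV or die "please provide serial number to select on";
while (<>) {
    if (/^flag_start/) {
        # New block: start saving lines and forget the previous block.
        $save = 1;
        $file = 0;
        undef @lines;
    }
    if (/^flag_stop/) {
        $save = 0;
        push @lines, $_;
        if ($file) {
            # The serial number was seen in this block, so print it.
            print while ($_ = shift @lines);
        }
        print "\n";
    }
    push @lines, $_ if $save;
    $file++ if /$serial/;
}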
Chartier Jerome
Frequent Advisor

Re: extracting data from string

Hello all,

Thanks for your answers.
Hein, your second script seems to work ... congratulations, and thanks for everything.
One last question, Hein: what if I want to do this on all files in the same directory?
For example, files whose names start with the pattern ABC.

Thanks again

Jérôme C
J@Y
Chartier Jerome
Frequent Advisor

Re: extracting data from string

Hi All,

Thanks a lot for your help.
Is it possible to extend the second perl script to pick every file in a directory starting with a common pattern and store all the results in one file?

Thanks all again

Best Regards

Jérôme
J@Y
Muthukumar_5
Honored Contributor
Solution

Re: extracting data from string

You can run the Perl script on several files at once by listing them after the serial number:

perl extract.pl Serial_number1 ABC*.log

It will process all the files matching the ABC*.log pattern.

You can also use this script as,

for file in `ls `
do
perl -ne '{if(/^Serial_number1/){undef @arr;next;}if(/^flag_stop/){exit;}push @arr,$_;}END{print @arr;}' $file
done

For example, with these files:

test.log test1.log test2.log

for file in `ls test*.log`
do
perl -ne '{if(/^Serial_number1/){undef @arr;next;}if(/^flag_stop/){exit;}push @arr,$_;}END{print @arr;}' $file
done > Serial_number1.tmp

hth.
Easy to suggest when don't know about the problem!
Hein van den Heuvel
Honored Contributor

Re: extracting data from string

Jérôme, Good to see you are all set now.
I kinda had read the requirement for multiple input files, but wanted to leave something for you to do :-).

Muthukumar, I discarded a one liner along the lines you showed because it was my understanding that the 'serial number' was after an arbitrary number of data lines that would also be needed. So you need to start remembering at flag_start, not just when you see you are reading data for the right serial number.
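
To illustrate the point about remembering from flag_start, here is a hedged sketch (not one of the scripts posted in this thread) that buffers each block from flag_start onward, keeps the flag lines, and collects the matching blocks from several files into a single output file. The ABC*.log file names, the extracted.data output name, the sample contents and the awk variable names are all assumptions.

```shell
#!/bin/sh
# Made-up input files whose names start with ABC (illustrative only).
workdir=$(mktemp -d)
cat > "$workdir/ABC1.log" <<'EOF'
flag_start
line1
Serial_number1
linea
flag_stop
flag_start
line2
Serial_number2
flag_stop
EOF
cat > "$workdir/ABC2.log" <<'EOF'
flag_start
line3
Serial_number1
flag_stop
EOF

# Start buffering at flag_start so lines before the serial number are kept;
# emit the whole block at flag_stop only if the serial number was seen.
awk -v serial="Serial_number1" '
    $0 == "flag_start" { block = ""; matched = 0; inside = 1 }
    inside             { block = block $0 "\n" }
    $0 == serial       { matched = 1 }
    $0 == "flag_stop"  { if (matched) printf "%s", block; inside = 0 }
' "$workdir"/ABC* > "$workdir/extracted.data"

extracted=$(cat "$workdir/extracted.data")
printf '%s\n' "$extracted"
rm -rf "$workdir"
```

Because awk reads all the files named on the command line in turn, no shell loop is needed, and the single redirection puts every matching block in one file.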

Cheers,
Hein.