
Gyan
New Member

Extract pattern recursively to new file


Uno punto
Dos punto
transaction


punto
Dos
transaction


Uno punto
Dos punto
transaction


I want to extract each block between the opening transaction tag (the one carrying the id attribute) and its closing tag into a new file, with the filename taken from the id of the transaction element.

I want to know how this can be done using awk or sed. Kindly share your thoughts.

Regards,
Gyan
Mel Burslan
Honored Contributor
Solution

Re: Extract pattern recursively to new file

I am sure there is a more elaborate way of doing this with Perl, but my way is a quick-and-dirty shell script (for any quasi-POSIX-compliant shell, such as bash, ksh, or HP-UX's sh).


startline=`grep -n "transaction id" myfile.xml | head -1`
startlineno=`echo "${startline}" | cut -d: -f1`
transname=`echo "${startline}" | cut -d"=" -f2 | sed -e 's/[">]//g'`
endlineno=`grep -n "/transaction" myfile.xml | head -1 | cut -d: -f1`
(( sl=${startlineno}+1 ))
(( el=${endlineno}-1 ))
sed -n "${sl},${el}p" myfile.xml > "${transname}"
sed -e "${startlineno},${endlineno}d" myfile.xml > tmpxmlfile
cat tmpxmlfile > myfile.xml


At this point, you have extracted the first XML block from the file, placed it in a file named after the transaction id, and deleted the extracted portion from the original file.

Now, for the rest of your homework, you need to build a while/do construct that goes through the file until there is nothing left in it. It is only fair that you do part of the homework yourself :) Isn't it?
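
One possible way to finish that loop, offered as a minimal sketch rather than a tested production script. It assumes the opening tag looks like <transaction id="..."> with the id as its only attribute; the sample file contents and the inner tag names (a, b) are hypothetical. Like the single-block version, it empties myfile.xml as it goes, so run it on a copy.

```shell
#!/bin/sh
# Hypothetical sample input; the inner tag names are assumptions.
cat > myfile.xml <<'EOF'
<transaction id="uno">
<a>1 Uno punto</a>
<b>1 Dos punto</b>
</transaction>
<transaction id="dos">
<a>2 punto</a>
<b>2 Dos</b>
</transaction>
EOF

# Repeat the single-block extraction while an opening tag is still present.
while grep -q "transaction id" myfile.xml
do
    startline=`grep -n "transaction id" myfile.xml | head -1`
    startlineno=`echo "${startline}" | cut -d: -f1`
    # Keep only the id value: drop the double quotes and the closing ">".
    transname=`echo "${startline}" | cut -d"=" -f2 | sed -e 's/[">]//g'`
    endlineno=`grep -n "/transaction" myfile.xml | head -1 | cut -d: -f1`
    sl=$(( startlineno + 1 ))
    el=$(( endlineno - 1 ))
    # Write the lines between the tags to a file named after the id.
    sed -n "${sl},${el}p" myfile.xml > "${transname}"
    # Remove the processed block (tags included) from the working file.
    sed -e "${startlineno},${endlineno}d" myfile.xml > tmpxmlfile
    cat tmpxmlfile > myfile.xml
done
rm -f tmpxmlfile
```

With the sample above this leaves two files, uno and dos, each holding the two lines between its tags. Rescanning the file on every pass is slow for large inputs, which is one reason a single-pass awk or perl approach may be preferable.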
________________________________
UNIX because I majored in cryptology...
Hein van den Heuvel
Honored Contributor

Re: Extract pattern recursively to new file


There is probably some XML tool to do this for you.

But if it is for a one-off, controlled situation, then you may try, for example, a quick Perl solution:

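# perl -ne 'close F if m~^</tra~; print F if F; open (F,">$1.tmp") if /^<transaction id="(.+)"/' x.tmp

explanation

# perl -ne ' ## start perl looping over input
close F if m~^</tra~; ## close the output file at the end of a transaction
print F if F; ## Print any line while file handle defined.
open (F,">$1.tmp") ## Open a file F for output with name from variable $1
if /^<transaction id="(.+)"/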

Full example below.
hth,
Hein


# ls *.tmp
x.tmp
# perl -ne 'close F if m~^</tra~; print F if F; open (F,">$1.tmp") if /^<transaction id="(.+)"/' x.tmp
# ls -l *.tmp
-rw-r--r-- 1 Administrator None 70 Apr 8 10:34 dos.tmp
-rw-r--r-- 1 Administrator None 80 Apr 8 10:34 tres.tmp
-rw-r--r-- 1 Administrator None 80 Apr 8 10:34 uno.tmp
-rw-r--r-- 1 Administrator None 355 Apr 8 10:34 x.tmp
# cat dos.tmp
2 punto
2 Dos
transaction
#
# cat x.tmp

1 Uno punto
1 Dos punto
transaction


2 punto
2 Dos
transaction


3 Uno punto
3 Dos punto
transaction
#


Hein van den Heuvel
Honored Contributor

Re: Extract pattern recursively to new file

I forgot to explain one line in detail:

if /^<transaction id="(.+)"/' x.tmp

Broken down

^ ## at begin of line
<transaction id=" ## match the literal start of the opening tag
(.+)"/ ## remember anything up to the closing " in variable $1
' x.tmp ## close the program text, and specify input.

You surely noticed how I changed your input file example to be able to distinguish lines by id.

Hein.
Gyan
New Member

Re: Extract pattern recursively to new file

I tried to modify it slightly as below
startline=`grep -n "transactiion id" myfile.xml | head -1`
startlineno=`echo ${startline}|cut -d: -f1`
transname=`echo ${startline}|cut -d"=" -f2|sed -e "1,1s/\"//g"|sed -e "s/>//g"`
endlineno=`grep -n "/transactiion" myfile.xml | head -1 |cut -d: -f1`
s2=`expr "${startlineno}"`
s1=`echo $s2 | awk '{print $1 + 1}'`
e1=`echo $endlineno | awk '{print $1 - 1}'`
cat myfile.xml | sed -n ${sl},${el}p > $transname

but it throws the error: sed: -e expression #1, char 1: unknown command: `,'
I think it is caused by the last line: cat myfile.xml | sed -n ${sl},${el}p > $transname
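
A likely cause, judging only from the script as posted: the two awk lines assign s1 and e1 (digit one), while the sed line reads ${sl} and ${el} (letter l). Those are never set, so sed receives the expression ",p", which matches the reported "unknown command" at char 1. The sketch below reproduces the expansion with hypothetical line numbers in place of the grep results and shows one way to guard against it.

```shell
#!/bin/sh
# Hypothetical line numbers standing in for the grep results.
startlineno=1
endlineno=5
unset sl el

s1=$(( startlineno + 1 ))   # name ends in the digit one
e1=$(( endlineno - 1 ))

# ${sl} and ${el} (letter l) are unset, so the address range vanishes.
broken="${sl},${el}p"
# Using the names that were actually assigned gives a valid range.
fixed="${s1},${e1}p"
printf '%s\n' "$broken" "$fixed"   # prints ",p" then "2,4p"

# ${var:?} makes the shell stop with a message if var is unset or empty,
# instead of handing sed a malformed expression.
out=`printf '%s\n' a b c d e | sed -n "${s1:?},${e1:?}p"`
printf '%s\n' "$out"
```

Quoting the sed expression and using the ${var:?} form are optional, but they turn a silent empty expansion into an immediate, readable failure.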

I have still not tested it in perl as i am having problems with cygwin setup with perl
Hein van den Heuvel
Honored Contributor

Re: Extract pattern recursively to new file


>> not tested it in perl as i am having problems with cygwin setup with perl


fwiw... I tested my answer under cygwin.

Hein
Gyan
New Member

Re: Extract pattern recursively to new file

My environment supports only bash scripting; that is why I wrote in my initial post:

"I want to know how this can be done using awk or sed."

Hein van den Heuvel
Honored Contributor

Re: Extract pattern recursively to new file


contemplation:

Hmmm, you entered your question in an HP-UX forum. That suggests that Perl is available as just another tool.

You really want to get perl in your toolbox... it is handy.

condemnation:

Anyway, the same perl method maps easily onto AWK:

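awk '/^<\/tra/ {F=""} F {print $0>F} /^<tra/ {split($0,a,"\""); F=a[2] ".tmp"}' x.tmp

explanation:

/^<\/tra/ {F=""} ## mark file as close at end of transaction
F {print $0>F} ## IF there is a filename then print on that file.
/^<tra/ {split($0,a,"\""); ## at start of a transaction, split the line on double quotes
F=a[2] ".tmp" } ## compose output file name from split array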
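
The explanation above can be sketched as a self-contained script. This is a variant under stated assumptions, not the exact command from the post: it assumes opening tags of the form <transaction id="..."> at the start of a line, uses hypothetical inner tags (a, b, c) in the sample data, and adds a close() call so that an input with many transactions does not keep every output file open at once, which some awk implementations limit.

```shell
#!/bin/sh
# Hypothetical sample input; the inner tag names are assumptions.
cat > x.tmp <<'EOF'
<transaction id="uno">
<a>1 Uno punto</a>
<b>1 Dos punto</b>
<c>transaction</c>
</transaction>
<transaction id="dos">
<a>2 punto</a>
<b>2 Dos</b>
<c>transaction</c>
</transaction>
<transaction id="tres">
<a>3 Uno punto</a>
<b>3 Dos punto</b>
<c>transaction</c>
</transaction>
EOF

awk '
# Closing tag: release the current output file and stop copying.
/^<\/tra/ { if (F != "") close(F); F = "" }
# Inside a block: copy the line to the current output file.
F != ""   { print $0 > F }
# Opening tag: the id is the second field when splitting on double quotes.
/^<tra/   { split($0, a, "\""); F = a[2] ".tmp" }
' x.tmp

cat dos.tmp
```

The rule order matters: the closing-tag rule runs before the print rule so the closing tag is not copied, and the opening-tag rule runs after it so the opening tag is not copied either.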

demonstration:

# cp x.txt x.tmp
# dir *.tmp
x.tmp
# awk '/^<\/tra/ {F=""} F {print $0>F} /^<tra/ {split($0,a,"\""); F=a[2] ".tmp"}' x.tmp
# dir *.tmp
dos.tmp tres.tmp uno.tmp x.tmp
# cat dos.tmp
2 punto
2 Dos
transaction
# cat x.tmp

1 Uno punto
1 Dos punto
transaction


2 punto
2 Dos
transaction


3 Uno punto
3 Dos punto
transaction

#

salutation:

Regards,
Hein.
Mel Burslan
Honored Contributor

Re: Extract pattern recursively to new file

Well, the solution I provided works fine, sed and all, with the HP-UX /bin/sh shell. Cygwin, or any Windows emulation of a Unix shell, is subject to its own interpretation and implementation of these tools, and compatibility between the two is not always guaranteed.

I am sure there are countless Cygwin-related forums, albeit not as active as this one, and if you ask your question in one of those, sooner or later you will get a response.
Gyan
New Member

Re: Extract pattern recursively to new file

The awk stuff works. Maybe I need to purchase a copy of HP-UX to get things moving :) Thanks, everybody.