Operating System - HP-UX

Re: sorting and formating script - for scripting champions!

 
Bill McNAMARA_1
Honored Contributor

sorting and formating script - for scripting champions!

Hi,
I've got an xml file as follows:


StagsHead

BigTown

Main St.

5

johnny walker



1

Stout

5

4





With lots of Drink sections.

I'd like to format all this data as follows:

Bar: StagsHead

Drink === i d === i d === i d ===
0-31 --- yes --- no --- yes ---
31-64 --- no --- yes --- yes ---

and this way too:

Drink: Stout

Taps === StagsHead === NagsHead === Chocolate
5 --- 5 --- X --- X
4 --- X --- 4 --- X
3 --- X --- X --- 3

etc..

Complicated.. so where to start.

For a start, I'm thinking of giving the script two options, -perbar and -perdrink.
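
A minimal sketch of that option handling, assuming a POSIX shell. The function name report_mode and the usage text are made up for illustration; the actual report code would hang off each branch.

```shell
# report_mode OPTION: map the command-line option to a report name.
# Hypothetical helper; it only validates the option and echoes the mode.
report_mode() {
    case "$1" in
        -perbar)   echo "perbar" ;;
        -perdrink) echo "perdrink" ;;
        *)         echo "usage: -perbar|-perdrink file.xml" >&2
                   return 1 ;;
    esac
}

# Example: mode=$(report_mode "$1") || exit 1
```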

Anyway,
Interested to see what comes up!
All replies rewarded.

Later,
Bill
It works for me (tm)
10 REPLIES 10
harry d brown jr
Honored Contributor

Re: sorting and formating script - for scripting champions!

Bill,

It's time to jump on the perl beerwagon:

http://wwwx.netheaven.com/~coopercc/xmlparser/Parser.html

live free or die
harry
Live Free or Die
Bill McNAMARA_1
Honored Contributor

Re: sorting and formating script - for scripting champions!

I've never used perl, Harry.
Can you give me an example?
I have it installed for what it's worth!
(except for that module..)

Thanks,
Bill
It works for me (tm)
Steven Gillard_2
Honored Contributor

Re: sorting and formating script - for scripting champions!

Perl is definitely the way to go. I gave up on shell scripting a long time ago :) If you know a bit of C, perl isn't hard to pick up at all.

There are perl modules available on www.cpan.org for parsing XML, so you won't have to re-invent the wheel completely.

Cheers,
Steve
Paula J Frazer-Campbell
Honored Contributor

Re: sorting and formating script - for scripting champions!

Hi Bill
For a start, grep out each section to a file.

i.e.
grep Name file.xml > /tmp/bars
grep Town file.xml > /tmp/town

etc.

Search and replace (sed) the < and > with spaces.
Awk out the name field and output to files,
one for each bit of data.
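
The grep, sed and awk steps above could be rolled into one small shell function. This is only a sketch: it assumes each element sits on a single line, as in <Name>StagsHead</Name>, and the function name extract_field is hypothetical.

```shell
# extract_field TAG FILE: print the text of every <TAG>...</TAG> line.
# Assumes one element per line with no attributes on the tag.
extract_field() {
    grep "<$1>" "$2" |
    sed 's/[<>]/ /g' |
    awk '{
        $1 = ""; $NF = ""      # drop the opening and closing tag words
        sub(/^ +/, "")         # trim the blanks left behind
        sub(/ +$/, "")
        print
    }'
}

# Example, using the file names from the post above:
#   extract_field Name file.xml > /tmp/bars
#   extract_field Town file.xml > /tmp/town
```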


Paula
If you can spell SysAdmin then you is one - anon
Sridhar Bhaskarla
Honored Contributor

Re: sorting and formating script - for scripting champions!

Seems we need to do quite some twists.

1. Get all the names

grep "<Name>" thisfile | awk -F'[<>]' '{print $3}'

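2. Now get the block that belongs to each of the names above.

# assumes bar names contain no spaces or sed special characters
for i in $(grep "<Name>" thisfile | awk -F'[<>]' '{print $3}')
do
sed -n "/$i/,/<\/Bar>/p" thisfile > /tmp/name$$
another_function /tmp/name$$
done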

3. This other function needs to pick out the drinks, the same way we picked out the names above, and get the properties of each drink. It would be cumbersome without sed and awk here. Then a simple printf statement would format the results.
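
Steps 1 and 2 could be sketched as two shell functions. This assumes every <Name>...</Name> sits on one line inside a <Bar>...</Bar> block, and that bar names contain no characters special to sed. The function names bar_names and bar_block are hypothetical.

```shell
# bar_names FILE: step 1, list every bar name, one per line.
bar_names() {
    grep '<Name>' "$1" | awk -F'[<>]' '{ print $3 }'
}

# bar_block NAME FILE: step 2, print the lines from that bar's
# <Name> element down to the closing </Bar>.
bar_block() {
    sed -n "/<Name>$1<\/Name>/,/<\/Bar>/p" "$2"
}

# Example loop; reading line by line keeps names with spaces intact:
#   bar_names thisfile | while IFS= read -r name; do
#       bar_block "$name" thisfile > "/tmp/name$$"
#   done
```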

You will definitely need a couple of coffees, and maybe some of the drinks mentioned in your xml, before you can get this working.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
David Lodge
Trusted Contributor
Solution

Re: sorting and formating script - for scripting champions!

Eek! Doing this via shell is probably not the best way to do this - the best way would be to use some form of relational database (eg mysql) or use an XML parser for either perl *spit* or C.

If you *have* to use shell then I'd suggest using awk to convert to a halfway house type format of file and then using normal shell tools to do the rest.

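You could use something like the awk below to get the fields out:
awk '
# For every line between <bar> and </bar>, print the text after <name>.
# <name> is assumed here; tag names are case-sensitive, so match them to your file.
/<bar>/,/<\/bar>/ {
    if ($0 ~ "<name>") print "Name: " substr($0, index($0, "<name>") + 6)
}' bar.xml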

Of course this will fail if the xml is like:
The aardvark's head
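
One possible shape for the halfway-house format mentioned above is one line per drink, with the fields separated by |. The sketch below assumes one element per line; <Name> and <Drink> come from the thread, while <DrinkName>, <Taps> and the function name flatten are placeholders for illustration.

```shell
# flatten FILE: emit one "bar|drink|taps" line per <Drink> block.
# Splitting on < and > makes $2 the tag name and $3 its text.
flatten() {
    awk -F'[<>]' '
        $2 == "Name"      { bar = $3 }     # remember the current bar
        $2 == "DrinkName" { drink = $3 }   # hypothetical tag
        $2 == "Taps"      { taps = $3 }    # hypothetical tag
        $2 == "/Drink"    { print bar "|" drink "|" taps }
    ' "$1"
}

# Normal shell tools can then do the rest, e.g.:
#   flatten bar.xml | sort -t'|' -k2,2
```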

dave
Marco Paganini
Respected Contributor

Re: sorting and formating script - for scripting champions!

Hello Bill,

A task for perl + Bundle::XML.

This Bundle downloads a lot of XML modules from CPAN and allows you to manipulate XML data (to be honest, I started writing your script, but timing issues here prevented me from completing...).

To install Bundle::XML

perl -MCPAN -e "install Bundle::XML"

(as root)

Then, man XML::Parser to see an example on how to do it.

Regards,
Paga
Keeping alive, until I die.
Rodney Hills
Honored Contributor

Re: sorting and formating script - for scripting champions!

Find attached a 50-line perl routine that will parse your data file (without the XML module) and generate one of the reports shown.

The program assumes that the syntax of the xml file is correct and that the tags are all unique.

The data is slurped into a hash variable, and it should be relatively easy to reformat the data into whatever report you would like.
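
The attachment itself is not reproduced in the thread. As a rough illustration of the same idea (load everything into associative arrays, then print whatever report is wanted), here is a sketch in awk rather than perl. It assumes the data has already been reduced to lines of the hypothetical form bar|drink|taps; the function name per_drink is made up.

```shell
# per_drink DRINK: read "bar|drink|taps" lines on stdin and print the
# per-drink table, one row per distinct tap count.
per_drink() {
    awk -F'|' -v want="$1" '
        $2 == want {
            if (!($1 in taps)) bars[++nb] = $1            # bars in input order
            taps[$1] = $3
            if (!($3 in seen)) { seen[$3] = 1; vals[++nv] = $3 }
        }
        END {
            printf "Drink: %s\n\nTaps", want
            for (b = 1; b <= nb; b++) printf " === %s", bars[b]
            printf "\n"
            for (v = 1; v <= nv; v++) {
                printf "%s", vals[v]
                for (b = 1; b <= nb; b++)
                    printf " --- %s", (taps[bars[b]] == vals[v] ? vals[v] : "X")
                printf "\n"
            }
        }
    '
}

# Example: per_drink Stout < /tmp/flat
```

Because the arrays hold every record, a per-bar report would only need a different END block.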

-- Rod Hills
There be dragons...
Rodney Hills
Honored Contributor

Re: sorting and formating script - for scripting champions!

I didn't do a full test of the data. I created some additional data for a test and found a couple of bugs in the program.

Find attached the corrected program.

-- Rod Hills
There be dragons...
Robin Wakefield
Honored Contributor

Re: sorting and formating script - for scripting champions!

Hi Bill,

I don't know what Taps is meant to be, so this may not be correct. I'd go the perl route, but I know how much you like awk ;-)

Run it with:

awk -F\> -f file.awk file.xml

=========================================
BEGIN{i=0}
/////<\/Drink/{
bardrink[name" "id]=1
drinktap[id" "name" "tap]=1
}
END{
  for (j=0;j<i;j++) {
    print "\nBar: "pubs[j]
    print ""
    printf("Drink")
    for (k=0;k<32;k++)
      printf(" === i d")
    printf("\n0-31 ")
    for (k=0;k<32;k++) {
      if (bardrink[pubs[j]" "k]==1)
        printf("--- yes")
      else
        printf("--- no ")
    }
    printf("\n32-63 ")
    for (k=32;k<64;k++) {
      if (bardrink[pubs[j]" "k]==1)
        printf("--- yes")
      else
        printf("--- no ")
    }
    print ""
  }
  for (drink in drinks) {
    printf("\n\nDrink: %s\n\n",drink)
    printf("Taps")
    for (j=0;j<i;j++)
      printf(" === %s",pubs[j])
    for (tap in taps) {
      printf("\n%d ",tap)
      for (j=0;j<i;j++) {
        if (drinktap[drinks[drink]" "pubs[j]" "tap]==1)
          printf("--- %d",tap)
        else
          printf("--- X")
      }
    }
  }
  print ""
}

=========================================

Rgds, Robin