Re: PERL for HTML file parsing
02-27-2007 04:51 PM
attached (named "input html.doc"). Its source is also attached in "report source code.txt".
I just want to separate the data; for example, the first line should be:
NHTEST-3848498958-NHTEST-10.2-no-baloo a
and so on for the whole report.
I have a Perl script, also attached, named "perl coding for parsing.txt". It gives the required output.
Now suppose I have more than one file, i.e. 20 reports in HTML format, and I have to compare the values of the tables across the different report files (e.g. compare the buffer cache values from the different reports).
How can I do that? Please give me some ideas.
I need a script to do this in Unix shell or Perl. Can you help me with this?
Regards.
Waiting for your reply.
I have used:
sed -n "s/.*Buffer Cache:<\/TD><[^>]*> *\([0-9,]*[A-Za-z]*\)<\/TD><[^>]*> *\([0-9,]*[A-Za-z]*\).*/\1 \2/p" report.txt
It gives the correct values for "Buffer Cache", but because the tags differ it can't give the correct values for "Redo Size". I think I can only do this with a script, so please help.
02-27-2007 06:07 PM
I don't quite get it: if you have a text file, why struggle with HTML? A text file has no tags or formatting information. Just grep out the needed values ("Redo size") and compare them.
02-27-2007 06:21 PM
Then how do I compare different values from different text files?
02-27-2007 06:41 PM
One file at a time works well: just grep the needed value out of every file and compare the results. For example, you can write a script that processes one file. The output of this script is a single line containing the needed values ("Redo size", "Logical reads" and so on) separated by '\t', a comma, or whatever you want. Then run this script against all the text files and collect the output in another file.
For example:
#!/bin/sh
OUTPUT='./output.txt'
cat /dev/null > $OUTPUT
for FILE in `find . -name "*.txt"`;
do
script_process $FILE >> $OUTPUT
done;
In this example "processing_script" is a Perl script that greps out the needed values.
That's all.
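Once the values are collected, comparing them across reports can be as simple as sorting the collected lines. The sketch below assumes each collected line has the form "file name, tab, value" (the file-name column and the function name rank_by_value are illustrative, not part of the script above) and that values may contain thousands separators such as 12,000.

```shell
#!/bin/sh
# Hypothetical helper: reads "name<TAB>value" lines on stdin and
# prints them ranked by value, largest first.
rank_by_value() {
    # Remove thousands separators from column 2 so that sort sees plain numbers.
    awk -F '\t' 'BEGIN { OFS = "\t" } { gsub(/,/, "", $2); print $1, $2 }' |
    # Sort on column 2 only, numerically, in reverse order.
    sort -t "$(printf '\t')" -k 2,2nr
}
```

For instance, rank_by_value < output.txt would list the report with the largest value first, which makes outliers among the 20 reports easy to spot.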
02-27-2007 08:23 PM
What is script_process $FILE >> $OUTPUT? You wrote "processing_script" as the Perl script name.
Also, do I have to write the required item in -name?
Can you add comments to your script so that it is a little easier for me?
02-27-2007 08:42 PM
Yes, I mean that script_process is the processing script, written in Perl. It takes one argument (the name of the file to process) and greps out the values; its output is redirected to the file $OUTPUT.
You should also give the path to the processing script. The correct version is:
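#!/bin/sh
# File that collects one line of values per report
OUTPUT='./output.txt'
# Start with an empty output file
cat /dev/null > "$OUTPUT"
# Process every .txt file found under the current directory
# (skipping the output file itself, which also matches *.txt)
for FILE in `find . -name "*.txt" ! -name "output.txt"`
do
    # Append the values extracted from this report
    ./script_process "$FILE" >> "$OUTPUT"
done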
The command find . -name "*.txt" outputs the list of .txt files in the current directory and its subdirectories. You can point it at another directory; it is just an example of how to tell your script which files to process.
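A variant of the same loop, as a rough sketch: it takes the directory and the processing command as arguments, skips the output file, and prefixes each line with the report's file name so the values can be told apart later. The function name collect_values and the file-name column are assumptions made for illustration; the processing command stands in for ./script_process.

```shell
#!/bin/sh
# Hypothetical helper: collect_values DIR COMMAND [ARGS...]
# Runs COMMAND on every .txt file under DIR and prints
# "file name<TAB>command output" for each one.
collect_values() {
    dir=$1
    shift
    # List the report files, leaving out the collected output file,
    # and sort the list so the result has a stable order.
    find "$dir" -type f -name '*.txt' ! -name 'output.txt' | sort |
    while IFS= read -r file; do
        # "$@" is the processing command, e.g. ./script_process
        printf '%s\t%s\n' "$(basename "$file")" "$("$@" "$file" < /dev/null)"
    done
}
```

It could be called as collect_values /path/to/reports ./script_process > output.txt. Reading the file list line by line also copes with spaces in file names, which the backquoted for loop does not.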
02-27-2007 09:00 PM
But what I actually need is the Perl script that greps the values out of the text file.
02-27-2007 09:03 PM
If "report source code.txt" is HTML, you must convert it to text. It can be done like this:
for each table in the HTML document
match the string that contains the entire table
eliminate the <td>, </td> and <tr>, </tr> tags and replace them with "\t" and "\n" respectively. Of course, cut off the <table> </table> tag pair.
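The steps above can be sketched in shell with tr and awk. This is only an illustration under simple assumptions: the input holds one plain table, cells end with </td> or </th>, rows end with </tr>, and entities such as &nbsp; are not decoded. The function name html_table_to_tsv is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: reads HTML on stdin, prints the table rows
# as tab-separated text, one row per line.
html_table_to_tsv() {
    # Join all input lines into one, so tags split across lines are handled,
    # and terminate that single line with a newline for awk.
    { tr '\n' ' '; echo; } |
    awk '{
        gsub(/<\/[Tt][DdHh]>/, "\t")   # end of a cell becomes a tab
        gsub(/<\/[Tt][Rr]>/, "\n")     # end of a row becomes a newline
        gsub(/<[^>]*>/, "")            # drop every remaining tag
        gsub(/[ ]*\t[ ]*/, "\t")       # trim spaces around the tabs
        gsub(/\t\n/, "\n")             # drop the tab left after the last cell
        gsub(/\n[ ]+/, "\n")           # trim spaces at the start of each row
        sub(/^[ ]+/, "")               # trim spaces before the first row
        sub(/[ \n]+$/, "")             # trim trailing blanks and empty lines
        print
    }'
}
```

After this conversion a row such as "Redo size:" followed by its values sits on one line, so a plain grep on the label works no matter which attributes the original tags carried.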
02-27-2007 09:38 PM
Then it will be easy for me, and I can parse any HTML file just by giving it as an argument.
02-27-2007 10:38 PM
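#!/usr/local/bin/perl
# Open the file named on the command line
open SRC, "<", $ARGV[0] or die "Cannot open $ARGV[0]: $!";
# Set the line delimiter to undef
# so that the file can be treated as one string
$/ = undef;
# Read the data
$data = <SRC>;
# Close the file
close SRC;
# Take the table part of the string
# (the pattern was lost from the post; <table ... </table> is assumed)
$data =~ /<table.*<\/table>/is;
$data = $&;
# Get rid of the HTML
# (some patterns were lost from the post; lines marked "assumed" are reconstructions)
$data =~ s/<br>/\n/ig;              # assumed
$data =~ s/<\/table>//ig;
$data =~ s/><\/th>/>Column\t/ig;
$data =~ s/<\/th>/\t/ig;
$data =~ s/<\/TD><\/TR>//ig;
$data =~ s/<\/td>/\t/ig;
$data =~ s/<\/tr>//ig;
$data =~ s/<[^>]*>//g;              # assumed: strip the remaining tags
$data =~ s/\x20{2,}/\t/ig;
$data =~ s/&nbsp;/\t/ig;            # assumed: the entity was rendered as a space
$data =~ s/\t{2,}/\t/ig;
# For example, we want the redo size
$data =~ /redo size:\s{1,}([\d\.\,]{1,})\s{1,}([\d\.\,]{1,}).*/is;
# Output the result
print $1, "\t", $2, "\n";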
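use HTML::TokeParser;
# Read the whole file into one string
open SRC, "<", $ARGV[0] or die "Cannot open $ARGV[0]: $!";
$/ = undef;
$data = <SRC>;
close SRC;
# HTML::TokeParser takes a reference to the document string
$html = HTML::TokeParser->new(\$data);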