Operating System - HP-UX

Re: HTML Data Extraction By perl scripting

 
SOLVED
Dodo_5
Frequent Advisor

HTML Data Extraction By perl scripting

I have an HTML report file; part of the whole report is attached.
I just want to separate the data so that, for example, the first line reads:

NHTEST-3848498958-NHTEST-10.2-no-baloo a
and so on for the whole report.

How can I separate the data from tables in that kind of format using Perl (or Unix) scripting?

Please help, guys. Please write the script as a whole, otherwise it will be difficult for me to understand.
It's urgent, please...
9 REPLIES
Oviwan
Honored Contributor

Re: HTML Data Extraction By perl scripting

Hey

You can take another snapshot:
9i: @?/rdbms/admin/spreport
10g: @?/rdbms/admin/awrrpt.sql

Then choose text as the output format; this is easier to modify.

Regards
Peter Godron
Honored Contributor

Re: HTML Data Extraction By perl scripting

Hi,
Surely TCS has the resources/experience to do this?!

Take the report in HTML format and pull out the table rows (the text between each TR and /TR tag).
Then strip all remaining HTML tags; what is left is the plain table data.
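
The two steps just described can be sketched in core Perl, with no extra modules. This is only a rough illustration under assumptions: the sub name rows_from_html and the sample fragment are made up, cells are joined with "-" as the original question asks, and only the &nbsp; entity is handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return one string per table row, cells joined by "-".
sub rows_from_html {
    my ($html) = @_;
    my @rows;
    # Step 1: pull out each <tr>...</tr> block (/s lets "." span newlines, /i ignores case).
    while ($html =~ m{<tr\b[^>]*>(.*?)</tr\s*>}gis) {
        my $row = $1;
        my @cells;
        # Step 2: take each <td> or <th> cell and strip whatever markup is left in it.
        while ($row =~ m{<t[dh]\b[^>]*>(.*?)</t[dh]\s*>}gis) {
            my $cell = $1;
            $cell =~ s/<[^>]*>//g;      # remove remaining tags such as <b> or <a>
            $cell =~ s/&nbsp;/ /gi;     # only this one entity is translated
            $cell =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
            push @cells, $cell if length $cell;
        }
        push @rows, join('-', @cells) if @cells;
    }
    return @rows;
}

# Made-up sample fragment; for a real report, read the file contents into a string instead.
my $sample = q{<table>
<tr><td>NHTEST</td><td>3848498958</td><td class="x"> NHTEST </td></tr>
<tr><th>Begin Snap:</th><td><b>1728</b></td></tr>
</table>};

print "$_\n" for rows_from_html($sample);
```

Regular expressions like these only cope with simple, well-formed tables. Nested tables or sloppy markup are better left to a real HTML parser.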

Quick check on the web:
http://www.thescripts.com/forum/thread49414.html
http://www.wdvl.com/Authoring/Languages/Perl/PerlfortheWeb/summarizer.html
http://www.unix.org.ua/orelly/perl/cookbook/ch20_07.htm
http://cpan.uwinnipeg.ca/htdocs/HTML-Strip/HTML/Strip.html


Please also read:
http://forums1.itrc.hp.com/service/forums/helptips.do?#33 on how to reward any useful answers given to your questions.

Dodo_5
Frequent Advisor

Re: HTML Data Extraction By perl scripting

When I tried to run the scripts, I got this error:
Can't locate HTML/TableExtract.pm in @INC (@INC contains: /usr/lib/perl5/5.8.5/i386-linux-thread-multi

Actually, I don't have admin rights on the machine.
Can you please help me write a Perl script to extract data from the table in an HTML file that exists on my PC (a local file, not a URL)?
Dodo_5
Frequent Advisor

Re: HTML Data Extraction By perl scripting

Please go through the source of the HTML file and send me a solution. It's urgent.
It's part of the whole report.
James R. Ferguson
Acclaimed Contributor

Re: HTML Data Extraction By perl scripting

Hi:

> ...how to seperate the datas from tables that kind of format with the use of perl(or unix )scripting. i dont have administrator rights in my pc.so pls send script without having such commands(which needs admin privelege)...please help guys..write the script as a whole pls. otherwise it will be difficult to understand for me
its urgent plsss...

Without payment for doing your job, I don't think anyone is going to write a solution that earns you your pay.

use Perl;

That said, however, you don't need administrator rights to install modules locally in directories with which you have write-access.

http://www.cpan.org/modules/INSTALL.html
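
As a rough sketch of what using a locally installed module can look like: the directory $HOME/perl5/lib/perl5 below is an assumption (it is where modules land if they are built with, for example, perl Makefile.PL INSTALL_BASE=$HOME/perl5), so adjust it to wherever you actually installed them. The script adds that directory to @INC and reports whether the module can be found.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed private module directory; change this to your own install location.
use lib "$ENV{HOME}/perl5/lib/perl5";

# Load the module at run time so a missing install gives a readable message
# instead of a compile-time "Can't locate ... in @INC" error.
my $module = 'HTML::TokeParser';
if (eval "require $module; 1") {
    print "$module loaded from $INC{'HTML/TokeParser.pm'}\n";
} else {
    print "$module not found; directories searched:\n";
    print "  $_\n" for @INC;
}
```

Setting the PERL5LIB environment variable to the same directory has a similar effect without editing the script.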

As for parsing the HTML, you should look at modules like: HTML::Parser, HTML::FormatText, HTML::LinkExtor just to name a few. Fetch what you need from CPAN:

http://www.cpan.org/

Regards!

...JRF...

Ralph Grothe
Honored Contributor

Re: HTML Data Extraction By perl scripting

Hi,

I'd like to second James's statement about doing your chores.
Parsing tagged markup is a bit more involved, especially if it's not well formed.
But there exist standard Perl HTML parsers for the task.
Basically there seem to be two avenues.
Either use HTML::TreeBuilder
http://search.cpan.org/~petek/HTML-Tree-3.23/lib/HTML/TreeBuilder.pm
or HTML::TokeParser
http://search.cpan.org/~gaas/HTML-Parser-3.56/lib/HTML/TokeParser.pm
If you can afford the time, I would suggest trying both modules to get an idea of the different ways HTML can be treated.
Of course almost every scripting language should have HTML parsers for this purpose.
Madness, thy name is system administration
Dodo_5
Frequent Advisor

Re: HTML Data Extraction By perl scripting

Thanks, but I was expecting a little bit more from you experts.
Ralph Grothe
Honored Contributor
Solution

Re: HTML Data Extraction By perl scripting

Ok Dodo,
without any guarantee that this will be of any value, I tinkered up the tiny attached script, which uses Perl and the HTML::TokeParser module.
You will have to check for yourself what exactly your HTML looks like and what you need to really parse.
E.g. my script produces this:

$ ./shp.pl
NHTEST
3848498958
NHTEST
1
10.2.0.2.0
NO
baloo_a
Begin Snap:
1728
02-Feb-07 20:00:35
20
3.1
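
The attached script itself is not reproduced in this thread. Purely as an illustration, and not the script from the attachment, a minimal HTML::TokeParser sketch that prints one table cell per line might look like the following. The sub name table_cells and the sample fragment are made up, and the code assumes the data sits in plain td cells.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;    # part of the HTML-Parser distribution on CPAN

# Hypothetical helper: return the trimmed text of every <td> cell, in document order.
sub table_cells {
    my ($html) = @_;
    # A reference to a string tells the parser to read from that string.
    my $parser = HTML::TokeParser->new(\$html)
        or die "cannot create parser\n";
    my @cells;
    # get_tag skips ahead to the next <td> start tag, or returns undef at the end.
    while (my $tag = $parser->get_tag('td')) {
        # Collect the text up to the closing </td>, with whitespace collapsed and trimmed.
        my $text = $parser->get_trimmed_text('/td');
        push @cells, $text if length $text;
    }
    return @cells;
}

# Made-up sample fragment standing in for the real report.
my $sample = '<table><tr><td>NHTEST</td><td> 3848498958 </td><td><b>NO</b></td></tr></table>';
print "$_\n" for table_cells($sample);
```

To parse a local file instead of a string, pass the file name to HTML::TokeParser->new rather than a string reference.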
Madness, thy name is system administration
Dodo_5
Frequent Advisor

Re: HTML Data Extraction By perl scripting

Really amazing... you have done a great job. I am very thankful to you.
For now it's really enough...
I think I will get more work on this script. Keep in touch.
Thanks to all...