<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PERL for HTML file parsing in Operating System - HP-UX</title>
    <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952587#M290506</link>
    <description>This is a sample script. I am not strong in HTML::TokeParser, so I used regexps to get rid of the HTML.&lt;BR /&gt;&lt;BR /&gt;#!/usr/local/bin/perl&lt;BR /&gt;&lt;BR /&gt;#open the file named on the command line&lt;BR /&gt;open(SRC, '&amp;lt;', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";&lt;BR /&gt;#set line delimiter to undef&lt;BR /&gt;#so we can treat the whole file as one string&lt;BR /&gt;$/ = undef;&lt;BR /&gt;#read data&lt;BR /&gt;my $data = &amp;lt;SRC&amp;gt;;&lt;BR /&gt;#close file&lt;BR /&gt;close SRC;&lt;BR /&gt;&lt;BR /&gt;#take the table part: from the first opening table tag to the last closing one&lt;BR /&gt;if ($data =~ /(&amp;lt;table.*&amp;lt;\/table&amp;gt;)/is) { $data = $1; }&lt;BR /&gt;&lt;BR /&gt;#get rid of html&lt;BR /&gt;$data =~ s/&amp;lt;p\b[^&amp;gt;]*&amp;gt;/\n/ig;&lt;BR /&gt;#label empty header cells&lt;BR /&gt;$data =~ s/&amp;gt;&amp;lt;\/th&amp;gt;/&amp;gt;Column\t/ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/th&amp;gt;/\t/ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/TD&amp;gt;&amp;lt;\/TR&amp;gt;//ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/td&amp;gt;/\t/ig;&lt;BR /&gt;#strip every remaining tag&lt;BR /&gt;$data =~ s/&amp;lt;[^&amp;gt;]+&amp;gt;//g;&lt;BR /&gt;$data =~ s/\x20{2,}/\t/g;&lt;BR /&gt;$data =~ s/&amp;amp;nbsp;/\t/ig;&lt;BR /&gt;$data =~ s/\t{2,}/\t/g;&lt;BR /&gt;&lt;BR /&gt;#for example we want redo size&lt;BR /&gt;if ($data =~ /redo size:\s{1,}([\d\.\,]{1,})\s{1,}([\d\.\,]{1,}).*/is) {&lt;BR /&gt;   #output result&lt;BR /&gt;   print $1, "\t", $2, "\n";&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;open SRC, "$ARGV[0]";&lt;BR /&gt;$/= undef;&lt;BR /&gt;$data=&amp;lt;SRC&amp;gt;;&lt;BR /&gt;close SRC;&lt;BR /&gt;$html = HTML::TokeParser-&amp;gt;new(\$data);&lt;BR /&gt;</description>
    <pubDate>Wed, 28 Feb 2007 06:38:19 GMT</pubDate>
    <dc:creator>Maxim Yakimenko</dc:creator>
    <dc:date>2007-02-28T06:38:19Z</dc:date>
    <item>
      <title>PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952578#M290497</link>
      <description>I have an HTML report file; part of the report is attached as "input html.doc", and its source is attached as "report source code.txt".&lt;BR /&gt;&lt;BR /&gt;I just want to separate the data so that the first line reads:&lt;BR /&gt;&lt;BR /&gt;NHTEST-3848498958-NHTEST-10.2-no-baloo a&lt;BR /&gt;&lt;BR /&gt;and so on for the whole report.&lt;BR /&gt;&lt;BR /&gt;I have a Perl script, also attached, named "perl coding for parsing.txt". It gives the required output.&lt;BR /&gt;&lt;BR /&gt;Now suppose I have more than one file, e.g. 20 reports in HTML format, and I have to compare the values of the tables across the different report files (e.g. compare the Buffer Cache values from each report file).&lt;BR /&gt;&lt;BR /&gt;How can I do that? Please give me some ideas.&lt;BR /&gt;I need a script to do this in Unix shell or Perl. Can you help me with this?&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;waiting for your reply.&lt;BR /&gt;&lt;BR /&gt;I have used:&lt;BR /&gt;&lt;BR /&gt;sed -n "s/.*Buffer Cache:&amp;lt;\/TD&amp;gt;&amp;lt;[^&amp;gt;]*&amp;gt; *\([0-9,]*[A-Za-z]*\)&amp;lt;\/TD&amp;gt;&amp;lt;[^&amp;gt;]*&amp;gt; *\([0-9,]*[A-Za-z]*\).*/\1 \2/p" report.txt&lt;BR /&gt;&lt;BR /&gt;It gives correct values for "Buffer Cache", but because the tags differ it can't give correct values for "Redo Size". I think I can only do this with a script, so please help.</description>
      <pubDate>Wed, 28 Feb 2007 00:51:31 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952578#M290497</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-02-28T00:51:31Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952579#M290498</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;I don't quite get it: if you have a text file, why struggle with HTML? A text file has no tags or formatting info; just grep out the needed values (e.g. "Redo size") and compare them.</description>
      <pubDate>Wed, 28 Feb 2007 02:07:09 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952579#M290498</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T02:07:09Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952580#M290499</link>
      <description>I have the text file for one HTML file. For more than one file I will get more text files.&lt;BR /&gt;Then how do I compare values across the different text files?</description>
      <pubDate>Wed, 28 Feb 2007 02:21:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952580#M290499</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-02-28T02:21:14Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952581#M290500</link>
      <description>So, what is the problem? To get several values for comparison you would have to process several HTML files. Instead, process several text files.&lt;BR /&gt;One to one is good: just grep the needed value out of every file and compare. For example, you can write a script that processes one file. The output of this script is a line containing the needed values ("Redo size", "Logical reads" and so on) separated by '\t', a comma, or whatever you want. Then run this script against all the text files and collect the output in another file.&lt;BR /&gt;&lt;BR /&gt;For example:&lt;BR /&gt;&lt;BR /&gt;#!/bin/sh&lt;BR /&gt;OUTPUT='./output.txt'&lt;BR /&gt;# start with an empty output file&lt;BR /&gt;cat /dev/null &amp;gt; "$OUTPUT"&lt;BR /&gt;# skip output.txt itself so it is not processed as a report;&lt;BR /&gt;# this loop assumes file names without spaces&lt;BR /&gt;for FILE in `find . -name "*.txt" ! -name "output.txt"`;&lt;BR /&gt;do&lt;BR /&gt;   script_process "$FILE" &amp;gt;&amp;gt; "$OUTPUT"&lt;BR /&gt;done;&lt;BR /&gt;&lt;BR /&gt;In this example "processing_script" is the Perl script that greps out the needed values.&lt;BR /&gt;&lt;BR /&gt;That's all.&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 02:41:06 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952581#M290500</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T02:41:06Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952582#M290501</link>
      <description>I have a small doubt...&lt;BR /&gt;&lt;BR /&gt;What is "script_process $FILE &amp;gt;&amp;gt; $OUTPUT", given that you wrote "processing_script" as the Perl script name?&lt;BR /&gt;&lt;BR /&gt;Also, do I have to put the required item in -name?&lt;BR /&gt;&lt;BR /&gt;Can you add comments to your script so that it is a little easier for me?&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 04:23:22 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952582#M290501</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-02-28T04:23:22Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952583#M290502</link>
      <description>Oh :) My mistake.&lt;BR /&gt;&lt;BR /&gt;Yes, I meant that script_process is the processing script written in Perl: it takes one argument (the name of the file to process) and greps out the values, and its output is redirected to the file $OUTPUT.&lt;BR /&gt;You should also give the path to the processing script. The correct version is:&lt;BR /&gt;&lt;BR /&gt;#!/bin/sh&lt;BR /&gt;OUTPUT='./output.txt'&lt;BR /&gt;# start with an empty output file&lt;BR /&gt;cat /dev/null &amp;gt; "$OUTPUT"&lt;BR /&gt;# skip output.txt itself so it is not processed as a report;&lt;BR /&gt;# this loop assumes file names without spaces&lt;BR /&gt;for FILE in `find . -name "*.txt" ! -name "output.txt"`;&lt;BR /&gt;do&lt;BR /&gt;./script_process "$FILE" &amp;gt;&amp;gt; "$OUTPUT"&lt;BR /&gt;done;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;The command find . -name "*.txt" lists the .txt files in the current directory and its subdirectories; you can point it at another directory. It is just an example of how to tell your script which files to process.&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 04:42:19 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952583#M290502</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T04:42:19Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952584#M290503</link>
      <description>I understand what you have told me: that script will get the values (like buffer cache, redo size etc.) from the text file, but for that I have to run the script_process.pl script.&lt;BR /&gt;What I actually need is that Perl script itself, the one that greps the values out of the text file.</description>
      <pubDate>Wed, 28 Feb 2007 05:00:32 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952584#M290503</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-02-28T05:00:32Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952585#M290504</link>
      <description>Addendum:&lt;BR /&gt;If "report source code.txt" is HTML, you must convert it to text. It can be done like this:&lt;BR /&gt;for each table in the HTML document&lt;BR /&gt;    match the string that contains the entire table&lt;BR /&gt;    eliminate the opening cell and row tags, and replace the closing cell and row tags with "\t" and "\n" respectively. Of course, also cut off the &amp;lt;TABLE&amp;gt;&amp;lt;/TABLE&amp;gt; tag pair itself.</description>
      <pubDate>Wed, 28 Feb 2007 05:03:39 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952585#M290504</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T05:03:39Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952586#M290505</link>
      <description>Can you modify my Perl script (attached) to accept an HTML file as an argument?&lt;BR /&gt;Then it will be easy for me: I can parse any HTML file just by passing it as an argument.</description>
      <pubDate>Wed, 28 Feb 2007 05:38:32 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952586#M290505</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-02-28T05:38:32Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952587#M290506</link>
      <description>This is a sample script. I am not strong in HTML::TokeParser, so I used regexps to get rid of the HTML.&lt;BR /&gt;&lt;BR /&gt;#!/usr/local/bin/perl&lt;BR /&gt;&lt;BR /&gt;#open the file named on the command line&lt;BR /&gt;open(SRC, '&amp;lt;', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";&lt;BR /&gt;#set line delimiter to undef&lt;BR /&gt;#so we can treat the whole file as one string&lt;BR /&gt;$/ = undef;&lt;BR /&gt;#read data&lt;BR /&gt;my $data = &amp;lt;SRC&amp;gt;;&lt;BR /&gt;#close file&lt;BR /&gt;close SRC;&lt;BR /&gt;&lt;BR /&gt;#take the table part: from the first opening table tag to the last closing one&lt;BR /&gt;if ($data =~ /(&amp;lt;table.*&amp;lt;\/table&amp;gt;)/is) { $data = $1; }&lt;BR /&gt;&lt;BR /&gt;#get rid of html&lt;BR /&gt;$data =~ s/&amp;lt;p\b[^&amp;gt;]*&amp;gt;/\n/ig;&lt;BR /&gt;#label empty header cells&lt;BR /&gt;$data =~ s/&amp;gt;&amp;lt;\/th&amp;gt;/&amp;gt;Column\t/ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/th&amp;gt;/\t/ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/TD&amp;gt;&amp;lt;\/TR&amp;gt;//ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/td&amp;gt;/\t/ig;&lt;BR /&gt;#strip every remaining tag&lt;BR /&gt;$data =~ s/&amp;lt;[^&amp;gt;]+&amp;gt;//g;&lt;BR /&gt;$data =~ s/\x20{2,}/\t/g;&lt;BR /&gt;$data =~ s/&amp;amp;nbsp;/\t/ig;&lt;BR /&gt;$data =~ s/\t{2,}/\t/g;&lt;BR /&gt;&lt;BR /&gt;#for example we want redo size&lt;BR /&gt;if ($data =~ /redo size:\s{1,}([\d\.\,]{1,})\s{1,}([\d\.\,]{1,}).*/is) {&lt;BR /&gt;   #output result&lt;BR /&gt;   print $1, "\t", $2, "\n";&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;open SRC, "$ARGV[0]";&lt;BR /&gt;$/= undef;&lt;BR /&gt;$data=&amp;lt;SRC&amp;gt;;&lt;BR /&gt;close SRC;&lt;BR /&gt;$html = HTML::TokeParser-&amp;gt;new(\$data);&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 06:38:19 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952587#M290506</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T06:38:19Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952588#M290507</link>
      <description>Correction:&lt;BR /&gt;The sample script ends after the "output result" print statement.&lt;BR /&gt;&lt;BR /&gt;These lines:&lt;BR /&gt;open SRC, "$ARGV[0]";&lt;BR /&gt;$/= undef;&lt;BR /&gt;$data=&amp;lt;SRC&amp;gt;;&lt;BR /&gt;close SRC;&lt;BR /&gt;$html = HTML::TokeParser-&amp;gt;new(\$data);&lt;BR /&gt;are an example of how to read a file into a string and create a parser object over that string. Note the reference, \$data: HTML::TokeParser treats a plain scalar argument as a file name, and the script also needs "use HTML::TokeParser;" at the top.&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 06:45:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952588#M290507</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T06:45:14Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952589#M290508</link>
      <description>I have used:&lt;BR /&gt;&lt;BR /&gt;#!/usr/local/bin/perl&lt;BR /&gt;use strict;&lt;BR /&gt;use HTML::TokeParser;&lt;BR /&gt;&lt;BR /&gt;Then I ran it as:&lt;BR /&gt;&lt;BR /&gt;perl html_parse.pl html&lt;BR /&gt;&lt;BR /&gt;where the script name is "html_parse.pl"&lt;BR /&gt;and "html" is the name of my report file.&lt;BR /&gt;&lt;BR /&gt;It still gives a compilation error. Please make the required change in your script to avoid the error.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Error:&lt;BR /&gt;Global symbol "$html" requires explicit package name at html_parse.pl line 49.&lt;BR /&gt;Execution of html_parse.pl aborted due to compilation errors.&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 06:57:11 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952589#M290508</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-02-28T06:57:11Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952590#M290509</link>
      <description>Look at my previous message.&lt;BR /&gt;The working script is:&lt;BR /&gt;&lt;BR /&gt;#!/usr/local/bin/perl&lt;BR /&gt;&lt;BR /&gt;#open the file named on the command line&lt;BR /&gt;open(SRC, '&amp;lt;', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";&lt;BR /&gt;#set line delimiter to undef&lt;BR /&gt;#so we can treat the whole file as one string&lt;BR /&gt;$/ = undef;&lt;BR /&gt;#read data&lt;BR /&gt;my $data = &amp;lt;SRC&amp;gt;;&lt;BR /&gt;#close file&lt;BR /&gt;close SRC;&lt;BR /&gt;&lt;BR /&gt;#take the table part: from the first opening table tag to the last closing one&lt;BR /&gt;if ($data =~ /(&amp;lt;table.*&amp;lt;\/table&amp;gt;)/is) { $data = $1; }&lt;BR /&gt;&lt;BR /&gt;#get rid of html&lt;BR /&gt;$data =~ s/&amp;lt;p\b[^&amp;gt;]*&amp;gt;/\n/ig;&lt;BR /&gt;#label empty header cells&lt;BR /&gt;$data =~ s/&amp;gt;&amp;lt;\/th&amp;gt;/&amp;gt;Column\t/ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/th&amp;gt;/\t/ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/TD&amp;gt;&amp;lt;\/TR&amp;gt;//ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/td&amp;gt;/\t/ig;&lt;BR /&gt;#strip every remaining tag&lt;BR /&gt;$data =~ s/&amp;lt;[^&amp;gt;]+&amp;gt;//g;&lt;BR /&gt;$data =~ s/\x20{2,}/\t/g;&lt;BR /&gt;$data =~ s/&amp;amp;nbsp;/\t/ig;&lt;BR /&gt;$data =~ s/\t{2,}/\t/g;&lt;BR /&gt;&lt;BR /&gt;#for example we want redo size&lt;BR /&gt;if ($data =~ /redo size:\s{1,}([\d\.\,]{1,})\s{1,}([\d\.\,]{1,}).*/is) {&lt;BR /&gt;   #output result&lt;BR /&gt;   print $1, "\t", $2, "\n";&lt;BR /&gt;}&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 07:04:48 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952590#M290509</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T07:04:48Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952591#M290510</link>
      <description>OK, now your script gives the correct value of "redo size" from the HTML report. But how do I get the Buffer Cache value or Memory Usage % (i.e. the other values)? I changed the $data variable, but it's not working.&lt;BR /&gt;You have defined the method for getting "redo size", but it's not valid for other parameters (the tags are different in different cases), so those values can't be obtained.&lt;BR /&gt;So how do I make a generalised script? I can't run a different script for each parameter. There should be only one script by which the different parameter values can be obtained.</description>
      <pubDate>Wed, 28 Feb 2007 07:13:17 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952591#M290510</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-02-28T07:13:17Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952592#M290511</link>
      <description>You don't have to write a separate script: just add regexps to match the other parameters and add print statements to output them. Of course it will be lengthy, since you must write one for every value you need.&lt;BR /&gt;&lt;BR /&gt;Find this match in my script:&lt;BR /&gt;$data =~ /redo size:\s{1,}([\d\.\,]{1,})\s{1,}([\d\.\,]{1,}).*/is&lt;BR /&gt;&lt;BR /&gt;The block before it eliminates the HTML, so by the time execution reaches this line the variable $data contains plain text, and you just have to write expressions to match the other values.&lt;BR /&gt;&lt;BR /&gt;The redo size example tells the regexp engine:&lt;BR /&gt;find the words "redo size:"&lt;BR /&gt;after these words there will be some whitespace&lt;BR /&gt;then a sequence of digits, commas and dots&lt;BR /&gt;then whitespace again&lt;BR /&gt;then another sequence of digits, commas and dots&lt;BR /&gt;I have enclosed each sequence of digits, commas and dots in round brackets. This means the matched text goes to the predefined Perl variables $1, $2 and so on: the first sequence to $1 and the second to $2.&lt;BR /&gt;&lt;BR /&gt;For more, look up how to extract matches with Perl; google it and you'll find a lot about this.&lt;BR /&gt;&lt;BR /&gt;In the script you can use a variable to hold the results:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;#!/usr/local/bin/perl&lt;BR /&gt;&lt;BR /&gt;open(SRC, '&amp;lt;', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";&lt;BR /&gt;$/ = undef;&lt;BR /&gt;my $data = &amp;lt;SRC&amp;gt;;&lt;BR /&gt;close SRC;&lt;BR /&gt;&lt;BR /&gt;#take the table part: from the first opening table tag to the last closing one&lt;BR /&gt;if ($data =~ /(&amp;lt;table.*&amp;lt;\/table&amp;gt;)/is) { $data = $1; }&lt;BR /&gt;&lt;BR /&gt;#get rid of html&lt;BR /&gt;$data =~ s/&amp;lt;p\b[^&amp;gt;]*&amp;gt;/\n/ig;&lt;BR /&gt;#label empty header cells&lt;BR /&gt;$data =~ s/&amp;gt;&amp;lt;\/th&amp;gt;/&amp;gt;Column\t/ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/th&amp;gt;/\t/ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/TD&amp;gt;&amp;lt;\/TR&amp;gt;//ig;&lt;BR /&gt;$data =~ s/&amp;lt;\/td&amp;gt;/\t/ig;&lt;BR /&gt;#strip every remaining tag&lt;BR /&gt;$data =~ s/&amp;lt;[^&amp;gt;]+&amp;gt;//g;&lt;BR /&gt;$data =~ s/\x20{2,}/\t/g;&lt;BR /&gt;$data =~ s/&amp;amp;nbsp;/\t/ig;&lt;BR /&gt;$data =~ s/\t{2,}/\t/g;&lt;BR /&gt;&lt;BR /&gt;#collect the values here; a value that is not found becomes an empty field&lt;BR /&gt;my @result;&lt;BR /&gt;&lt;BR /&gt;#match redo size: Per Second, Per Transaction&lt;BR /&gt;if ($data =~ /redo size:\s{1,}([\d\.\,]{1,})\s{1,}([\d\.\,]{1,}).*/is) { push @result, $1, $2; } else { push @result, "", ""; }&lt;BR /&gt;#match Soft Parse %&lt;BR /&gt;if ($data =~ /Soft Parse %:\s{1,}([\d\.\,]{1,}).*/is) { push @result, $1; } else { push @result, ""; }&lt;BR /&gt;&lt;BR /&gt;#and so on:&lt;BR /&gt;#add matching for other values here&lt;BR /&gt;&lt;BR /&gt;#output result as one tab-separated line&lt;BR /&gt;print join("\t", @result), "\n";&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 07:45:54 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952592#M290511</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T07:45:54Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952593#M290512</link>
      <description>First of all, many thanks for your interest and constant help.&lt;BR /&gt;&lt;BR /&gt;I get your point, but:&lt;BR /&gt; 1) Using your script is lengthy (that's OK, no problem), but setting up patterns for every value in every table is not good programming practice.&lt;BR /&gt;&lt;BR /&gt; 2) Can we print the "redo size" values from 20 HTML files at once?&lt;BR /&gt;That is the main requirement; only then can I compare the values across the different reports.&lt;BR /&gt;I have to get the "redo size" values from all the HTML reports in the output by running the script.&lt;BR /&gt;&lt;BR /&gt;So please look into the matter.</description>
      <pubDate>Wed, 28 Feb 2007 08:00:59 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952593#M290512</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-02-28T08:00:59Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952594#M290513</link>
      <description>1) Of course it is not good programming practice, but it works. Anyway, you have no choice: the tables are not the same, so you must explicitly specify what to get and how to name it.&lt;BR /&gt;&lt;BR /&gt;2) I have been trying to tell you this from the beginning; I have been looking into the matter all along :)&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;You have a script that takes the specified values out of a specified file, so run this script for every file in the set and save the results somewhere. That way you will have your redo size extracted from 20 files "simultaneously" :) (strictly speaking "serially" :) but in one place).&lt;BR /&gt;You can write a batch script to do this,&lt;BR /&gt;like this:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;#!/bin/sh&lt;BR /&gt;OUTPUT='./output.txt'&lt;BR /&gt;# start with an empty output file&lt;BR /&gt;cat /dev/null &amp;gt; "$OUTPUT"&lt;BR /&gt;# skip output.txt itself so it is not processed as a report;&lt;BR /&gt;# this loop assumes file names without spaces&lt;BR /&gt;for FILE in `find . -name "*.txt" ! -name "output.txt"`;&lt;BR /&gt;do&lt;BR /&gt;./script_process "$FILE" &amp;gt;&amp;gt; "$OUTPUT"&lt;BR /&gt;done;&lt;BR /&gt;&lt;BR /&gt;After it completes, the file output.txt will contain the values from the files.&lt;BR /&gt;I wrote it in shell, but this batch script can be written in Perl too.&lt;BR /&gt;&lt;BR /&gt;The idea is this: write one script to process one file, and a second script that calls the first one for every file in the set.&lt;BR /&gt;</description>
      <pubDate>Wed, 28 Feb 2007 08:25:44 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952594#M290513</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-02-28T08:25:44Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952595#M290514</link>
      <description>Can you just tell me how to run the scripts in sequence?&lt;BR /&gt;Tell me the command-line statements.&lt;BR /&gt;I think I first have to write a Perl script named "script_process.pl",&lt;BR /&gt;then write the shell script named "file_process.sh".&lt;BR /&gt;Then, if the name of the report file is report.html,&lt;BR /&gt;just tell me how to execute them one by one to get the correct answer.</description>
      <pubDate>Thu, 01 Mar 2007 01:37:00 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952595#M290514</guid>
      <dc:creator>Dodo_5</dc:creator>
      <dc:date>2007-03-01T01:37:00Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952596#M290515</link>
      <description>I have already told you what to do.&lt;BR /&gt;If you did not get it, let's try another way.&lt;BR /&gt;Suppose you have 3 reports:&lt;BR /&gt;report1.txt&lt;BR /&gt;report2.txt&lt;BR /&gt;report3.txt&lt;BR /&gt;Then you run:&lt;BR /&gt;&lt;BR /&gt;# create an empty output.txt&lt;BR /&gt;cat /dev/null &amp;gt; output.txt&lt;BR /&gt;./script_process.pl  report1.txt &amp;gt;&amp;gt; output.txt&lt;BR /&gt;./script_process.pl  report2.txt &amp;gt;&amp;gt; output.txt&lt;BR /&gt;./script_process.pl  report3.txt &amp;gt;&amp;gt; output.txt&lt;BR /&gt;&lt;BR /&gt;After that, each line in output.txt will contain the values extracted from one reportN.txt.&lt;BR /&gt;The second script, the one you called file_process.sh, does just that: it finds the files and calls script_process.pl for each report file, one by one. Read more about Perl and shell. Then open output.txt in Excel or another spreadsheet program that understands tab-delimited files and compare whatever you wish, build charts and so on.</description>
      <pubDate>Thu, 01 Mar 2007 02:07:15 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952596#M290515</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-03-01T02:07:15Z</dc:date>
    </item>
    <item>
      <title>Re: PERL for HTML file parsing</title>
      <link>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952597#M290516</link>
      <description>Read the scripts I have given you attentively and try to understand what each script does. Don't just copy them blindly.</description>
      <pubDate>Thu, 01 Mar 2007 02:12:12 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-hp-ux/perl-for-html-file-parsing/m-p/3952597#M290516</guid>
      <dc:creator>Maxim Yakimenko</dc:creator>
      <dc:date>2007-03-01T02:12:12Z</dc:date>
    </item>
  </channel>
</rss>

