Extracting data in between strings...using perl

Honored Contributor

Re: Extracting data in between strings...using perl

What Hein said, but parsing HTML with regexes is still NOT safe.

Have a look at the HTML::TreeBuilder module.

--8<--- test.pl
#!/pro/bin/perl

use strict;
use warnings;

use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new;

my $content = <<'EOH';

This page is used to hold your data while you are being authorized for your
request.

You will be forwarded to continue the authorization process. If
this does not happen automatically, please click the Continue button below.
STARThttps://test.ip.com/siteminderagent/forms/login.fcc?TYPE=33554433&REALMOID=06-00034bb7-e037-116f-8241-808d67a50008&GUID=&SMAUTHREASON=0&METHOD=POST&SMAGENTNAME=$SM$8OJVwItP%2fV8GXRhL%2fhch6KJt3EvC2AWLQ7%2bWLfTgx3%2bWD7k%2buJc3dVSFPOr1jTxg&TARGET=$SM$%2fEND="HIDDEN"
NAME="SMPostPreserve"
VALUE="S1NJbjNmby81VzRqMmo0cTNuWm9NdFo3cVpZSlF6enpMc2laNWZrcnRudlhWVEUzM0xUTHVPR1Y3REpwNnUwM1ZVd1IySFdQZkRDRmpUQldrV01ybk9pcEFBZnpzNmg4RG1yQ0lRQUNzbTFMekdiUG9Eck02M2NUcis4RG5YQ3l2dkZHOGp4WDRPbHJJTFdJOXUvbnFBPT0END="SUBMIT"
VALUE="Continue">

EOH

$tree->parse_content ($content);
# print $tree->as_HTML (undef, " ", {});

foreach my $f ($tree->look_down (_tag => "form")) {
    # print "FORM:\n", $f->as_HTML (undef, " ", {});
    $f->as_HTML (undef, " ", {}) =~ m{\bstart(.*?)end}i and
        print "START in FORM: $1\n";
}
-->8---

# perl test.pl
START in FORM: https://test.ip.com/siteminderagent/forms/login.fcc?type="33554433&REALMOID=06-00034bb7-e037-116f-8241-808d67a50008&GUID=&SMAUTHREASON=0&METHOD=POST&SMAGENTNAME=$SM$8OJVwItP%2fV8GXRhL%2fhch6KJt3EvC2AWLQ7%2bWLfTgx3%2bWD7k%2buJc3dVSFPOr1jTxg&TARGET=$SM$%2f

Enjoy, Have FUN! H.Merijn [ who does not think this is clean HTML ]
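
For anyone who wants to skip the regex step entirely, here is a minimal sketch of reading the form's attributes straight from the parsed tree. The markup, the example.com URL, and the helper name extract_forms are made up for illustration and are not the original page from this thread; it assumes the real page has an ordinary form with an action attribute and hidden input fields.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use HTML::TreeBuilder;

# Hypothetical, simplified markup; NOT the original page from the thread.
my $html = <<'EOH';
<html><body>
<form name="AutoPostBackForm" action="https://example.com/login.fcc?TARGET=%2f" method="POST">
<input type="HIDDEN" name="SMPostPreserve" value="abc123">
<input type="SUBMIT" value="Continue">
</form>
</body></html>
EOH

# Hypothetical helper: returns one hashref per <form>, holding its
# action URL and a hash of its hidden input fields (name => value).
sub extract_forms {
    my ($content) = @_;

    my $tree = HTML::TreeBuilder->new_from_content ($content);
    my @forms;

    foreach my $form ($tree->look_down (_tag => "form")) {
        my %hidden;
        # Match the type attribute case-insensitively (HIDDEN vs hidden).
        foreach my $input ($form->look_down (_tag => "input", type => qr/^hidden$/i)) {
            my $name = $input->attr ("name");
            $hidden{$name} = $input->attr ("value") if defined $name;
        }
        push @forms, { action => $form->attr ("action"), hidden => \%hidden };
    }

    $tree->delete;    # free the tree explicitly
    return @forms;
}

foreach my $form (extract_forms ($html)) {
    print "ACTION: $form->{action}\n" if defined $form->{action};
    print "HIDDEN: $_=$form->{hidden}{$_}\n" for sort keys %{$form->{hidden}};
}
```

The idea is that attr () hands back the attribute value as the parser saw it, so there is no need for START/END markers or for matching against the as_HTML output.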
Super Advisor

Re: Extracting data in between strings...using perl

I wound up using regular expressions, but I am looking forward to messing with TreeBuilder, even though it is very, very difficult.