09-14-2011 03:53 AM - edited 09-14-2011 04:18 AM
Help with perl
I'm new to Perl scripting and I need to limit the String in the following pattern in an XML file to 60 characters:
<CompanyName>String</CompanyName>
But perl doesn't seem to recognize the instr(big, little) function:
#!/usr/bin/perl
use strict;
use warnings;
my @a = ();
my @b = ();
my @c = ();
my @d = ();
while (<>) {
    if (m{<CompanyName>}..m{</CompanyName>}) {
        if (m{</?CompanyName>}) {
            push @a, instr($_,">");
            push @b, instr($_,"</");
            push @c, substr(substr($_,@a+1,@b - @a-1),0,60);
            push @d, substr($_,0,@a)||@c||substr($_,@b,14);
            print @d;
            @a = ();
            @b = ();
            @c = ();
            @d = ();
            next;
        }
        print;
    }
    print;
}
1;
Eric
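A note on the error above: Perl has no built-in instr. The closest core function is index(STR, SUBSTR), which returns the zero-based position of the first match, or -1 when there is none. A minimal sketch of the same idea with index and substr, assuming the whole element sits on one line; the helper name truncate_company and the sample input are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: shorten the text between <CompanyName> and
# </CompanyName> to at most $max characters using index() and substr().
sub truncate_company {
    my ($line, $max) = @_;
    my $open  = '<CompanyName>';
    my $close = '</CompanyName>';

    my $start = index($line, $open);        # -1 when the tag is absent
    return $line if $start < 0;
    $start += length $open;                 # first character of the content

    my $end = index($line, $close, $start); # search from the content onwards
    return $line if $end < 0;               # closing tag not on this line

    my $content = substr($line, $start, $end - $start);
    return $line if length($content) <= $max;

    # 4-argument substr replaces the content in place (in our copy of the line)
    substr($line, $start, $end - $start, substr($content, 0, $max));
    return $line;
}

# Sample input: 70 characters of content, cut down to 60
print truncate_company("<CompanyName>" . ("x" x 70) . "</CompanyName>\n", 60);
```

Lines without the tag, or with content that is already short enough, come back unchanged.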
09-14-2011 04:38 AM
Re: Help with perl
Now, it erases the entire pattern:
#!/usr/bin/perl
use strict;
use warnings;
my @a = ();
#my @b = ();
#my @c = ();
#my @d = ();
while (<>) {
    if (m{<CompanyName>}..m{</CompanyName>}) {
        if (m{<CompanyName>}) {
            push @a, $_;
            next;
        }
        if (m{>}+1..m{</-1}) {
            push @a, substr($_,0,60);
            next;
        }
        if (m{</CompanyName>}) {
            push @a, $_;
            next;
        }
        print @a;
        @a = ();
        next;
    }
    print;
}
1;
09-14-2011 04:45 AM
Re: Help with perl
Now, it is almost working but it still doesn't limit the string to 60 characters:
#!/usr/bin/perl
use strict;
use warnings;
#my @d = ();
while (<>) {
    if (m{<CompanyName>}..m{</CompanyName>}) {
        if (m{<CompanyName>}) {
            print;
            next;
        }
        if (m{>}+1..m{</-1}) {
            print substr($_,0,60);
            next;
        }
        if (m{</CompanyName>}) {
            print;
            next;
        }
    }
    print;
}
1;
09-14-2011 05:34 AM - edited 09-14-2011 05:35 AM
Re: Help with perl
Why jump through difficult hoops?
while (<>) {
    s{(<(CompanyName)>)(.{0,60}).*?</\1>}{<$1>$2</$1>};
}
If the content for this tag is longer than 60 characters, truncate it to 60.
(/me still thinks you should use XML::Parser)
09-14-2011 09:05 AM
Re: Help with perl
Hi Merijn,
With your last script I get an empty file.
This is working, but only for the first occurrence:
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
    if (m{<CompanyName>}..m{</CompanyName>}) {
        if (m{</?CompanyName>}) {
            if (length($_) gt 87) {
                print substr($_,0,73);
                print "</CompanyName>\n";
                next;
            }
            print;
            next;
        }
    }
    print;
}
1;
Eric
09-14-2011 10:19 AM
Re: Help with perl
@Eric Antunes wrote:
Hi Merijn,
With your last script I get an empty file.
Hi Eric:
Try this:
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
    s{(<(CompanyName)>)(.{0,60}).*?</\2>}{<$1>$3</$2>};
    print;
}
1;
Regards!
...JRF...
09-14-2011 01:52 PM
Re: Help with perl
*I* was just "missing" the print. *You* overcomplicate the regex and generate invalid XML :)
$1 already includes < and >, so you'll end up with
<<CompanyName>>Whatever</CompanyName>
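The doubled brackets can be reproduced with a short check. This is only a sketch: the sample line is made up, and the substitution is the one from the post being answered.

```perl
use strict;
use warnings;

my $line = "<CompanyName>Whatever</CompanyName>";

# With the extra pair of parens, $1 is "<CompanyName>" (brackets included),
# $2 is "CompanyName" and $3 is the content, cut at 60 characters.
# The replacement then wraps $1 in a second pair of angle brackets.
(my $result = $line) =~ s{(<(CompanyName)>)(.{0,60}).*?</\2>}{<$1>$3</$2>};

print "$result\n";   # <<CompanyName>>Whatever</CompanyName>
```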
09-14-2011 04:52 PM
Re: Help with perl
@H.Merijn Brand (procura wrote:
*I* was just "missing" the print. *You* overcomplicate the regex and generate invalid XML :)
$1 already includes < and >, so you'll end up with
<<CompanyName>>Whatever</CompanyName>
Yes, my friend, I missed the doubled angle brackets :-( and needlessly complicated the regex :-((
Yes, too, the missing print was obvious.
BUT, your original version did not limit the string :-(
You had:
s{(<(CompanyName)>)(.{0,60}).*?</\1>}{<$1>$2</$1>};
whereas I should have used:
s{<(CompanyName)>(.{0,60}).*?</\1>}{<$1>$2</$1>};
Regards!
...JRF...
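Why the first version did not limit the string, as far as I can tell: with the extra parens, \1 is the outer group, "<CompanyName>", so the tail of the pattern looks for the literal text "</<CompanyName>>", never finds it, and the substitution simply does not happen. A small sketch comparing both patterns on a made-up line with 70 characters of content:

```perl
use strict;
use warnings;

my $line = "<CompanyName>" . ("x" x 70) . "</CompanyName>";

# First version: \1 is the OUTER group, "<CompanyName>", so the pattern
# ends up looking for "</<CompanyName>>" and never matches.
my $extra_parens = $line;
my $hits_extra = ($extra_parens =~ s{(<(CompanyName)>)(.{0,60}).*?</\1>}{<$1>$2</$1>}) || 0;

# Corrected version: \1 is just "CompanyName".
my $fixed = $line;
my $hits_fixed = ($fixed =~ s{<(CompanyName)>(.{0,60}).*?</\1>}{<$1>$2</$1>}) || 0;

# Pull the content back out to measure it
my ($content) = $fixed =~ m{<CompanyName>(.*)</CompanyName>};

print "extra parens: $hits_extra substitution(s)\n";
print "fixed:        $hits_fixed substitution(s), content is ",
      length($content), " characters\n";
```

s{}{} returns the number of substitutions made (an empty string when there were none), which is why the result is forced to 0 for printing.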
09-14-2011 11:12 PM
Re: Help with perl
That is what one gets if not testing code :/
I indeed obviously had one pair of parens too many.
For completeness' sake (we both made too many simple mistakes), here is the full version:
$ cat modify.pl
use strict;
use warnings;

while (<>) {
    s{<(CompanyName)>(.{0,60}).*?</\1>}{<$1>$2</$1>};
    # other modifications here
    print;
}
$ perl -wc modify.pl
modify.pl syntax OK
$ perl modify.pl myfile.xml > modified.xml
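One caveat, offered as a sketch rather than a correction: without the g modifier the substitution changes only the first element on each line. If a file can carry several CompanyName elements on one physical line (an assumption, the thread does not show the real data), adding g handles all of them. The input line below is made up:

```perl
use strict;
use warnings;

# Hypothetical input: two elements on one physical line, 65 characters each
my $line = "<CompanyName>" . ("a" x 65) . "</CompanyName>"
         . "<CompanyName>" . ("b" x 65) . "</CompanyName>";

# Without /g only the first element is shortened
(my $first_only = $line) =~ s{<(CompanyName)>(.{0,60}).*?</\1>}{<$1>$2</$1>};

# With /g every element on the line is shortened
(my $all = $line) =~ s{<(CompanyName)>(.{0,60}).*?</\1>}{<$1>$2</$1>}g;

print "original:   ", length($line),       " characters\n";
print "first only: ", length($first_only), " characters\n";
print "all (/g):   ", length($all),        " characters\n";
```

Each shortened element loses 5 characters, so the three lengths differ by 5 and 10 from the original.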
09-15-2011 07:49 AM
Re: Help with perl
Exactly Merijn, you just posted the right script.
Although I didn't understand the s{} part, it worked wonderfully!
But I will try to understand it.
Thank you,
Eric
09-15-2011 08:16 AM
Re: Help with perl
lemme (try to) explain:
s{<(CompanyName)>(.{0,60}).*?</\1>}{<$1>$2</$1>};
make that more readable and still legal:
s{ <(CompanyName)>     # Search for the opening tag (keep tag name in $1)
   (.{0,60}) .*?       # Keep 0 to 60 characters in $2, ignore rest to
   </\1>               # The closing tag (\1 == $1 in the match part)
 }{<$1>$2</$1>}x;      # Replacement pattern
Everything between parens is "captured". The first capture goes to $1, the next to $2, etc. If captures are nested, the outermost capture gets the lowest index: the index of a capture is the position of its opening paren, counted from the left. (Unless you use (?|...) in newer perls, but we do not use that here.)
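The numbering rule can be seen with nested parens. A small sketch on a made-up line; in list context a match returns its captures in index order:

```perl
use strict;
use warnings;

# Groups are numbered by the position of their opening paren.
my @groups = "<CompanyName>Acme</CompanyName>"
    =~ m{(<(CompanyName)>)(.*?)</\2>};

print "\$1 = $groups[0]\n";   # <CompanyName>  (outer group opens first)
print "\$2 = $groups[1]\n";   # CompanyName    (nested group)
print "\$3 = $groups[2]\n";   # Acme
```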
So after "<CompanyName>" matched, $1 now contains "CompanyName".
The next line captures .{0,60}, which means "any character, between 0 and 60 times". The pattern .*? is a non-greedy match on any number of characters: it stops as soon as the next part of the pattern can match, whereas .* without the ? would greedily take as much as it can.
As that part is not in parens, it is just forgotten.
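The difference between greedy and non-greedy only shows when the closing tag can be found more than once. A sketch with a made-up line holding two short elements:

```perl
use strict;
use warnings;

my $line = "<a>one</a><a>two</a>";

my ($greedy) = $line =~ m{<a>(.*)</a>};    # runs on to the LAST </a>
my ($lazy)   = $line =~ m{<a>(.*?)</a>};   # stops at the FIRST </a>

print "greedy: $greedy\n";   # one</a><a>two
print "lazy:   $lazy\n";     # one
```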
The last part of the match is matching </\1> where \1 is the content of $1. We cannot use $1 there, as we are still inside the matching part. </\1> in this case is essentially the same as matching on </CompanyName>, which is more typing and more error-prone.
After the closing } of the match, the substitution pattern puts it all together again. The x after the last } lets us split the matching pattern over several lines and add whitespace and comments.
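One thing to watch with x, shown here as a sketch on a made-up string: whitespace in the pattern is ignored, so a space that has to match literally must be escaped or put in a character class. The x only affects the matching pattern, not the replacement part.

```perl
use strict;
use warnings;

my $text = "Company Name";

# Without /x the space in the pattern is a literal space.
my $plain = $text =~ m{Company Name} ? 1 : 0;

# With /x the space is ignored, so this looks for "CompanyName".
my $x_naive = $text =~ m{Company Name}x ? 1 : 0;

# A space inside a character class still counts under /x.
my $x_escaped = $text =~ m{Company [ ] Name}x ? 1 : 0;

print "plain: $plain, x_naive: $x_naive, x_escaped: $x_escaped\n";
```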