Re: Performance improvement on perl script
11-07-2010 11:18 AM
I have a perl script that reads an Oracle loader log file and extracts the feedfile to generate an output file. The script is attached. As long as the feedfile and logfile are of limited size, it works fine. However, when the feedfile is very large, around 25GB or 30GB, it gives an out of memory error.
The details are:
ops_generate_exception.sh takes the logfile as a parameter.
Within the shell script, the perl script process_exception.pl is called with the formatted logfile and feedfile as parameters to generate an output file called OR-C-DR02-RB002-01-8-HST-EST-20101107-01.
Possibly the perl script reads the entire feedfile into memory, which is why it gives out of memory in the prod environment. Need help to fine-tune the perl script please.
Thanks,
Srikanth A
Solved! Go to Solution.
- Tags:
- Perl
11-07-2010 02:57 PM
Solution
> when the feedfile is very large in terms of 25GB or 30GB, it gives out of memory error.
So it sounds like you are slurping (reading) the whole file into memory.
Please provide your attachment as a simple text one, not as a "*.rar" archive, which is non-standard.
Regards!
...JRF...
11-07-2010 03:27 PM
Re: Performance improvement on perl script
For very large files, I would try the following options (if you cannot rewrite it in C):
a) Use the Tie::File module. The file is not loaded into memory, so it should work even for "gigantic" files.
b) Split your input file into smaller files and process them individually. The only downside is that you need more temporary disk space.
There are other possibilities too, but I would start with the above ones.
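Option (a) might look like the sketch below. Everything here is illustrative: the sample log file and its contents are stand-ins for the real multi-GB loader log.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

# Stand-in for the real 25-30GB loader log (name and contents illustrative).
my $logfile = 'loader_sample.log';
open( my $fh, '>', $logfile ) or die "Cannot create '$logfile': $!\n";
print $fh "Record 1: Rejected\nORA-01400: cannot insert NULL\nRecord 2: Rejected\n";
close($fh);

# Tie the file to an array: each line is fetched from disk on demand,
# so even a gigantic file never has to fit in memory at once.
tie my @lines, 'Tie::File', $logfile, mode => O_RDONLY
    or die "Cannot tie '$logfile': $!\n";

my $count = grep { /^Record\s+\d+:/ } @lines;
print "Found $count record headers\n";

untie @lines;
unlink $logfile;
```

Note that Tie::File trades speed for memory: random access into a huge file can be slow, so for strictly sequential processing a plain read loop is usually faster.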
Cheers,
VK2COT
11-07-2010 08:50 PM
Re: Performance improvement on perl script
Yeah, use ZIP if you must, but plain TEXT is preferred by most, I guess.
RAR is an alternative archive format to ZIP.
I could see/extract the contents on Windows using the free Stuffit expander:
http://www.stuffit.com/win-expander-download.html
JRF> So it sounds like you are slurping (reading) the whole file into memory.
Good guess... Here is the core:
-----
{
local $/ = undef;
@pieces = split( /(?=Record\s+\d+:)/, <> );
}
for $line (@pieces) {
if ( $line =~ m{(Record\s+(\d+)).+?(ORA.+?)\s}s ) {
print $1, "|", $3, "|", $lookup[ $2 - 1 ], "|",$up_amt[ $2 - 1 ],"\n";
}
}
----
From the Perl documentation (perlvar):
"You may set it ($/ = $RS = $INPUT_RECORD_SEPARATOR in English) to a multi-character string to match a multi-character terminator, or to undef to read through the end of file."
So just change that to a single loop processing records at a time.
A little more programming, infinitely more scaling.
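For instance, the slurp-and-split could become a loop that reads one record-sized chunk per iteration. This is only a sketch: the sample file, its two dummy records, and the blank-line record terminator are assumptions about the log layout.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tiny sample log (stand-in for the real file); each record ends in a blank line.
my $log = 'sample.log';
open( my $out, '>', $log ) or die "Cannot create '$log': $!\n";
print $out "Record 1: Rejected\nORA-01400: x\n\nRecord 2: Rejected\nORA-01722: y\n\n";
close($out);

my @hits;
{
    local $/ = "\n\n";    # multi-character terminator: one record per read
    open( my $in, '<', $log ) or die "Cannot open '$log': $!\n";
    while ( my $chunk = <$in> ) {
        push @hits, "$1|$3" if $chunk =~ m{(Record\s+(\d+)).+?(ORA-\d+)}s;
    }
    close($in);
}
print "$_\n" for @hits;
unlink $log;
```

Only the matching record header and ORA error survive into @hits, and memory use stays bounded by the size of the largest single record rather than the whole file.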
If you want to tease readers here into helping with that, then you may want to provide a TEXT :-) attachment with a handful of lines from such a log file, containing an example of a record being searched for, and a few lines before and after for good measure.
Btw... why process the data 3 times over?
1) strip blank lines
2) Look for special record
3) perl.
Perl can do that all in one sweep.
Finally... a pet-peeve of mine:
feedfile=`cat ${formatted_logfile}|grep "Data File"|awk -F':' '{print $2}'`
Yuck! It will be a moot point when you teach the perl script to do it all, but as a general concept why involve CAT and GREP when AWK can do it all?
Carpenters... Learn to use your tools!
feedfile=$(awk -F':' '/Data File/{print $2}' ${formatted_logfile})
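And if that step is folded into the perl script itself, the extraction could be sketched as follows. The "Data File:" line format is inferred from the awk/grep command above; the sample lines and path are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample formatted-logfile lines (format inferred from the awk -F':' pattern).
my @log_lines = (
    "Control File:  loader.ctl\n",
    "Data File:  /data/feeds/feedfile.dat\n",
);

# Equivalent of: awk -F':' '/Data File/{print $2}'
my $feedfile;
for (@log_lines) {
    if (/Data File\s*:\s*(\S+)/) {
        $feedfile = $1;
        last;
    }
}
print "feedfile=$feedfile\n";
```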
Good luck!
Hein
11-08-2010 03:25 AM
Re: Performance improvement on perl script
I am attaching each file separately instead of a RAR file. If I cannot send them in one message, I will do it in multiple messages. Please help me in this regard.
Thanks,
Srikanth
11-08-2010 03:31 AM
Re: Performance improvement on perl script
In my earlier mail, I sent the perl script. I will now attach the logfile (passed as the first parameter to the script).
Thanks,
Srikanth A
11-08-2010 03:34 AM
Re: Performance improvement on perl script
This is the last attachment. Herewith I am attaching the feedfile (I have taken only 4 records of the entire feedfile to avoid oversizing). This is passed as the second parameter to the perl script.
Thanks,
Srikanth
11-08-2010 04:13 AM
Re: Performance improvement on perl script
Well it looks like it was *me* who wrote the original version of the script that doesn't scale :-(
http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1435953
Regards!
...JRF...
11-08-2010 05:03 AM
Re: Performance improvement on perl script
I do understand that it was your bit of code. Since I started using it, I have become quite attached to it, and now I am trying to get out of the lion's den. Please show your lion-hearted skills and save me from the trap.
Thanks,
Srikanth
11-08-2010 05:39 AM
Re: Performance improvement on perl script
A quick look back at your original thread indicated that the logfile could be "chunked" into paragraphs divided by blank lines. Given this, the original script can be rewritten as:
# cat ./myfilter
#!/usr/bin/perl
use strict;
use warnings;

# Build the lookup table from the feedfile, one line at a time.
my @lookup;
{
    my $feedfile = 'myfeed';
    open( my $fh, '<', $feedfile ) or die "Can't open '$feedfile': $!\n";
    while (<$fh>) {
        chomp;
        push( @lookup, substr( $_, 0, 14 ) );
    }
    close($fh);
}

# Read the logfile in paragraph mode: one blank-line-delimited chunk
# per iteration, so the whole file is never held in memory.
{
    local $/ = "";
    while (<>) {
        if ( m{(Record\s+(\d+)).+?(ORA.+?)\sSQL}s ) {
            print $1, "|", $3, "|", $lookup[ $2 - 1 ], "\n";
        }
    }
}
Regards!
...JRF...