
Re: Perl script to parse data - Dont know if it can be done.

 
SOLVED
Ratzie
Super Advisor

Perl script to parse data - Dont know if it can be done.

I thought I would throw this out there. I think it cannot be done, but I am
not a guru.

Here is the problem: I get a file from which I must pull the pertinent data.
It has a header and footer, as well as page breaks, all in ASCII format. I
need to pull out just the columns.
I do this all manually (deleting the header, the footer, and all the page
breaks). At times there is also a 0 at the beginning of a record that I do
not want there.
The attachment shows the file... Bear in mind there are more data columns than what is shown.



I manually dissect this file to make it look like this... See file

I have manually removed the header, footer and page breaks. There also
always seems to be a 0 at the start of the first record, so I remove that too.
I then run this Perl script:

while (<>) {
chomp; # Will remove the leading , or new line
s,^\s+,,; #Remove leading spaces
my @cols=split m/\s{2,}/, $_, -1; # Split on two (or more) white space characters
@cols == 2 and splice @cols, 1, 0, "";
print join (',',@cols)."\n";
}

And I get this: WHAT I NEED!
5555002,00 0 04 27,TELN NOT BILL
1555007,00 0 06 00,CUSTOMER HAS
2555010,00 0 12 10,CUSTOMER HAS
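
For readers following along, here is a minimal sketch of what the script above does to a single line. The sample lines are made up (the attachment is not reproduced in this thread) and the helper name to_csv is hypothetical. It shows that splitting on two or more spaces leaves single spaces inside a field alone, and that the splice pads an empty middle column when only two fields are found.

```perl
use strict;
use warnings;

# Turn one whitespace-aligned line into a CSV line (helper name is hypothetical).
sub to_csv {
    my ($line) = @_;
    $line =~ s/^\s+//;                      # drop leading blanks
    my @cols = split m/\s{2,}/, $line, -1;  # single spaces stay inside a field
    splice @cols, 1, 0, "" if @cols == 2;   # pad a missing middle column
    return join ",", @cols;
}

# Made-up sample lines, not taken from the real file.
print to_csv("  5555002    00 0 04 27    TELN NOT BILL"), "\n";
print to_csv("  5555002    TELN NOT BILL"), "\n";
```

The first call should print the three fields separated by commas; the second should print an empty field between the two that were found.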

I want to try to eliminate as much manual intervention as I can.

5 REPLIES
Rodney Hills
Honored Contributor

Re: Perl script to parse data - Dont know if it can be done.

Since the data is in fixed columns, it should be quite easy:

perl -ne 'chomp; if (substr($_,7,7)=~/\d+/) { print join(",",substr($_,7,7),substr($_,39,10),substr($_,63,99)),"\n";}' inputfile

This one-liner identifies the lines you want by checking whether the 7-character TELN column (offset 7) contains digits. If it does, it extracts the data using substr().
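
A minimal sketch of the same substr() idea as a small script. The real attachment is not shown in this thread, so the sample record is hypothetical and is built with sprintf so that the fields land at the offsets used in the one-liner (7, 39 and 63). The helper name extract_fields and the length guard are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical record: 7 blanks, TELN at offset 7, a date-like field at
# offset 39, reason text at offset 63 (layout assumed, not from the real file).
my $line = sprintf "%-7s%-32s%-24s%s", "", "5555002", "00 0 04 27", "TELN NOT BILL";

# Return the three fields as a list, or an empty list for lines to skip.
sub extract_fields {
    my ($rec) = @_;
    return unless length($rec) > 63;            # too short to hold all columns
    return unless substr($rec, 7, 7) =~ /\d/;   # TELN column must contain a digit
    return (substr($rec, 7, 7), substr($rec, 39, 10), substr($rec, 63, 99));
}

print join(",", extract_fields($line)), "\n";
```

A stricter test such as /^\d{7}$/ on the TELN column would reject partly numeric header text, if the real data allows it.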

HTH

-- Rod Hills
There be dragons...
Jean-Luc Oudart
Honored Contributor

Re: Perl script to parse data - Dont know if it can be done.

If I understand, you want the lines that contain a phone number except if the 1st character in the line is 0 ?

What about :
grep -v '^0' | sed -e 's/^ *//' | grep '^[0-9][0-9]'

you can then pipe into awk (or perl) to format the output.
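
As a rough sketch, the same three filtering stages can be written in Perl so that formatting can follow in one script. The helper name filter_line and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Mirror the three pipeline stages for one line; returns the cleaned line or undef.
sub filter_line {
    my ($line) = @_;
    return if $line =~ /^0/;                # first grep: drop lines starting with 0
    $line =~ s/^ *//;                       # sed: strip leading spaces
    return unless $line =~ /^[0-9][0-9]/;   # second grep: must start with two digits
    return $line;
}

# Made-up input lines, not taken from the real attachment.
for my $raw ("   5555002   00 0 04 27", "0  1555007   00 0 06 00", "PAGE 1") {
    my $kept = filter_line($raw);
    print "$kept\n" if defined $kept;
}
```

Note that the second sample line is dropped entirely, because the first stage discards any line starting with 0. If only the 0 should be removed and the record kept, that stage would need to be a substitution instead.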

Regards,
Jean-Luc
fiat lux
H.Merijn Brand (procura
Honored Contributor
Solution

Re: Perl script to parse data - Dont know if it can be done.

I saved your attachment in xx.txt, and then created xx.pl
For fixed columns, pack/unpack is usually the best (sometimes only) way to go:

lt09:/tmp 108 > cat xx.pl
#!/pro/bin/perl

use strict;
use warnings;

while (<>) {
    s/\s+$//; # clip trailing whitespace
    s/^0?\s+// or next;
    m/^\d+/ or next;

    my ($teln, $cutteln, $cutoen, $reason) = unpack "A15 A17 A24 A*", $_;
    $teln =~ m/^\d+$/ or next;
    $reason or next;
    print "$teln,$cutoen,$reason\n";
}
lt09:/tmp 109 > perl xx.pl xx.txt
1555200,00 0 12 02,CUSTOMER HAS
2555206,00 0 05 01,CUSTOMER HAS
4555208,00 0 03 06,TELN NOT BILL
1555200,00 0 12 02,CUSTOMER HAS
2555206,00 0 05 01,CUSTOMER HAS
4555208,00 0 03 06,TELN NOT BILL
lt09:/tmp 110 >

Enjoy, Have FUN! H.Merijn [ who thinks you are very stubborn in thinking that chomp removes leading space or newlines ]
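
For reference, a small sketch of what chomp does and does not remove. The sample string is made up.

```perl
use strict;
use warnings;

my $line = "   5555002  TELN NOT BILL\n";   # hypothetical record with leading blanks

chomp(my $chomped = $line);                 # removes only the trailing newline
(my $trimmed = $chomped) =~ s/^\s+//;       # leading whitespace needs a substitution

print "[$chomped]\n";   # leading blanks are still there
print "[$trimmed]\n";   # leading blanks are gone
```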
Ratzie
Super Advisor

Re: Perl script to parse data - Dont know if it can be done.

WOW! I did not think it could be this easy!
I have taken a course and read books, but I still find Perl hard to decipher. I am always fighting it!

Thank you.

I do have a question about unpack. I am trying to add comments to the script.

s/^0?\s+// or next;
This is an if statement? If line begins with 0 and a character strip off 0... If not continue...?

m/^\d+/ or next;
This as well is an if statement but does what?

My next question is in regards to unpack.
I know I am unpacking an ASCII character (by the use of the A), but I cannot get a handle on the numbers. They do not line up if I count the columns or the spaces in between. How did you come up with these numbers...?

my ($teln, $cutteln, $cutoen, $reason) = unpack "A15 A17 A24 A*", $_;

$teln =~ m/^\d+$/ or next;
What does this do..?

As well what does the following line do?
$reason or next;
print "$teln,$cutoen,$reason\n";
}
H.Merijn Brand (procura
Honored Contributor

Re: Perl script to parse data - Dont know if it can be done.

I do have a question about unpack. I am trying to add comments to the script.

s/^0?\s+// or next;
This is an if statement? If line begins with 0 and a character strip off 0... If not continue...?

== s/// is substitute in the current line
== ^0?\s+ is an optional leading zero followed by whitespace
== if that is not the case skip to the next line
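
A minimal sketch of that behaviour, using made-up lines. s/// returns true only when it actually matched, which is what lets "or next" act as the skip.

```perl
use strict;
use warnings;

# Hypothetical lines: an indented record, a record with a stray leading 0,
# and a header line with neither.
my @kept;
for my $line ("  5555002  DATA", "0 1555007  DATA", "REPORT TITLE") {
    my $copy = $line;
    # The substitution fails on "REPORT TITLE", so that line is skipped.
    $copy =~ s/^0?\s+// or next;
    push @kept, $copy;
}
print "$_\n" for @kept;
```

Only the two records should be printed, each with its leading 0 and blanks removed.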

m/^\d+/ or next;
This as well is an if statement but does what?

== resulting (changed) line should start with digits, otherwise, skip to next line

My next question is in regards to unpack.
I know I am unpacking an ASCII character (by the use of the A), but I cannot get a handle on the numbers. They do not line up if I count the columns or the spaces in between. How did you come up with these numbers...?

== A## takes ## positions, and strips the trailing spaces
== ASCII digits are ASCII characters as well
== no need to treat them special
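
A small sketch of how the widths work, on a hypothetical line. The numbers are consecutive field widths, not column positions: A15 covers characters 0-14, A17 covers 15-31, A24 covers 32-55, and A* takes the rest. One likely reason they do not match the columns of the raw file is that the script strips the leading 0 and blanks before unpack runs, so counting starts at the first digit of TELN. The sample values below are made up.

```perl
use strict;
use warnings;

# Hypothetical line, already stripped of leading blanks, built so that each
# field starts where the template expects it (widths 15, 17, 24, then the rest).
my $line = sprintf "%-15s%-17s%-24s%s",
    "5555002", "1234567", "00 0 04 27", "TELN NOT BILL";

# Each A<n> consumes the next n characters and drops trailing spaces.
my ($teln, $cutteln, $cutoen, $reason) = unpack "A15 A17 A24 A*", $line;

print "$teln,$cutoen,$reason\n";
```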

my ($teln, $cutteln, $cutoen, $reason) = unpack "A15 A17 A24 A*", $_;

$teln =~ m/^\d+$/ or next;
What does this do..?

== It checks if $teln consists of ONLY digits
== ^ is start of string, \d+ is one or more digits
== $ is end of string
== if not, skip to next line
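
To illustrate the difference between the two digit checks in the script, here is a short sketch with made-up field values. The helper name all_digits is hypothetical.

```perl
use strict;
use warnings;

# True (1) only when the whole string is digits; helper name is hypothetical.
sub all_digits {
    my ($s) = @_;
    return $s =~ m/^\d+$/ ? 1 : 0;
}

# Hypothetical field values, as unpack might return them.
for my $teln ("5555002", "555 5002", "TELN", "") {
    my $starts = $teln =~ m/^\d+/ ? "yes" : "no";   # only the start is checked
    my $only   = all_digits($teln) ? "yes" : "no";  # start through end is checked
    print "'$teln': starts with digits=$starts, only digits=$only\n";
}
```

The value "555 5002" passes the first check but not the second, which is why the script tests $teln again after unpack.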

As well what does the following line do?
$reason or next;

== check if $reason has a true value (anything other than
== undef, blank or "0"). Freely translated to
== check if $reason is filled (with anything) otherwise skip

== So we only print if all criteria are met
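
A brief sketch of that truth test on a few made-up $reason values. The helper name passes is hypothetical.

```perl
use strict;
use warnings;

# Return 1 if "$value or next" would let the record through, else 0.
sub passes {
    my ($value) = @_;
    return $value ? 1 : 0;
}

# Hypothetical $reason values; note that a lone "0" counts as false.
for my $reason ("CUSTOMER HAS", "", "0", "00") {
    printf "%-14s %s\n", "'$reason'", passes($reason) ? "printed" : "skipped";
}
```

If a reason consisting of a single 0 could ever be valid data, testing length($reason) instead would be one way to keep it.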

print "$teln,$cutoen,$reason\n";
}

Enjoy, Have FUN! H.Merijn