
help needed scripting urgent plzz

 
viseshu
Frequent Advisor


Hi all,

I have a file in the following format:
"ABC",1809593008,"MYHOME",20061002,"SITON,theback",abcdef,...
The file contains many records, each with 17 fields. I have three requirements:
1. Remove commas (,) that occur inside double quotes (" "), and only those, in all records.

2. Each field has a predefined length. I have the lengths, but I can't read them from a file; I need to hard-code them in the script. I want to check whether each field has its predefined length.
3. If the 15th field is present, then the 14th and 11th fields should also be present. If not, the script should return the record number.

I need a function for each of the last two.
viseshu
Frequent Advisor

Re: help needed scripting urgent plzz

Please note: I want to replace the commas mentioned above with a space.
Doug O'Leary
Honored Contributor

Re: help needed scripting urgent plzz

Hey;

Can you provide a short test file for us to work against? Maybe 100 lines or so?

Doug

------
Senior UNIX Admin
O'Leary Computers Inc
linkedin: http://www.linkedin.com/dkoleary
Resume: http://www.olearycomputers.com/resume.html
viseshu
Frequent Advisor

Re: help needed scripting urgent plzz

I am attaching a short file containing 8 fields in each record. I don't have a sample file matching the real specification, so I'm attaching this one instead. Please help.
Hein van den Heuvel
Honored Contributor

Re: help needed scripting urgent plzz

Hmm, it's a bit lame not to have decent sample data. How will you verify your work?

Anyway, here is something to get you going.
It does not deal with the length requirements, but if you read and understand the split on double quotes, you can do something similar by splitting the line on commas and walking the fields.

#cat test.pl
my (@quoted) = split /"/;
my ($i)=1;
while ($i < @quoted) {
$quoted[$i] =~ s/,/ /g;
$i += 2;
}
$_ = join "\"", @quoted ;
my ($one,$two,$three,$four,$five) = split /,/;
if ($four ne "" and $two eq "") {
print STDERR "Missing field X at line $.\n";
}

#cat x.txt
"ABC",1809593008,"MYHOME",20061002,"SITON,theback"
"DEF",1809593008,"MY,HOME",20061002,"SITON,theback"
"GHI",,"MYHOME",,"SITON,theback"
"JKL",,"MYHOME",20061002,"SITON,theback"
"MNO",1809593008,"MYHOME",20061002,"SITON,theback"

# perl -p test.pl x.txt > y.txt
Missing field X at line 4

#cat y.txt
"ABC",1809593008,"MYHOME",20061002,"SITON theback"
"DEF",1809593008,"MY HOME",20061002,"SITON theback"
"GHI",,"MYHOME",,"SITON theback"
"JKL",,"MYHOME",20061002,"SITON theback"
"MNO",1809593008,"MYHOME",20061002,"SITON theback"

Good luck!
Hein.
viseshu
Frequent Advisor

Re: help needed scripting urgent plzz

Hein, sorry, I'm not doing it in Perl.
Peter Nikitka
Honored Contributor

Re: help needed scripting urgent plzz

Hi,

I asked myself whether I should answer a thread containing the message
>>
I'm not doing it in perl
<<
Why?
Is it allowed to use awk?

But nevertheless ...:
1.) I assume a field is NOT allowed to contain a lone, unpaired double quote (").
I use the quote as the delimiter and substitute in the even-numbered fields only.
cat /tmp/a
"ABC",1809593008,"MYHOME",20061002,"SITON,the,back",abcdef,..,""

awk -F'"' '{printf "%s",$1;for(i=2;i<=NF;i++) {if(! (i%2)) gsub(","," ",$i);printf "%s%s",FS,$i}; printf "\n"}' /tmp/a
"ABC",1809593008,"MYHOME",20061002,"SITON the back",abcdef,..,""


2.) I assume your field delimiter is the comma, and that after 1.) no field contains a comma of its own.

Set a shell variable containing the lengths in the format
len='l1 l2 l3 .. ln'
That way split() builds an array in which
l[i] contains the length of the i-th field.
=> No hardcoding needed!

awk -F, -v len="$len" 'BEGIN {f=split(len,l," ")}
{for(i=1;i<=NF;i++) if(length($i)!=l[i]) printf("size mismatch in line %d, field %d\n",NR,i)}' /tmp/a


3.) self-explanatory - add to 2.)
{if($15!="" && ($14=="" || $11=="")) print NR}
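A small check of rule 3 on made-up records may help here. This is only a sketch: the function name check_presence and the sample lines are invented for illustration. It compares against the empty string explicitly, so a field holding 0 still counts as present, and it reports a record when either field 11 or field 14 is missing.

```shell
# Hypothetical helper: print the record number of every line where
# field 15 is filled but field 11 or field 14 is empty.
check_presence() {
    awk -F, '$15 != "" && ($11 == "" || $14 == "") { print NR }' "$@"
}

# Made-up 15-field records:
#   1: all fields present
#   2: field 15 is "0", field 14 is empty  -> reported
#   3: field 15 is empty                   -> nothing to check
printf '%s\n' \
    'a,b,c,d,e,f,g,h,i,j,k,l,m,n,o' \
    'a,b,c,d,e,f,g,h,i,j,k,l,m,,0' \
    'a,b,c,d,e,f,g,h,i,j,,l,m,,' |
check_presence
# prints: 2
```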


It should be easy to combine the tasks together.
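One possible way to combine the three steps is sketched below. Several things in it are assumptions rather than part of the thread: the function name validate, the message wording, the choice to send diagnostics to standard error, and above all the lengths in len, which are placeholders for a 5-field demo (the real file needs 17 real values). As in step 2 above, the quotes are counted as part of a field's length.

```shell
# Placeholder lengths for a 5-field demo record; replace with the real 17.
len='5 10 8 8 15'

# Hypothetical wrapper: cleaned records go to stdout, diagnostics to stderr.
validate() {
    awk -F'"' -v len="$len" '
    BEGIN { n = split(len, want, " ") }
    {
        # Step 1: even-numbered pieces lie inside quotes; blank their commas.
        line = $1
        for (i = 2; i <= NF; i++) {
            if (i % 2 == 0) gsub(/,/, " ", $i)
            line = line FS $i
        }
        print line

        # Step 2: split the cleaned line on commas and compare lengths.
        m = split(line, f, ",")
        for (i = 1; i <= m && i <= n; i++)
            if (length(f[i]) != want[i])
                printf("size mismatch in line %d, field %d\n", NR, i) | "cat 1>&2"

        # Step 3: if field 15 is present, fields 11 and 14 must be too.
        if (m >= 15 && f[15] != "" && (f[11] == "" || f[14] == ""))
            printf("missing field 11 or 14 in line %d\n", NR) | "cat 1>&2"
    }' "$@"
}
```

A call such as validate x.txt > y.txt would then leave the cleaned records in y.txt and show the complaints on the terminal.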

Kind regards, Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"
Sandman!
Honored Contributor

Re: help needed scripting urgent plzz

If you're okay with awk then try the script below. It removes embedded commas from fields that are alphabetic strings and are enclosed in double-quotes:

awk -F, '{
for (i=1;i<=NF;++i) {
if ($i~/^"[A-Za-z]+$/)
printf("%s ",$i)
else if ($i~/^[A-Za-z]+$/)
printf("%s ",$i)
else if ($i~/^[A-Za-z]+"$/)
printf((i<NF ? "%s," : "%s\n"),$i)
else
printf((i<NF ? "%s," : "%s\n"),$i)
}
}' infile

~hope it helps
viseshu
Frequent Advisor

Re: help needed scripting urgent plzz

Peter,
thanks a lot.
1) Replacing , with a space is working fine, but can you please explain the concept of the even fields? I did not get that. What is {if(! (i%2)) doing? Please explain clearly. :(
Peter Nikitka
Honored Contributor

Re: help needed scripting urgent plzz

Hi,

look at this example string:
"ABC",1809593008,"MYHOME",20061002,"SITON,the,back",abcdef,..,""

If you take the quote (") as the delimiter, these are your fields - I call them f1:
1
2 ABC
3 ,1809593008,
4 MYHOME
5 ,20061002,
6 SITON,the,back
7 ,abcdef,..,
...

If you take your original delimiter (the comma), I call the fields f2.

You see that commas in f1 fields matter only when the field number is even: those fields are the text inside the quotes.
Odd-numbered f1 fields that contain commas are made up only of f2 fields that contain no commas themselves, plus the separators between them.
So note how important my assumption for solution 1 is, that no f2 field may contain a lone, unpaired quote:
If that were the case, no algorithm could decide how the f1 and f2 fields line up.
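The assumption can be tested before any cleaning is done. Splitting on the quote yields one more field than there are quotes, so a record whose quotes all pair up has an odd field count. The following is a sketch only; the function name unbalanced_quotes and the sample lines are invented for illustration.

```shell
# Hypothetical pre-check: print the number of every record whose
# double quotes do not pair up (even field count when split on ").
unbalanced_quotes() {
    awk -F'"' 'NF > 0 && NF % 2 == 0 { print NR }' "$@"
}

# Record 1 has four quotes (5 pieces); record 2 has three (4 pieces).
printf '%s\n' '"ABC",1,"MY,HOME"' '"DEF",1,"MY,HOME' | unbalanced_quotes
# prints: 2
```

Records reported here would have to be repaired by hand, because the even/odd rule no longer says which commas are inside quotes.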


So you must convert the commas to spaces in the even-numbered fields, and only in those.

The % operator is the modulo function, so
(i%2) is zero for even and one for odd i, leading to the expression
(! (i%2)) which evaluates to true for even numbers.
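To watch the even/odd rule at work, each quote-delimited piece can be printed with its index and the result of the i%2 test. This is a minimal sketch; the function name show_pieces and the shortened sample record are made up for the demonstration.

```shell
# Hypothetical helper: list every piece between quotes with its index
# and whether i % 2 classifies it as odd (outside quotes) or even (inside).
show_pieces() {
    awk -F'"' '{
        for (i = 1; i <= NF; i++)
            printf("%d %s [%s]\n", i, (i % 2 ? "odd" : "even"), $i)
    }' "$@"
}

printf '%s\n' '"ABC",18,"SITON,the,back",abc' | show_pieces
# prints:
# 1 odd []
# 2 even [ABC]
# 3 odd [,18,]
# 4 even [SITON,the,back]
# 5 odd [,abc]
```

Only the pieces marked even hold quoted text, so those are the only ones where gsub may touch the commas.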


IMHO a look at the awk man page could help...

Kind regards, Peter

PS: I really think I have earned some points now :-)