Operating System - Linux

patrick xi
Advisor

script help, string matching form new file

Dear friends,

I need a shell script to process several files: FileA, FileB, FileC, FileD, ...

1. Features of the files
. each line has 600 columns
. only the lines from the 2nd line to the bottom are useful
. columns 2-10 are the P_name, columns 60-80 are the Description
2. Requirements
. trim each file to TA, TB, TC, ..., each made up of the first line for every unique P_name
. merge a line of TA into TB if the P_name in that line is not in TB; this makes file M1. M2 is built the same way from TB and TC, M3 from TC and TD, ...

what I need is M1, M2, M3, ... , could you help me?
Hope I explained well and thank you in advance.

Best regards,
Patrick
Muthukumar_5
Honored Contributor
Solution

Re: script help, string matching form new file

for file in `ls FILE*`
do


logfile="T$(echo $file | sed 's/.*\(.\)$/\1/')"
awk '(NR!=1) { for (i=2;i<=20;i++) { print $i; } }' $file | sort -u > $logfile

done

It will create TA, TB, TC files.

You can create M1, M2, M3 files as,

grep -vf TA < TB > M1

like that.
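
Note that grep -vf TA uses every TA line as a pattern and prints the TB lines that match none of them, so it compares whole lines rather than only the P_name. A minimal sketch of a merge keyed on the P_name follows. It assumes the P_name sits in character columns 2-10 and that "merge" means all TB lines followed by the TA lines whose P_name is missing from TB. The function name merge_missing and its argument order are made up for illustration.

```shell
#!/bin/sh
# merge_missing OLD NEW   (hypothetical helper, not from the thread)
# Print every line of NEW, then each line of OLD whose key
# (character columns 2-10) does not occur in NEW.
merge_missing() {
    awk '
        # First file on the command line is NEW: remember its keys, print all.
        FILENAME == ARGV[1] { seen[substr($0, 2, 9)] = 1; print; next }
        # Second file is OLD: print only lines whose key was not seen in NEW.
        !(substr($0, 2, 9) in seen)
    ' "$2" "$1"
}

# Example: merge_missing TA TB > M1
```

The key is taken with substr rather than $1, so blanks inside the P_name do not split it.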

hth.
Easy to suggest when don't know about the problem!
patrick xi
Advisor

Re: script help, string matching form new file

Hello Muthukumar,
Thank you for taking the time.
However, the result is not quite what I want.
1. I need the whole line, not only the P_name.
2. The script is meant to work on columns 2 to 10, but in the result I got columns 165 to 176, and I don't know why. Are there too many columns?
3. How can I start from the 4th or 6th line? (My earlier statement, "lines from 2nd line to bottom are useful", was not quite accurate: the source files vary, so I have to change this setting from time to time.)

Best regards,
Patrick
patrick xi
Advisor

Re: script help, string matching form new file

I made a test file named fileA, content is:
1234567890123456789012345678901234567890
testfile1 desc trees
testfile1 desc apple
partnume1 desc orange
testfile1 desc table
partnume2 desc 12345
partnume9 desc na
partnume7 desc na

the result TA should be :
testfile1 desc trees
partnume1 desc orange
partnume2 desc 12345
partnume7 desc na
partnume9 desc na
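
For reference, a minimal sketch of the "first line per unique P_name" step that takes the key by character position instead of by field, so blanks inside the P_name do not matter. The function name first_unique and its parameters are hypothetical. Lines come out in input order, not sorted.

```shell
#!/bin/sh
# first_unique START FROM LEN FILE   (hypothetical helper, not from the thread)
# Print each line of FILE, from line START on, whose key (LEN characters
# starting at character FROM) has not been seen before.
first_unique() {
    awk -v start="$1" -v from="$2" -v len="$3" '
        NR >= start {
            key = substr($0, from, len)   # fixed-width key, blanks included
            if (!(key in seen)) {
                seen[key] = 1
                print                     # whole line, not only the key
            }
        }' "$4"
}

# Example: skip 1 header line, key in character columns 2-10
# first_unique 2 2 9 fileA > TA
```

Changing the first argument to 4 or 6 starts from the 4th or 6th line instead.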
Leif Halvarsson_2
Honored Contributor

Re: script help, string matching form new file

Hi,
To keep this from getting too complex, you need to sort the file using the first field as the key (is that a problem?).
sort -k 1 infile >outfile
Then try awk:

cat outfile |awk ' $1 != old { print ; old=$1}'
patrick xi
Advisor

Re: script help, string matching form new file

hi Leif, thank you for reply.

As stated under "demands", the P_name occupies columns 2 to 10 and sometimes contains several blanks, so it can span several fields.
Leif Halvarsson_2
Honored Contributor

Re: script help, string matching form new file

Hi,

One method I have sometimes used is to translate blanks to a character that is not used as a field separator. In your case, a "quick and dirty" solution could look something like this:
(not sure what you want to do with column 1)

cut -c 1 infile >tmp1
cut -c 2-10 infile >tmp2
cut -c 11- infile >tmp3
tr " " "_" <tmp2 >tmp4
paste -d " " tmp1 tmp4 tmp3 >tmp5

Now column 1 and columns 2-10 should be two fields separated by a blank. Blanks in positions 2-10 of the original file are translated to "_".

Now you can try my previous solution (but using $2 instead of $1).

After that, you can use the same method as above to translate "_" back to blanks.
patrick xi
Advisor

Re: script help, string matching form new file

the sample file fileA is not displayed well, continuous blanks are displayed as one. so I put it in attachment...
Leif Halvarsson_2
Honored Contributor

Re: script help, string matching form new file

Hi,
The attached file gives a better idea of what your files look like.

I assume the first column is always a blank, so we can ignore it (it is easy to recreate in the final file if you want).
I also assume there are one or more blanks from position 11 onward.
I am also afraid that we must sort the file if this is to be solved without a full database tool. Is that OK?

In the following example, the blank between the quotes in the tr commands is a single space character (blanks do not always display correctly here).
Try the following on one of your files.

cut -c 2-10 infile >tmp1
cut -c 11- infile >tmp2
tr " " "_" <tmp1 >tmp3
paste -d"\0" tmp3 tmp2 >tmp4
sort -k 1 tmp4 >tmp5
awk ' $1 != old { print ; old=$1}' tmp5 |tr "_" " "


Does the output look like what you want in the TA file ?
Muthukumar_5
Honored Contributor

Re: script help, string matching form new file

First part of creating TA, TB files can be done as,

#!/bin/ksh

# Enter user input
LINESTART=2

for file in `ls FILE*`
do

logfile="T$(echo $file | sed 's/.*\(.\)$/\1/')"

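# Start with an empty output file, then append one selected line per pass
: > $logfile

# Keep lines from LINESTART on, unique on the first field, in original line order
for lno in `awk -v ln=$LINESTART '(NR>=ln) { print NR,$1 }' ${file} | sort -uk 2,20 | cut -d" " -f1 | sort -n`
do
sed -e "$lno!d" $file >> $logfile
done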

done

## Change LINESTART to specify from 2nd line to last or 6th line to last line.
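
The original request pairs consecutive files (M1 from TA and TB, M2 from TB and TC, and so on). A minimal sketch of that pairing, assuming the T files are passed in order; pair_files is a hypothetical name, and it only prints the pairing so that any merge command can be plugged in.

```shell
#!/bin/sh
# pair_files FILE...   (hypothetical helper, not from the thread)
# Print "Mn previous current" for each consecutive pair of arguments.
pair_files() {
    n=0
    prev=
    for cur in "$@"; do
        if [ -n "$prev" ]; then
            n=$((n + 1))
            echo "M$n $prev $cur"
        fi
        prev=$cur
    done
}

# Example: pair_files TA TB TC TD
```

Each printed line names the output file followed by the two inputs it is built from, for example by reading the lines with "while read out old new".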

hth.
Easy to suggest when don't know about the problem!