Re: script help, string matching form new file

patrick xi · ‎08-25-2005

Dear friends,

I need a shell script to process several files: FileA, FileB, FileC, FileD, ...

1.feature of the file
.each have 600 columns
.lines from 2nd line to bottom are useful
.from column 2-10 is P_name, column 60-80 is Description
2.demands
.trim each file to TA, TB, TC, ..., each formed by the first line for every unique P_name
.merge a line of TA into TB if the P_name in that line of TA is not in TB; this makes file M1. M2 is made the same way from TB and TC, M3 from TC and TD, ...

What I need is M1, M2, M3, ... Could you help me?
Hope I explained well and thank you in advance.

Best regards,
Patrick

Muthukumar_5 · ‎08-26-2005

for file in `ls FILE*`
do

logfile="T$(echo $file | sed 's/.*\(.\)$/\1/')"
awk '(NR!=1) { for (i=2;i<=20;i++) { print $i; } }' $file | sort -u > $logfile

done

It will create TA, TB, TC files.

You can create M1, M2, M3 files as,

grep -vf TA < TB > M1

like that.
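A note on what `grep -vf` does here, as a minimal sketch with made-up file contents: every line of the first file is used as a pattern, and only the lines of the second file that match none of them are printed. By default the patterns are regular expressions matched anywhere in the line, so a short pattern can also remove longer lines. The flags `-F` (fixed strings) and `-x` (whole-line match) are standard grep options that restrict this.

```shell
#!/bin/sh
# Scratch directory so nothing real is overwritten.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical contents, only to show the matching behaviour.
printf '%s\n' 'alpha' 'beta'             > TA
printf '%s\n' 'beta' 'gamma' 'alphabet'  > TB

# Patterns match anywhere in the line: "alphabet" is removed by "alpha".
loose=$(grep -vf TA TB)

# -F: patterns are fixed strings, -x: they must match the whole line.
exact=$(grep -vFxf TA TB)

printf 'loose:\n%s\nexact:\n%s\n' "$loose" "$exact"
```

Either way the output contains only lines of TB, so on its own this filters TB rather than merging TA lines into it.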

hth.

Easy to suggest when don't know about the problem!

patrick xi · ‎08-26-2005

Hello Muthukumar,
Thank you for taking the time, but the result is not what I want.
1. I need the whole line, not only the P_name.
2. The script is meant to work on columns 2 to 10, but in the result I got columns 165 to 176. I don't know why. Too many columns?
3. How can I start from the 4th or 6th line? (What I wrote above, ".lines from 2nd line to bottom are useful", is not quite correct: the source files vary, so I have to change this setting from time to time.)

Best regards,
Patrick
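A possible explanation for point 2, sketched with a made-up line: awk's `$i` are whitespace-separated fields, not character columns, so `$2` to `$20` can land anywhere in a wide line. Character positions are addressed with `cut -c` or awk's `substr()`. The starting line (point 3) can be passed in as a variable; the name `start` is only an assumption for this example.

```shell
#!/bin/sh
# Hypothetical line: one leading blank, then a 9-character P_name with a blank in it.
line=' part A012 desc orange'

# Fields are split on whitespace, so $1 is only the first word of P_name.
by_field=$(printf '%s\n' "$line" | awk '{ print $1 }')

# cut -c and substr() address character columns 2-10.
by_cut=$(printf '%s\n' "$line" | cut -c 2-10)
by_substr=$(printf '%s\n' "$line" | awk '{ print substr($0, 2, 9) }')

# Process only from line $start onwards; change the value per source file.
start=3
kept=$(printf 'h1\nh2\nrow1\nrow2\n' | awk -v start="$start" 'NR >= start')

printf '[%s]\n[%s]\n[%s]\n%s\n' "$by_field" "$by_cut" "$by_substr" "$kept"
```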

patrick xi · ‎08-27-2005

I made a test file named fileA, content is:
1234567890123456789012345678901234567890
testfile1 desc trees
testfile1 desc apple
partnume1 desc orange
testfile1 desc table
partnume2 desc 12345
partnume9 desc na
partnume7 desc na

the result TA should be :
testfile1 desc trees
partnume1 desc orange
partnume2 desc 12345
partnume7 desc na
partnume9 desc na

Leif Halvarsson_2 · ‎08-27-2005

Hi,
To keep this from getting too complex, you need to sort the file using the first field as the key (is that a problem?).
sort -k 1 infile >outfile
Then try awk:

cat outfile |awk ' $1 != old { print ; old=$1}'

patrick xi · ‎08-27-2005

Hi Leif, thank you for the reply.

As in "demands", the P_name is in columns 2 to 10. Sometimes it contains several blanks, so it is split into several fields.

Leif Halvarsson_2 · ‎08-27-2005

Hi,

One method I have sometimes used is to translate blanks to a character that is not used as a field separator. In your case, a "quick and dirty" solution could look something like this:
(not sure what you want to do with column 1)

cut -c 1 infile >tmp1
cut -c 2-10 infile >tmp2
cut -c 11- infile >tmp3
tr " " "_" <tmp2 >tmp4
paste -d " " tmp1 tmp4 tmp3 >tmp5

Now column 1 and columns 2-10 should be two fields separated by a blank. Blanks in positions 2-10 of the original file are translated to "_".

Now you can try my previous solution (but using $2 instead of $1).

After that you can use the same method as above to translate "_" back to blanks.

patrick xi · ‎08-28-2005

The sample file fileA is not displayed correctly: consecutive blanks are shown as one. So I have put it in the attachment...

Leif Halvarsson_2 · ‎08-28-2005

Hi,
The attached file gives a better idea of what your files look like.

I assume the first column is always a blank, so we can ignore it (it is easy to recreate in the final file if you want).
I also assume there are one or more blanks from position 11 onwards.
I am also afraid that we must sort the file if this is to be solved without using a full database tool. Is that OK?

In the following example, each " " in the tr commands is a single blank (blanks do not always show correctly here).
Try the following on one of your files.

cut -c 2-10 infile >tmp1
cut -c 11- infile >tmp2
tr " " "_" <tmp1 >tmp3
paste -d"\0" tmp3 tmp2 >tmp4
sort -k 1 tmp4 >tmp5
awk ' $1 != old { print ; old=$1}' tmp5 |tr "_" " "

Does the output look like what you want in the TA file?

Muthukumar_5 · ‎08-28-2005

The first part, creating the TA, TB, ... files, can be done as follows:

#!/bin/ksh

# Enter user input
LINESTART=2

for file in `ls FILE*`
do

logfile="T$(echo $file | sed 's/.*\(.\)$/\1/')"

# Start with an empty output file, then append one selected line per pass
: > $logfile
for lno in `awk -v ln=$LINESTART '(NR>=ln) { print NR,$1 }' ${file} | sort -uk 2,20 | cut -d" " -f1 | sort -n`
do
sed -e "$lno!d" $file >> $logfile
done

done

## Change LINESTART to specify from 2nd line to last or 6th line to last line.

hth.

Easy to suggest when don't know about the problem!

Muthukumar_5 · ‎08-28-2005

The second part, merging a TA line into TB if the P_name in that line of TA is not in TB, can be done like this:

set -A ARR `ls -1 File*`

index=0
findex=1

while [[ $((index+1)) -lt ${#ARR[*]} ]]
do
grep -vf ${ARR[$index]} ${ARR[$index+1]} >M${findex}
let index=index+1
let findex=findex+1
done

# Add this and try it

hth.

Easy to suggest when don't know about the problem!
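For comparison, a minimal sketch of the merge as described in the original question, keyed on the P_name in character columns 2-10 rather than on whole lines. The sample contents, the helper name `merge_pair`, and the output order (all lines of the newer file first, then the extra lines from the older one) are assumptions, not something stated in the thread.

```shell
#!/bin/sh
# Scratch directory so nothing real is overwritten.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical trimmed files; P_name occupies character columns 2-10.
printf '%s\n' ' testfile1 desc trees' ' part nr 1 desc orange' > TA
printf '%s\n' ' testfile1 desc pears' ' part nr 7 desc na'     > TB
printf '%s\n' ' part nr 7 desc none'  ' part nr 9 desc na'     > TC

# merge_pair OLD NEW OUT: copy NEW, then append the lines of OLD whose
# P_name does not occur in NEW. Assumes NEW is not empty.
merge_pair() {
    awk '
        { key = substr($0, 2, 9) }
        NR == FNR { in_new[key] = 1; print; next }   # first file read: NEW
        !(key in in_new)                             # second file read: OLD
    ' "$2" "$1" > "$3"
}

# Walk consecutive pairs: (TA,TB) -> M1, (TB,TC) -> M2, ...
n=1
prev=
for cur in T[A-Z]; do
    if [ -n "$prev" ]; then
        merge_pair "$prev" "$cur" "M$n"
        n=$((n + 1))
    fi
    prev=$cur
done

cat M1
```

Stepping through the list one file at a time, instead of two, is what pairs TB with TC for M2.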

patrick xi · ‎09-14-2005

hi, Leif and Muthukumar,

thank you.

Leif's solution did work and gave a better result, but it does not exactly meet the demand. For example, how do I start processing from the 2nd or 4th line?

I tested Muthukumar's solution with my sample file, and the result is not what I want. But your script is the kind of thing I was expecting. I hope it can be made to work.

Best regards,
Patrick

James R. Ferguson · ‎09-14-2005

Hi Patrick:

If Leif's solution meets your needs and you only want to begin after (for instance) line #2, change his 'awk' to read:

# awk 'NR > 2 && $1 != old { print; old = $1 }' tmp5 | tr "_" " "

'awk's NR variable provides the line-number of the file being processed.

Regards!

...JRF...

patrick xi · ‎09-14-2005

hi, JRF,
nice to meet you.

In Leif's script, the lines are already sorted before awk runs, so the line sequence has changed.

Best regards,
Patrick

patrick xi · ‎09-14-2005

Also, I need the omitted lines in the result.
Patrick
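One way to address the last two points without sorting, as a rough sketch: awk can remember which P_names it has already seen, so the first line for each P_name is kept in its original position and later duplicates go to a second file. The sample data and the names `START`, `TA` and `TA.dropped` are assumptions for illustration.

```shell
#!/bin/sh
# Scratch directory so nothing real is overwritten.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical sample: 1 header line, leading blank, P_name in columns 2-10.
cat > FileA <<'EOF'
1234567890123456789012345678901234567890
 testfile1 desc trees
 testfile1 desc apple
 part nr 1 desc orange
 testfile1 desc table
 part nr 2 desc 12345
EOF

START=2   # first line to process; change to 4 or 6 as needed

# seen[key]++ is 0 (false) the first time a key appears, so the first line
# for each P_name goes to TA and every later duplicate goes to TA.dropped.
awk -v start="$START" -v keep=TA -v drop=TA.dropped '
    NR < start { next }
    {
        key = substr($0, 2, 9)          # P_name, character columns 2-10
        if (seen[key]++) { print > drop }
        else             { print > keep }
    }
' FileA

cat TA
```

Because the key is taken with `substr()`, blanks inside the P_name need no translation to "_" and back.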
