Re: Script help

jackfiled · ‎07-12-2004

I need your help!!!

I need to find the strings that appear in both of two files.
One file has about 1000 lines and the other has about 550.
The output I want is a list of the strings they have in common, so that I can update them with new strings.

For example, one file has strings like:

webbbs
ryujin
hanbangapple
yumso
sagua
nojuck
ryujin
gajossal
scfarm
kyulnara
dearpia
cwpodofarm
chungma
mealon
samyu...

The other:

andonghoney
ansungfarm
apeace
appletop
apsanjayoun
bawoofarm
bitgolfarm
hanbangapple
bonghwangfarm
celesti
chuksukfarm
chungmaewon
chungpoongfarm
chunmafarm
dearpia
...
So, as you can see, hanbangapple and dearpia are in both files.

Once I know that the duplicated strings are hanbangapple and dearpia, I can erase them from one of the files.

What script would do this? Any tips?

Stuart Browne · ‎07-12-2004

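So you want to remove from one of the files the words that appear in both.

Use something like:

(sort -u file1; sort -u file2) | sort | uniq -d

to list the words common to both files. Running sort -u on each file first keeps a word that is repeated within a single file, such as ryujin in your example, from being reported.

Then use sed, awk, or your favourite text-manipulation tool to remove them, e.g.: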

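# Delete from file1 every whole line that also appears in file2
for WORD in $( (sort -u file1; sort -u file2) | sort | uniq -d )
do
    # Anchor the pattern so that only whole-line matches are deleted
    sed -e "/^${WORD}\$/d" < file1 > file1.out
    mv file1.out file1
done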

or some such..
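
The listing step above can also be sketched with comm, which compares two sorted files column by column. This is only a minimal sketch, not the method given above: the function name common_lines and the temporary file names are made up for illustration.

```shell
# common_lines A B: print each line that occurs in both A and B, once.
# (Hypothetical helper name; not part of the original post.)
common_lines() {
    tmp1="${TMPDIR:-/tmp}/common_lines.$$.a"
    tmp2="${TMPDIR:-/tmp}/common_lines.$$.b"
    # comm needs sorted input; sort -u also drops repeats within one file
    sort -u "$1" > "$tmp1"
    sort -u "$2" > "$tmp2"
    # -1 hides lines found only in A, -2 hides lines found only in B,
    # which leaves the lines present in both
    comm -12 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}
```

For the lines shown in the question, common_lines file1 file2 would list dearpia and hanbangapple, but not ryujin, which is repeated only within the first file.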

One long-haired git at your service...

Francisco J. Soler · ‎07-12-2004

Hi,
If you have enough memory (I think that is likely, since the files are small), you can store all the lines of the file that will stay unmodified in an array, then read the other file with awk and write out the lines that are not in the array.

For example, you can run

awk -f script.awk file1 file2 > file3

where file3 is file2 without the strings that are in file1.

---- script.awk -----------

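BEGIN {
    flag_file1 = 1      # 1 while the first file is being read
    filename = " "
}
{
    # Remember the first file name; once FILENAME changes, the second file has started
    if (filename == " ")
        filename = FILENAME
    if (filename != FILENAME)
        flag_file1 = 0
    if (flag_file1 == 1) {
        # First file: store every line
        a[NR] = $0
        num_lin = NR
    } else {
        # Second file: print the line only if it was not stored
        exists = 0
        for (i = 1; i <= num_lin; i++) {
            if ($0 == a[i])
                exists = 1
        }
        if (exists == 0)
            print
    }
}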
------------- end script -------------
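
The same idea can be written more compactly by using the lines of file1 as keys of an awk associative array, so that each line of file2 is checked with one lookup instead of a loop over every stored line. This is a minimal sketch under stated assumptions, not a replacement for the script above: the function name lines_not_in is made up for illustration, and it assumes the first file is not empty.

```shell
# lines_not_in A B: print the lines of B that do not occur in A.
# (Hypothetical helper name; not part of the original post.)
lines_not_in() {
    awk '
        # First file (FNR equals NR only there): remember each line as a key
        FNR == NR { seen[$0] = 1; next }
        # Second file: print only the lines that were never stored
        !($0 in seen)
    ' "$1" "$2"
}
```

It would be called in the same way as the script: lines_not_in file1 file2 > file3.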

Cheers.
Frank.

Linux?. Yes, of course.

Muthukumar_5 · ‎07-12-2004

Hi,

Use grep to do this. Take whichever file has fewer lines, read it line by line, and grep for each line to see whether it is in both files. At the end, the input file will have been modified.

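#!/usr/bin/ksh
# forum.ksh: remove from the shorter file the lines that are also in the other file
set -x

file1=$1
file2=$2

input=""
other=""
newfile=/tmp/stringcheck.log

# Remove the work file if it exists, then create it empty
rm -f "$newfile"
touch "$newfile"

# Read the file with fewer lines and search for each line in the other file
if [[ $(wc -l < "$file1") -lt $(wc -l < "$file2") ]]; then
    input=$file1
    other=$file2
else
    input=$file2
    other=$file1
fi

while read -r line; do
    # -x matches whole lines only, -F treats the line as a fixed string
    grep -q -x -F -e "$line" "$other"
    if [[ $? -eq 0 ]]; then
        echo "$line is in $file1 and $file2"
    else
        echo "$line" >> "$newfile"
    fi
done < "$input"

# Replace the input file with the copy that has the common strings removed
cp "$newfile" "$input"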

## end ###
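
The loop above runs grep once for every line of the input. As a possible alternative, grep can read all of its patterns from a file with -f and do the filtering in one pass. This is only a sketch: the function name remove_common is hypothetical, and it assumes the second file has at least one line.

```shell
# remove_common A B: print the lines of A that do not occur in B.
# (Hypothetical helper name; not part of the original post.)
remove_common() {
    # -f reads the patterns from B, -F treats them as fixed strings,
    # -x requires a whole-line match, -v keeps only the non-matching lines
    grep -v -x -F -f "$2" "$1"
}
```

To update a file in place, one could run remove_common file1 file2 > file1.new and then mv file1.new file1. Because of -x, a line such as chungma is not removed just because chungmaewon is in the other file.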

Regards,
Muthukumar

Easy to suggest when don't know about the problem!
