sh script - find string in two different files and compare

Ratzie
Super Advisor


I think I am going to have a hard time explaining this one, so sorry in advance...

 

I have a file that contains multiple entries:

.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
 
.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

 

This goes on and on...

I have another file that is almost identical. It may or may not contain the same UPDATE/<number> entries.

What I want to do is take each UPDATE/<number> from the first file, look it up in the 2nd file, and, if it exists there, compare the ACCOUNT values to see whether they differ.

 

I can get the TN part:

grep -h UPDATE * | awk -F/ '{print $2}' | sed 's/,,,//g' | sort -u > file

 

Then do a:

for tn in `cat file`
do

grep $tn second.file

...

 

But, I have no idea how to capture the ACCOUNT information from one file, and compare to second...

Appreciate the help.

6 REPLIES
Patrick Wallek
Honored Contributor

Re: sh script - find string in two different files and compare

What do you think of this:

 

# cat file1
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

# cat file2
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR


# cat script
#!/usr/bin/sh

# Loop over every update number found in file1.
for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
# The account number is on the 2nd line below the matching UPDATE line.
FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file2 | awk -F \/ '{print $2}')
# Compare as strings so an empty or non-numeric value cannot break the test.
if [ "${FILE1ACCT}" = "${FILE2ACCT}" ] ; then
   echo "The Account numbers are the same in FILE1 and FILE2 for update number ${UPDATE}"
   echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; FILE2 ACCT# = ${FILE2ACCT}"
   echo ""
else
   echo "The Account numbers are DIFFERENT in FILE1 and FILE2 for update number ${UPDATE}"
   echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; FILE2 ACCT# = ${FILE2ACCT}"
   echo ""
fi
done

 And here's what it looks like when the script is run:

 

# ./script
The Account numbers are the same in FILE1 and FILE2 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; FILE2 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and FILE2 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; FILE2 ACCT# = 92728912

 The key is the 'sed -n' statement above.

 

For each UPDATE # obtained from file1, it searches file1 and file2 for that number and prints the 2nd line below the match, which is where the account number sits. This assumes that a given update number never occurs more than once in a file, and that the account number is always 2 lines below the update number.
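
If the ACCOUNT line is not always exactly 2 lines below the UPDATE line, a label-based lookup avoids the fixed offset. The following is only a sketch, assuming each record starts with an UPDATE/ line and has at most one ACCOUNT/ line; the function name acct_for is hypothetical and not part of the script above.

```shell
#!/usr/bin/sh
# acct_for NUMBER FILE   (hypothetical helper name)
# Print the account number from the record whose UPDATE line carries NUMBER.
# It matches the ACCOUNT/ label instead of counting lines below UPDATE.
acct_for() {
    awk -F'[/,]' -v want="$1" '
        # An UPDATE line starts a new record; remember whether it is ours.
        # Appending "" forces a string comparison of the two numbers.
        $1 == "UPDATE"  { found = (($2 "") == (want "")) }
        # First ACCOUNT line inside the wanted record: print it and stop.
        $1 == "ACCOUNT" && found { print $2; exit }
    ' "$2"
}
```

With the sample file2 above, acct_for 5552194161 file2 should print 92728912, and an update number that is not in the file prints nothing.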

Ratzie
Super Advisor

Re: sh script - find string in two different files and compare

I will try, but file2 is the tricky part: I need to search a directory that has multiple files in it for the TN, then pull the account and check it.
Patrick Wallek
Honored Contributor

Re: sh script - find string in two different files and compare

Is file1 in the same directory as the other files you need to check?

Patrick Wallek
Honored Contributor

Re: sh script - find string in two different files and compare

OK, file1 is the same as above and is in the /root/pw directory.

 

I have created 2 other files called file3 and file4 in the /root/pw/test directory.

 

Here are the files, the script and the results:

 

# pwd
/root/pw

# cat test/file3
.BEGIN
UPDATE/1234567890,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR


# cat test/file4
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/2345678901,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR


# cat script
#!/usr/bin/sh

# Loop over every update number found in file1.
for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
# Name of the file in /root/pw/test that contains this update number.
UPDATEFILE=$(grep -l "${UPDATE}" /root/pw/test/*)
FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" ${UPDATEFILE} | awk -F \/ '{print $2}')
# Compare as strings so an empty or non-numeric value cannot break the test.
if [ "${FILE1ACCT}" = "${FILE2ACCT}" ] ; then
   echo "The Account numbers are the same in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
   echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
   echo ""
else
   echo "The Account numbers are DIFFERENT in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
   echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
   echo ""
fi
done


# ./script
The Account numbers are the same in FILE1 and /root/pw/test/file4 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; /root/pw/test/file4 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and /root/pw/test/file3 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; /root/pw/test/file3 ACCT# = 92728912

 The 'grep -l' in the script searches through the files in /root/pw/test and returns the filename of the file with the same UPDATE number.  The sed statement for FILE2ACCT then looks for the ACCT# in the file returned by the 'grep -l' command.
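
One thing to watch: 'grep -l' prints every matching file, so if the same UPDATE # appears in more than one file in the directory, a single variable ends up holding several names. A possible way to handle that is to loop over the names one at a time. This is only a sketch: the function names are hypothetical, the pattern assumes the line starts with UPDATE/ and the number is followed by a comma as in the samples, and it assumes file names without newlines.

```shell
#!/usr/bin/sh
# files_with_update NUMBER DIR   (hypothetical helper name)
# Print one file name per line for every file in DIR that has an
# UPDATE line for NUMBER.
files_with_update() {
    grep -l "^UPDATE/$1," "$2"/* 2>/dev/null
}

# compare_in_dir NUMBER ACCOUNT DIR   (hypothetical helper name)
# For each matching file, report whether its account equals ACCOUNT.
compare_in_dir() {
    files_with_update "$1" "$3" | while read -r f
    do
        # Same assumption as the script above: account is 2 lines below.
        other=$(sed -n "/^UPDATE\/$1,/{n;n;p;}" "$f" | awk -F/ '{print $2}')
        if [ "$other" = "$2" ]; then
            echo "same $1 $f $other"
        else
            echo "DIFFERENT $1 $f $other"
        fi
    done
}
```

It could be called as compare_in_dir "${UPDATE}" "${FILE1ACCT}" /root/pw/test, which prints one line per file that contains the update number.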

Patrick Wallek
Honored Contributor

Re: sh script - find string in two different files and compare

I have just added a check so that if an UPDATE # from file1 is NOT found in any of the files in the /root/pw/test directory, the script will continue on.  My previous version just hung in that case, because sed was then called with no file name and waited for input on stdin.

 

NEW FILE1

# cat file1
.BEGIN
UPDATE/4567890123,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR

.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR



NEW SCRIPT

# cat script
#!/usr/bin/sh

# Loop over every update number found in file1.
for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
# Name of the file in /root/pw/test that contains this update number, if any.
UPDATEFILE=$(grep -l "${UPDATE}" /root/pw/test/*)
# Skip update numbers that were not found in any file.
if [[ ${UPDATEFILE} != "" ]] ; then
   FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" ${UPDATEFILE} | awk -F \/ '{print $2}')
   # Compare as strings so an empty or non-numeric value cannot break the test.
   if [ "${FILE1ACCT}" = "${FILE2ACCT}" ] ; then
      echo "The Account numbers are the same in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
      echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
      echo ""
   else
      echo "The Account numbers are DIFFERENT in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
      echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
      echo ""
   fi
fi
done


# ./script
The Account numbers are the same in FILE1 and /root/pw/test/file4 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; /root/pw/test/file4 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and /root/pw/test/file3 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; /root/pw/test/file3 ACCT# = 92728912
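
Note that update number 4567890123 from the new file1 produces no output at all, because the check skips it silently. If a list of the skipped numbers would be useful, something along these lines might work. It is only a sketch: the function name report_missing is hypothetical, and the pattern assumes the UPDATE/<number>, layout shown in the samples.

```shell
#!/usr/bin/sh
# report_missing MASTER DIR   (hypothetical helper name)
# Print every update number from MASTER that appears in no file under DIR.
report_missing() {
    # Pull the unique update numbers out of the master file.
    awk -F'[/,]' '$1 == "UPDATE" { print $2 }' "$1" | sort -u |
    while read -r num
    do
        # grep -l prints nothing when no file matches, so test for
        # empty output rather than relying on a file name being set.
        if [ -z "$(grep -l "^UPDATE/$num," "$2"/* 2>/dev/null)" ]; then
            echo "update number $num not found in $2"
        fi
    done
}
```

For the files above it could be run as report_missing file1 /root/pw/test.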

 

Dennis Handly
Acclaimed Contributor

Re: sh script - find string in two different files and compare

Here is something that is a little easier to understand and performs well, since it uses a hash (an awk associative array) and reads each file only once:

 

awk -v master=file1 '
# finds the number after "/" and before any ","
function crack_number(field) {
   i = split(field, fields, "[/,]")
#   print "found", i, "fields:", fields[2]
   return fields[2] ""  # make sure it is a string
}
BEGIN {
# create a map from update # to account #
while ((getline < master) > 0) {
   if ($1 ~ "UPDATE") {
      update = crack_number($1)
      continue
   }
   if ($1 ~ "ACCOUNT") {
      account = crack_number($1)
#      print update "|" account
      map[update] = account
      continue
   }
}
close(master)
}
/UPDATE/ {
   update = crack_number($1)
   next
}
/ACCOUNT/ {
   account = crack_number($1)
   if (update == "") {
      print "No update # for account", account
      next
   }
   account_m = map[update]
   if (account_m == "") {
#      print "update number", update, "in", FILENAME, "skipped"
      update = ""
      next
   }
   if (account == account_m) {
      print "The Account numbers are the same in FILE1 and", FILENAME, "for update number", update
      print "Update # =", update "; FILE1 ACCT# =", account_m, "; FILE2 ACCT# =", account
   } else {
      print "The Account numbers are DIFFERENT in FILE1 and", FILENAME, "for update number", update
      print "Update # =", update "; FILE1 ACCT# =", account_m, "; FILE2 ACCT# =", account
   }
   print ""
   update = ""
}' file3 file4