01-10-2014 09:03 AM
sh script - find string in two different files and compare
I think I am going to have a hard time explaining this one, so sorry in advance...
I have a file that contains multiple entries:
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR
This goes on and on...
I have another file that is almost identical; it may or may not contain the same (UPDATE/7digits) entries.
What I want to do is take the UPDATE/<7digits>,
look it up in the 2nd file, and if it exists, compare the ACCOUNT to see if it is different.
I can get the TN part:
grep UPDATE * | awk -F/ '{print $2}' | sed 's/,,,//g' | sort -u > file
Then do a:
for tn in `cat file`
do
grep $tn second.file
...
But, I have no idea how to capture the ACCOUNT information from one file, and compare to second...
Appreciate the help.
01-10-2014 01:56 PM
Re: sh script - find string in two different files and compare
What do you think of this:
# cat file1
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

# cat file2
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

# cat script
#!/usr/bin/sh
for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
  FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
  FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file2 | awk -F \/ '{print $2}')
  if (( ${FILE1ACCT} == ${FILE2ACCT} )) ; then
    echo "The Account numbers are the same in FILE1 and FILE2 for update number ${UPDATE}"
    echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; FILE2 ACCT# = ${FILE2ACCT}"
    echo ""
  else
    echo "The Account numbers are DIFFERENT in FILE1 and FILE2 for update number ${UPDATE}"
    echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; FILE2 ACCT# = ${FILE2ACCT}"
    echo ""
  fi
done
And here's what it looks like when the script is run:
# ./script
The Account numbers are the same in FILE1 and FILE2 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; FILE2 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and FILE2 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; FILE2 ACCT# = 92728912
The key is the 'sed -n' statement above.
It searches the file for the UPDATE # obtained from file1, then prints the 2nd line below the matching line, which is where the account number sits. The same lookup is run against both file1 and file2. This relies on two assumptions: a given update number occurs only once in each file, and the ACCOUNT line is always exactly 2 lines below the UPDATE line.
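To see the sed expression on its own, here is a minimal sketch. The scratch file and the function name acct_for are made up for illustration. Unlike the script above, the pattern is tightened to "UPDATE/<number>," so that the same digits appearing on another line do not match; that tightening is my assumption about the data, not something the original script does.

```shell
#!/bin/sh
# Scratch file holding one record in the layout from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
EOF

# acct_for UPDATE_NUMBER FILE   (hypothetical helper)
# -n suppresses automatic printing. On the line matching UPDATE/<number>,
# "n;n" steps forward two lines and "p" prints the line reached.
# awk then keeps only the text after the slash.
acct_for() {
    sed -n "/UPDATE\/$1,/{n;n;p;}" "$2" | awk -F/ '{print $2}'
}

acct_for 5552166619 "$sample"   # prints 52727963
```

If the ACCOUNT line ever moves relative to the UPDATE line, this prints whatever line happens to be two below, so the fixed offset is worth checking against real data first.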
01-10-2014 02:24 PM
Re: sh script - find string in two different files and compare
01-10-2014 02:30 PM
Re: sh script - find string in two different files and compare
Is the FILE1 file in the same directory as the other files you need to check?
01-10-2014 02:45 PM
Re: sh script - find string in two different files and compare
OK, file1 is the same as above and is in the /root/pw directory.
I have created 2 other files called file3 and file4 in the /root/pw/test directory.
Here are the files, the script and the results:
# pwd
/root/pw

# cat test/file3
.BEGIN
UPDATE/1234567890,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

# cat test/file4
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
.BEGIN
UPDATE/2345678901,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

# cat script
#!/usr/bin/sh
for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
  FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
  UPDATEFILE=$(grep -l ${UPDATE} /root/pw/test/*)
  FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" ${UPDATEFILE} | awk -F \/ '{print $2}')
  if (( ${FILE1ACCT} == ${FILE2ACCT} )) ; then
    echo "The Account numbers are the same in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
    echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
    echo ""
  else
    echo "The Account numbers are DIFFERENT in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
    echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
    echo ""
  fi
done

# ./script
The Account numbers are the same in FILE1 and /root/pw/test/file4 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; /root/pw/test/file4 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and /root/pw/test/file3 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; /root/pw/test/file3 ACCT# = 92728912
The 'grep -l' in the script searches through the files in /root/pw/test and returns the filename of the file with the same UPDATE number. The sed statement for FILE2ACCT then looks for the ACCT# in the file returned by the 'grep -l' command.
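As a small illustration of the 'grep -l' step, the sketch below runs it against two scratch files. The directory, the file contents (shortened records) and the function name files_with_update are made up for the example. It also anchors the pattern to "^UPDATE/<number>," which assumes the UPDATE line always starts in column one; the script above searches for the bare number instead.

```shell
#!/bin/sh
# Scratch directory standing in for /root/pw/test.
dir=$(mktemp -d)
printf '%s\n' '.BEGIN' 'UPDATE/5552194161,,,' '.DELETE_ALL RELATED' \
    'ACCOUNT/92728912' '.EOR' > "$dir/file3"
printf '%s\n' '.BEGIN' 'UPDATE/5552166619,,,' '.DELETE_ALL RELATED' \
    'ACCOUNT/52727963' '.EOR' > "$dir/file4"

# files_with_update UPDATE_NUMBER DIR   (hypothetical helper)
# grep -l prints only the names of files that contain a match,
# one name per line, and prints nothing when no file matches.
files_with_update() {
    grep -l "^UPDATE/$1," "$2"/*
}

files_with_update 5552194161 "$dir"   # prints the path of file3
```

Note that if the same update number appears in more than one file, 'grep -l' returns several names, and a script that passes the result straight to sed would then read all of those files in sequence.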
01-10-2014 02:49 PM
Re: sh script - find string in two different files and compare
I have just added a check so that if an UPDATE # from file1 is NOT found in any file in the /root/pw/test directory, the script skips it and continues. My previous versions just hung.
NEW FILE1

# cat file1
.BEGIN
UPDATE/4567890123,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
.BEGIN
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
.INSERT_RELATED/
RELATED/myemail@email.net
.END_INSERT
.EOR
.BEGIN
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
.INSERT_RELATED/
RELATED/diffemail@myemail.net
.END_INSERT
.EOR

NEW SCRIPT

# cat script
#!/usr/bin/sh
for UPDATE in $(grep UPDATE file1 | awk -F \/ '{print $2}' | sed 's/,,,//g')
do
  FILE1ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" file1 | awk -F \/ '{print $2}')
  UPDATEFILE=$(grep -l ${UPDATE} /root/pw/test/*)
  if [[ ${UPDATEFILE} != "" ]] ; then
    FILE2ACCT=$(sed -n "/${UPDATE}/{n;n;p;}" ${UPDATEFILE} | awk -F \/ '{print $2}')
    if (( ${FILE1ACCT} == ${FILE2ACCT} )) ; then
      echo "The Account numbers are the same in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
      echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
      echo ""
    else
      echo "The Account numbers are DIFFERENT in FILE1 and ${UPDATEFILE} for update number ${UPDATE}"
      echo "Update # = ${UPDATE} ; FILE1 ACCT# = ${FILE1ACCT} ; ${UPDATEFILE} ACCT# = ${FILE2ACCT}"
      echo ""
    fi
  fi
done

# ./script
The Account numbers are the same in FILE1 and /root/pw/test/file4 for update number 5552166619
Update # = 5552166619 ; FILE1 ACCT# = 52727963 ; /root/pw/test/file4 ACCT# = 52727963

The Account numbers are DIFFERENT in FILE1 and /root/pw/test/file3 for update number 5552194161
Update # = 5552194161 ; FILE1 ACCT# = 52728912 ; /root/pw/test/file3 ACCT# = 92728912
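The hang in the earlier versions most likely came from sed being called with an empty file name: with no file operand, sed reads standard input and waits. The sketch below shows the same guard in isolation, using only POSIX test syntax. The scratch directory, the shortened record and the helper name lookup_acct are assumptions made for the example.

```shell
#!/bin/sh
# Scratch directory standing in for /root/pw/test.
dir=$(mktemp -d)
printf '%s\n' '.BEGIN' 'UPDATE/5552166619,,,' '.DELETE_ALL RELATED' \
    'ACCOUNT/52727963' '.EOR' > "$dir/file4"

# lookup_acct UPDATE_NUMBER FILE   (hypothetical helper)
# Returns 1 without running sed when FILE is empty, which is what
# $(grep -l ...) yields when no file contains the update number.
lookup_acct() {
    [ -n "$2" ] || return 1
    sed -n "/UPDATE\/$1,/{n;n;p;}" "$2" | awk -F/ '{print $2}'
}

for update in 5552166619 4567890123; do
    # grep exits non-zero when nothing matches; fall back to an empty name.
    match=$(grep -l "UPDATE/$update," "$dir"/*) || match=""
    if acct=$(lookup_acct "$update" "$match"); then
        echo "$update: account $acct in $match"
    else
        echo "$update: not found, skipping"
    fi
done
```

Quoting "$match" matters here: an unquoted empty variable disappears from the command line entirely, which is how sed ended up with no file to read.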
01-10-2014 05:58 PM - edited 01-10-2014 06:50 PM
Re: sh script - find string in two different files and compare
Here is something a little easier to understand. It also performs well, because it builds a hash (an awk associative array) from file1 and then reads each of the other files only once:
awk -v master=file1 '
# finds the number after "/" and before any ","
function crack_number(field) {
   i = split(field, fields, "[/,]")
   # print "found", i, "fields:", fields[2]
   return fields[2] ""  # make sure it is a string
}
BEGIN {
   # create a map from update # to account #
   # (getline is parenthesized so the comparison with 0 is unambiguous)
   while ((getline < master) > 0) {
      if ($1 ~ "UPDATE") {
         update = crack_number($1)
         continue
      }
      if ($1 ~ "ACCOUNT") {
         account = crack_number($1)
         # print update "|" account
         map[update] = account
         continue
      }
   }
   close(master)
}
/UPDATE/ {
   update = crack_number($1)
   next
}
/ACCOUNT/ {
   account = crack_number($1)
   if (update == "") {
      print "No update # for account", account
      next
   }
   account_m = map[update]
   if (account_m == "") {
      # print "update number", update, "in", FILENAME, "skipped"
      update = ""
      next
   }
   if (account == account_m) {
      print "The Account numbers are the same in FILE1 and", FILENAME, "for update number", update
      print "Update # =", update "; FILE1 ACCT# =", account_m, "; FILE2 ACCT# =", account
   } else {
      print "The Account numbers are DIFFERENT in FILE1 and", FILENAME, "for update number", update
      print "Update # =", update "; FILE1 ACCT# =", account_m, "; FILE2 ACCT# =", account
   }
   print ""
   update = ""
}' file3 file4
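For comparison, the same map-then-compare idea can be sketched more compactly with the common NR == FNR idiom instead of a getline loop in BEGIN. This is a different technique from the script above, not a rewrite of it, and it prints only the mismatches. The scratch files, the shortened records and the function name diff_accounts are made up for the example.

```shell
#!/bin/sh
dir=$(mktemp -d)
cat > "$dir/master" <<'EOF'
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/52728912
EOF
cat > "$dir/other" <<'EOF'
UPDATE/5552166619,,,
.DELETE_ALL RELATED
ACCOUNT/52727963
UPDATE/5552194161,,,
.DELETE_ALL RELATED
ACCOUNT/92728912
EOF

# diff_accounts MASTER OTHER   (hypothetical helper)
# Prints "update master_account other_account" for each mismatch.
diff_accounts() {
    awk -F'[/,]' '
        # While reading the first file, NR equals FNR: fill the map.
        NR == FNR {
            if ($1 == "UPDATE")  u = $2
            if ($1 == "ACCOUNT") acct[u] = $2
            next
        }
        # Second file: remember the current update number.
        $1 == "UPDATE" { u = $2 }
        # Appending "" forces a string comparison, so leading zeros count.
        $1 == "ACCOUNT" && (u in acct) && (acct[u] "") != ($2 "") {
            print u, acct[u], $2
        }
    ' "$1" "$2"
}

diff_accounts "$dir/master" "$dir/other"   # prints 5552194161 52728912 92728912
```

The NR == FNR test only distinguishes the first file when that file is non-empty, and this form compares against exactly one other file, so the getline version above remains the better fit when several files follow the master on the command line.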