- Finding duplicate filenames
10-15-2001 02:53 AM
Many thanks in advance.
10-15-2001 03:23 AM
10-15-2001 03:24 AM
Re: Finding duplicate filenames
Give this a try
#!/usr/bin/ksh
if [ $# -ne 1 ]
then
    echo "usage: $0 <full path to the directory/filesystem>"
    exit 1
fi
DIR=$1
if [ ! -d "$DIR" ]
then
    echo "$DIR: no such directory"
    exit 1
fi
# Start with a fresh output file
rm -f duplicates
# Full paths of all regular files under $DIR
find "$DIR" -type f > list$$
# File names (last path component) that occur more than once;
# -F/ sets the separator before the first line is read
awk -F/ '{print $NF}' list$$ | sort | uniq -d > files$$
while IFS= read -r FILE
do
    # Keep only the paths whose last component is exactly $FILE
    NAME="$FILE" awk -F/ '($NF "") == ENVIRON["NAME"]' list$$ >> duplicates
done < files$$
rm -f files$$ list$$
echo "Check the file \"duplicates\" in the current dir"
-Sri
10-15-2001 03:26 AM
Re: Finding duplicate filenames
The find command looks like this :
find
With 'sed', you can convert this file to a list with the filename (without its path) and the path itself on the same line. Then sort this file.
10-15-2001 03:29 AM
Re: Finding duplicate filenames
You can cut and paste my script. It gives the full paths to the duplicate files in a file called "duplicates" in the current directory.
-Sri
10-15-2001 03:32 AM
Re: Finding duplicate filenames
This will "grab" the file name from the path and put the file name first, followed by the directory. That way you get what you want: duplicate file names in different directories, listed next to each other.
find /var -type f|sed "s/\(^\/.*\/\)\(.*$\)/\2 \1/"|sort >filenames.out
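The listing above still contains every file, so repeated names have to be picked out by eye. A minimal sketch of narrowing it down, assuming no file name contains a tab or a newline. It uses awk rather than sed to build the name-first lines, and the function names name_first and only_repeats are made up for illustration:

```shell
#!/bin/sh
# name_first DIR: print "name<TAB>full path" for every regular file under
# DIR, sorted so that identical names end up on adjacent lines.
name_first() {
    find "$1" -type f | awk -F/ '{ print $NF "\t" $0 }' | LC_ALL=C sort
}

# only_repeats: read name_first output on stdin and print only the lines
# whose name occurs more than once.
only_repeats() {
    awk -F'\t' '
        ($1 "") == prev {            # same name as the previous line
            if (held != "") print held
            held = ""
            print
            next
        }
        { prev = $1 ""; held = $0 }  # first line of a new group
    '
}

# Example: name_first /var | only_repeats > filenames.out
```

Sorting in the C locale keeps lines with the same name together, which the adjacent-line comparison in only_repeats depends on.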
10-15-2001 03:51 AM
Re: Finding duplicate filenames
And the winner is....Vrijhoeven
His answer handles spaces in file and directory names (very important!) but needs some massaging to be complete:
1. cd
2. for i in $(find . -type f | awk -F / '{print $NF}'|sort|uniq -d)
do
grep "/${i}" /tmp/tempfile
done
Tested it, works great.
10-15-2001 03:59 AM
Re: Finding duplicate filenames
$ cat ../doit
#! /usr/bin/sh
# Print "directory name" for every file, sorted by the name column
find . -type f |
{
    while read filename
    do
        echo "$(dirname "$filename") $(basename "$filename")"
    done
} | sort -k 2,2
$ ls -R
dir1 dir2 dir3
./dir1:
a c d
./dir2:
a b e
./dir3:
b c f
$ ../doit
./dir1 a
./dir2 a
./dir2 b
./dir3 b
./dir1 c
./dir3 c
./dir1 d
./dir2 e
./dir3 f
:;
sort -k 2,2
10-15-2001 04:02 AM
Re: Finding duplicate filenames
Winner's script doesn't work in this scenario.
My directory structure is
.
./test1
./test1/test
./test1/sri
./test2
./test2/test
./test2/new
./test3
./test3/test
./test3/sri
./file
-Sri
10-15-2001 04:13 AM
Re: Finding duplicate filenames
Please ignore the stuff after "./dir3 f", i.e. the silly ":;" and the standalone "sort -k 2,2 ".
10-15-2001 04:59 AM
Re: Finding duplicate filenames
Oops, change the last grep to:
grep "${i}$" /tmp/tempfile
This will find the filenames at the end of the line. Works with the above test1,2,3 directory structure now.
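Anchoring the pattern at the end of the line stops a name such as test from matching the directories test1, test2 and test3, but the name is still treated as a regular expression and as a bare suffix, so a file called latest would also match test. A hedged sketch of an exact comparison on the last path component, assuming /tmp/tempfile holds the output of "find . -type f" as the loop above implies; paths_named is a hypothetical helper, not something from the thread:

```shell
#!/bin/sh
# paths_named NAME LIST: print the lines of LIST (one path per line) whose
# last component is exactly NAME. NAME is compared as a plain string, not
# as a regular expression.
paths_named() {
    # NAME is passed through the environment so awk does not interpret
    # backslashes in it; ($NF "") forces a string comparison.
    NAME=$1 awk -F/ '($NF "") == ENVIRON["NAME"]' "$2"
}

# Example, with /tmp/tempfile assumed to hold the output of "find . -type f":
#   paths_named test /tmp/tempfile
```

Inside the for loop this would take the place of the grep line, called as paths_named "$i" /tmp/tempfile.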
10-15-2001 05:22 AM
Re: Finding duplicate filenames
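This method avoids having to run find twice:
===========================================
TAB=$(printf '\t')   # a literal tab, kept in a variable so it survives copy and paste
find . -type f | while read -r file ; do
    printf '%s\t%s\n' "$(dirname "$file")" "$(basename "$file")"
done |
sort -t"$TAB" -k 2 - > /tmp/listfiles
uniq -f 1 -u /tmp/listfiles |
join -t"$TAB" -v 2 -1 2 -2 2 -o 2.1,2.2 - /tmp/listfiles |
sed "s+$TAB+/+"
=============================================
Both the join and sort commands have the tab character between the double quotes, and sed turns that tab back into a "/". Note that uniq -f splits fields on any blank, so this relies on directory names containing no spaces.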
Rgds, Robin.
10-15-2001 05:27 AM
Re: Finding duplicate filenames
Are you sure your script works with spaces in a directory and filename? Usually dirname and basename don't work with spaces in names.
10-15-2001 05:36 AM
Re: Finding duplicate filenames
Seems OK, as long as it's quoted.
# basename /tmp/a b c
# NOTHING !!
# dirname "/tmp/a b c"
/tmp
# basename "/tmp/a b c"
a b c
I tried a few variations, and it seemed to behave itself.
Cheers, Robin.
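Where the two extra processes per file matter, the same split can be done by the shell itself. A minimal sketch using POSIX parameter expansion, which should cope with spaces in the same way as long as the variable is quoted. split_path is an illustrative name, and the sketch assumes the path has a directory part and does not end in a slash, as is the case for "find . -type f" output:

```shell
#!/bin/sh
# split_path PATH: print the directory part and the file name part of PATH
# on separate lines, without calling dirname or basename.
split_path() {
    printf '%s\n' "${1%/*}"    # strip the shortest trailing "/..." -> directory
    printf '%s\n' "${1##*/}"   # strip the longest leading ".../"  -> file name
}

# Example: split_path "/tmp/a b c"
```

For paths without a slash, or for a file directly under /, the result differs from what dirname prints, so the external commands remain the safer general choice.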
10-15-2001 05:56 AM
Re: Finding duplicate filenames
10-15-2001 06:04 AM
Re: Finding duplicate filenames
Thanks for checking that, Robin. OK, now we have 2 answers which do the business.
Preet - don't forget to assign points.
10-17-2001 03:59 AM
Re: Finding duplicate filenames
credits:
http://www.geocities.com/fcheck2000/download.html