Operating System - HP-UX
how to take out duplicate ones and keep the sequences in the file
05-22-2007 08:55 AM
Solution
For uniq(1) to work, the repeated lines need to be adjacent, which usually means sorting the input first, and sorting destroys the original order of the lines. See the man page of uniq(1) for details. The awk construct below might work, so give it a try:
# awk '{x[$1]++;if(x[$1]==1) print $1}' inputfile
~cheers
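As a quick sanity check, here is the one-liner run on a small made-up inputfile (the file name and contents are just for illustration); the first occurrence of each word is kept and the original order is preserved:

# cat inputfile
apple
banana
apple
cherry
banana
# awk '{x[$1]++;if(x[$1]==1) print $1}' inputfile
apple
banana
cherry

Note that the one-liner keys on the first field ($1); if your lines can contain more than one field and you want to deduplicate whole lines, use $0 in place of $1.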
05-23-2007 01:17 AM
Re: how to take out duplicate ones and keep the sequences in the file
Though I personally prefer a one-line Perl for this sort of thing, I was intrigued to discover how easily it can be done in the shell.
cat test.words |
grep -n '.*' |
sort -u -t: -k2 |
sort -t: -1n |
cut -d: -f2- > test.words.sansdupes
1. Prefix a line number and ":" to each line.
2. Sort by the remainder of the line and remove duplicates.
3. Sort by line number.
4. Remove the line number.
Interesting,
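For the record, the one-line Perl alluded to above would presumably be something along these lines (file names taken from the shell version):

perl -ne 'print unless $seen{$_}++' test.words > test.words.sansdupes

It keeps a hash of the lines already seen and prints each line only the first time it appears, so the original order is preserved in a single pass.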
05-23-2007 07:09 PM
Re: how to take out duplicate ones and keep the sequences in the file
>drb: 1. Prefix a line number and : to each line
Yes, that's how I would do it. Except you can refine your steps:
$ nl -ba -s: -nrz test.words | sort -t: -u -k2,2 | sort -t: -n -k1,1 |
cut -d: -f2- > test.words.sansdupes
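To make the stages visible, here is the refined pipeline run step by step on a small made-up test.words (contents invented for illustration; this also assumes your sort -u keeps the first of the lines with equal keys, as GNU sort does):

$ cat test.words
apple
pear
apple
$ nl -ba -s: -nrz test.words
000001:apple
000002:pear
000003:apple
$ nl -ba -s: -nrz test.words | sort -t: -u -k2,2
000001:apple
000002:pear
$ nl -ba -s: -nrz test.words | sort -t: -u -k2,2 | sort -t: -n -k1,1 | cut -d: -f2-
apple
pear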
I'm not sure why you had sort -1n; it worked, but you would be hard pressed to prove it is legal from sort(1).
The problem with Ivan's and Clay's solutions is that they will be really slow if there are lots of lines, because they search each line against all the others.
>Clay: # Copy stdin to a temp file
This can be done with cat - > file
>echo "\c" > ${DUPS} # null file
This can be done with just: > ${DUPS}
> grep -q "${X}" ${UNIQUES}
The only advantage over Ivan's is that the uniques file is smaller.
Sandman's solution trades memory for speed, so it would be good for small files.
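For context, the grep-per-line approach being critiqued above, reconstructed from the quoted fragments (variable names and loop structure are guesses, not Clay's actual code), would look roughly like this:

cat - > ${TMP}                    # copy stdin to a temp file
> ${UNIQUES}                      # null the uniques file
> ${DUPS}                         # null the dups file
while read X
do
    if grep -q "${X}" ${UNIQUES}
    then
        echo "${X}" >> ${DUPS}    # seen before
    else
        echo "${X}" >> ${UNIQUES} # first occurrence
    fi
done < ${TMP}

Each input line is grepped against the whole uniques file, which is why the running time grows roughly quadratically with the number of lines.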