Re: scripting question

Dan Copeland · ‎10-03-2002

How can you sort on a column of a text file and remove the duplicates from that column?

I assume I can use the sort command, but I haven't had much success w/ it.

Attached is a sample of the file. I want to sort on the Serial # column and remove duplicates.

tia,
Frank

Scott Van Kalken · ‎10-03-2002

You can use sort -k 5

but I think it may trip over this file, because some rows have nothing in the column where others have BCV.

So essentially, some rows have fewer fields.

For removing the duplicates, you can use uniq
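
One caveat: uniq only drops a line when it is identical to the line right before it, so by itself it will not remove rows that share a serial number but differ elsewhere. A rough sketch of one way around that, keying on fixed character positions instead of whitespace fields, is below. The helper name dedupe_by_cols and its arguments are made up for illustration, and it assumes the key itself contains no tab characters.

```shell
#!/bin/sh
# Sketch: keep the first line seen for each distinct key, where the key is a
# fixed-width character range, then print the survivors sorted by that key.
# dedupe_by_cols is a hypothetical helper name, not a standard command.
dedupe_by_cols() {
    start=$1   # 1-based column where the key starts
    len=$2     # key width in characters
    shift 2
    awk -v s="$start" -v n="$len" '
        {
            key = substr($0, s, n)
            if (!(key in first)) {      # only the first line per key is kept
                first[key] = $0
                print key "\t" $0       # prefix the key so sort can use it
            }
        }' "$@" |
    LC_ALL=C sort -t "$(printf '\t')" -k 1,1 |
    cut -f 2-                           # strip the key prefix again
}
```

Usage would be something like dedupe_by_cols 57 8 yourfile, with the start column and width adjusted to wherever the serial number really sits in your output.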

Scott.

Uncle Liew · ‎10-03-2002

Hi Frank,

I think the best option is to use Microsoft Excel.

FTP the file to your PC.
On your PC,
Remove the Headers:

Device Product Device
----------------------- --------- --------------------- ------------------
Name Type Vendor ID Rev Ser Num Cap (KB)
----------------------- --------- --------------------- ------------------

Later you can add them back in.

Open MS Excel.

Under File --> Open, choose your .txt file.

After you have successfully imported the data, cut the Serial# column and paste it into the first Excel column.

Highlight all the columns.

Under Data --> Sort.

That's it .....

Hope this helps.

Patrick Chim · ‎10-03-2002

Hi,

I think it is a little difficult to use sort here because the rows do not all have the same number of columns. As I see in your file, there are blank values in the TYPE field, so when you use sort some of the fields will be shifted.

I'll try my best to find another method, or maybe other experts here can do it with sort! :)
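
To show what the shifting looks like, here is a small sketch with two made-up rows (not taken from the attached file) in an invented fixed-width layout: name in columns 1-6, type in columns 7-10, serial in columns 11-13. The function names are only for illustration.

```shell
#!/bin/sh
# Two invented rows; the second one has a blank type column.
sample() {
    printf '%-6s%-4s%s\n' disk1 BCV 222
    printf '%-6s%-4s%s\n' disk2 ''  111
}

# Whitespace-delimited view: on the row with a blank type the serial has
# shifted into field 2, so field 3 comes back empty.
by_field() { sample | awk '{ print "[" $3 "]" }'; }

# Fixed-position view: columns 11-13 hold the serial on every row.
by_column() { sample | cut -c11-13; }
```

Running by_field gives an empty third field for the second row, while by_column returns the serial for both rows, which is why the column-position approaches are less fragile for this kind of output.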

Regards,
Patrick

H.Merijn Brand (procura · ‎10-03-2002

# perl -ne '1..5 and print,next;$snr=substr($_,56,9);$x{$snr}||=$_;END{print$x{$_}for(sort keys%x)}' xx.dta

Enjoy, Have FUN! H.Merijn

Patrick Chim · ‎10-03-2002

Hi,

Can you try the following script:

# infile and outfile are placeholder names; put your own file names here
for i in `cut -c57-66 infile | sort -u`
do
grep "$i" infile | head -1
done > outfile

I suggest you cut off all the header and trailer lines before you run this script.

Regards,
Patrick

Supporto Unix · ‎10-04-2002

Hi,
If you want only that column, sorted and without duplicates, try this:

more text_file | grep "^/"|cut -c57-64|sort|uniq

bye

Bjoern Myrland · ‎10-04-2002

Robin Wakefield · ‎10-04-2002

Frank,

In case procura's script gives syntax errors, try:

perl -ne '1..5 and print,next;$snr=substr($_,56,9);$x{$snr}||=$_;END{for(sort keys%x){print$x{$_}}}'

Rgds, Robin.

H.Merijn Brand (procura · ‎10-04-2002

Robin, what version do you use? I think that 'for' as a statement modifier works in 5.6.1 as well.

OTOH it might indeed be good to remember that not all of you run perl-5.8.0, and certainly not like me with the defined-or patches in :)

Enjoy, Have FUN! H.Merijn

Robin Wakefield · ‎10-04-2002

Hi procura, I tried it on an early 5.004 version (yeah I know), so I'm sure it's OK in later releases. I was just trying to show what needs to be done if it does fail.

Rgds, Robin

Pierce Byrne_1 · ‎10-04-2002

Try this, you may want to mess about with formatting but it should work
The results go to file "sortedfile"
"sorter" is the source file

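echo "HEADER INFO" > sortedfile
# one pass per distinct serial number found in columns 57-64 of the device lines
for snum in `grep rdsk sorter | cut -c57-64 | sort -u`
do
    # first line that contains this serial number anywhere
    linedets=`grep "${snum}" sorter | head -n1`
    lineSnum=`echo "${linedets}" | cut -c57-64`
    # keep it only if the match really is in the serial number column
    if [ "$lineSnum" = "$snum" ]
    then
        # quoted so the original column spacing is preserved
        echo "$linedets" >> sortedfile
    fi
done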

Robin Wakefield · ‎10-04-2002

Hi Frank,

This is an awk version, inc. a sort routine:

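awk 'NR<8 { print }
NR>7 { a[substr($0,56,9)]=$0 }
END {
    # copy the keys (serial numbers) into an indexed array
    i=0
    for (b in a) {
        array[i]=b
        i++
    }
    # insertion sort of the keys
    for (j=1; j<=i-1; ++j)
        for (k=j; array[k-1]>array[k]; --k) {
            temp=array[k]
            array[k]=array[k-1]
            array[k-1]=temp
        }
    # print one line per key, in sorted order
    for (n=0; n<i; n++) {
        print a[""array[n]]
    }
}' filename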

Rgds, Robin

Sean OB_1 · ‎10-04-2002

Frank,

You can use the -u option of sort. Or the uniq command.
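
For what it's worth, sort -u can be pointed at character positions rather than whitespace fields, which sidesteps the blank-column problem mentioned earlier in the thread. A minimal sketch, assuming the data never contains a '|' character (so the whole line counts as field 1); unique_by_key is a made-up helper name:

```shell
#!/bin/sh
# unique_by_key START END [FILE...]
# Sort on characters START..END of each line and keep one line per key.
# Hypothetical helper; assumes '|' never appears in the input.
unique_by_key() {
    s=$1   # 1-based first character of the key
    e=$2   # 1-based last character of the key
    shift 2
    # -t '|'  : separator absent from the data, so each line is a single field
    # -k 1.s,1.e : compare only characters s..e of that field
    # -u      : suppress all but one line for each distinct key
    LC_ALL=C sort -t '|' -u -k "1.$s,1.$e" "$@"
}
```

Note that which of the duplicate lines survives is up to the sort implementation, so if you specifically need the first occurrence, one of the awk or perl versions in this thread is the safer choice.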

Sean

john korterman · ‎10-04-2002

Hi Frank,
I take it that you only want to write out each serial number once, namely for the first of the lines sharing that number. That is at least what the attached script does. However, the headings get messed up.
regards,
John K.

it would be nice if you always got a second chance

Jordan Bean · ‎10-04-2002

Correct me if I'm mistaken, but it looks like you're not using PowerPath. If you were, the last three digits of the serial numbers would be unique to each path. It would serve you best to sort on the first five digits.

Use this Perl script:

#!/usr/bin/perl
use strict;
use integer;
our $h={};
our $k;
while(<>){
next if /^\s*$/;
print,next unless m[^/];
@_ = unpack('A23 A10 A22 A18',$_);
@_ = map { (split(/\s+/,$_),undef)[0,1] } @_;
$k = ( $_[2] eq 'EMC' )?substr($_[6],0,5):$_[6];
$h->{$k} = $_ unless defined $h->{$k};
}
foreach $k (sort keys %$h) { print $h->{$k}; }

It breaks up the input line from syminq more than you really need, just in case you want to manipulate more than just the serial number. It's probably not the most efficient, but it seems to work.
