Operating System - OpenVMS

VMS solution for UNIX ( UNIQ -c )

 
Prem Mohan
Advisor


Hi All,

After the phenomenal responses I got, I need some more help with the UNIX uniq command.

I use a UNIX pipeline like this:

cat | cut -c 27-35 | sed 's/^0*//' | cut -c1-3 | grep -f t.txt | sort | uniq -c

Briefly, it cuts columns 27-35 out of each record (cut -c 27-35), strips the leading zeros (sed), keeps only the first three characters of what is left, keeps only the values that match an entry in t.txt (grep -f), sorts them, and counts how many times each distinct value occurs (uniq -c).
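
For readers who want the whole pipeline in one place, here is a minimal Perl sketch of the steps just described. The helper names extract_key and count_keys are made up for illustration, and the sketch assumes each line of t.txt holds one complete key, so the grep -f step becomes an exact lookup rather than a pattern match.

```perl
use strict;
use warnings;

# Hypothetical helper: mimics cut -c 27-35 | sed 's/^0*//' | cut -c1-3
sub extract_key {
    my ($line) = @_;
    return '' if length($line) < 27;       # record too short to hold the field
    my $field = substr($line, 26, 9);      # columns 27-35 (1-based)
    $field =~ s/^0*//;                     # strip leading zeros
    return substr($field, 0, 3);           # keep at most three characters
}

# Hypothetical helper: mimics grep -f t.txt | sort | uniq -c
# ASSUMPTION: the filter is an exact-match lookup, not a grep pattern match.
sub count_keys {
    my ($lines, $wanted) = @_;             # array ref of records, hash ref of keys
    my %count;
    for my $line (@$lines) {
        my $key = extract_key($line);
        $count{$key}++ if exists $wanted->{$key};
    }
    return map { [ $count{$_}, $_ ] } sort keys %count;
}

# Made-up records: 26 filler characters followed by a 9-digit field
my @lines  = map { ('x' x 26) . $_ } qw(123456789 000123456 444444444 000000040);
my %wanted = map { $_ => 1 } qw(123 444);
printf "%7d %s\n", @$_ for count_keys(\@lines, \%wanted);
```

With the made-up records above, this reports a count of 2 for key 123 and 1 for key 444. The key 40 is dropped because it is not in the filter.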

My problem is that I can do all of the above on VMS except uniq -c. I have got as far as bringing the duplicates together with the VMS SORT command, but I am not able to get the count.

The contents of the file are attached.

Please do not ask me to install GNV, as I have some third-party applications running on the VMS machine and the vendors have threatened that anything else installed on the machine will void the support.

Any help will be greatly appreciated.

Prem.
9 REPLIES
Jan van den Ende
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )

Prem,

Try adding /STATISTICS to the SORT command.
I have no system at hand right now, but if you try it yourself, you will see that you can distill the info you want from the statistics output.

Success!

Proost.

Have one on me.

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Jan van den Ende
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )

Sorry,

I forgot to mention that you might also need to add /NODUPLICATES to the sort command... :-(

Again,

Success.

Proost.

Have one on me.

Jan
Bojan Nemec
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )

Prem,

I think that you can't get this result with a single command. As Jan mentioned, you can use SORT to get statistics and to strip out the duplicates, but uniq -c gives you the number of occurrences of each distinct line. You can write a DCL procedure to simulate the uniq -c command:

$ ! UNIQ.COM: count runs of identical adjacent lines in file P1
$ open/read f 'p1'
$ cnt = 0
$ first = 1
$ old = ""
$l:
$ read/end_of_file=end f line
$ if first
$ then
$   first = 0
$   old = line
$   cnt = 1
$   goto l
$ endif
$ if line .eqs. old
$ then
$   cnt = cnt + 1
$ else
$   write sys$output f$fao("!7UL !AS", cnt, old)
$   old = line
$   cnt = 1
$ endif
$ goto l
$end:
$ close f
$ ! cnt is still 0 only when the file was empty
$ if cnt .gt. 0
$ then
$   write sys$output f$fao("!7UL !AS", cnt, old)
$ endif

If you name the procedure UNIQ.COM, run it with @UNIQ file_name. Like uniq, it only counts adjacent duplicates, so the input file must already be sorted.
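
For comparison, the same run-counting logic can be sketched in Perl. This is only an illustration of the idea behind the procedure above, not a replacement for it, and the helper name uniq_c is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse runs of identical adjacent lines, like uniq -c.
# Returns a list of [count, line] pairs in input order.
sub uniq_c {
    my @lines = @_;
    my @runs;
    for my $line (@lines) {
        if (@runs && $runs[-1][1] eq $line) {
            $runs[-1][0]++;              # same as the previous line: extend the run
        } else {
            push @runs, [ 1, $line ];    # a new value starts a new run
        }
    }
    return @runs;
}

# Sorted sample keys, as they would arrive from SORT
printf "%7d %s\n", @$_ for uniq_c(qw(111 123 123 444 444 444));
```

On the sorted sample this reports 1 for 111, 2 for 123 and 3 for 444. On unsorted input the same value can show up in several runs, which is exactly how uniq -c behaves.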

Bojan
Hein van den Heuvel
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )

>> Please do not ask me to install GNV, as I have some third-party applications running on the VMS machine and the vendors have threatened that anything else installed on the machine will void the support.

Wow. When I was young we had this rule: 'The customer is King'. You pay for the app, right? They are your slave! That seems too easy/lame a position for your vendor to take. It suggests to me that they do not sufficiently understand their application (under VMS).

Yes, I do appreciate the fact that 'installing stuff' can have unexpected, unintended side effects. But some stuff can be adequately tested / anticipated. Oh well, I am sure you do not like it either and have little influence on the outcome. Still... I'd challenge the support statement for specific tools/products, perhaps under specific usage conditions (like a different username, or with the guarantee that login.com does not change).

fwiw... here are some Perl solutions for your problem:

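while (<>) {
    $x = $1 if /^.{3}0{0,8}(\d{1,3})/;   # key: up to 3 digits after the leading zeroes
    if ($x eq $o) {
        $i++;                            # same key as the previous record
    } else {
        print "$i $o\n" if $i;           # key changed: report the finished run
        $o = $x;
        $i = 1;
    }
}
print "$i $o\n" if $i;                   # report the last run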


Sample input

$ type x.x
123123456789
123000123456
123444444444
123111111111
123444444444
123044444444
123040404040
123000000040

So we can see the selected data:

$ perl -ne "$x = $1 if /^.{3}0{0,8}(...)/; print $x.' '.$_" x.x
123 123123456789
123 123000123456
444 123444444444
111 123111111111
444 123444444444
444 123044444444
404 123040404040
040 123000000040

One liner, almost there (no final line):

$ perl -ne "$x = $1 if /^.{3}0{0,8}(...)/;if ($x eq $old){$i++} else {print ""$i $old\n"" if $i; $old=$x;$i=1}" x.x
2 123
1 444
1 111
2 444
1 404

'One-liner' solution with the final line, selecting at least one and no more than three decimal digits for the key (\d{1,3}) instead of (...), which takes three of anything:

$ perl -e "while (<>){$x = $1 if /^.{3}0{0,8}(\d{1,3})/;if ($x eq $o){$i++} else {print ""$i $o\n"" if $i; $o=$x;$i=1}}print ""$i $o\n""" x.x
2 123
1 444
1 111
2 444
1 404
1 40
$


Cheers,
Hein
Hein van den Heuvel
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )

You did not provide sample contents for the filter file t.txt, so I stopped reading there. However, now that I see the sort, a Perl solution will look even nicer:

$ perl -e "while (<>){$x{$1}++ if /^.{3}0{0,8}(\d{1,3})/} foreach $k (sort keys %x) { print ""$x{$k} $k\n""}" x.x

1 111
2 123
1 40
1 404
3 444

As a script:

while (<>) {
    $x{$1}++ if /^.{3}0{0,8}(\d{1,3})/;
}
foreach $k (sort keys %x) {
    print "$x{$k} $k\n";
}

For readers not familiar with Perl or regular expressions:

$x{$1}++ = create (if needed) and increment the element of the associative array %x with key $1

^ = beginning of line
.{3} = any 3 characters; adapt the count to your needs
0{0,8} = zero to eight zeroes
( = start remembering in $1
\d{1,3} = at least one and at most three decimal digits
) = stop remembering
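
The same breakdown can be written into the pattern itself with Perl's /x modifier, which allows whitespace and comments inside a regular expression. A small sketch, using records from the sample file above. The names $key_re and key_of are made up for illustration.

```perl
use strict;
use warnings;

# The pattern from the one-liners, spread out and commented using the x modifier
my $key_re = qr/
    ^ .{3}        # skip the first three characters; adapt the count to your layout
    0{0,8}        # swallow zero to eight leading zeroes
    (\d{1,3})     # remember one to three decimal digits as the first capture
/x;

# Hypothetical helper: return the key of a record, or undef if nothing matches
sub key_of {
    my ($record) = @_;
    return $record =~ $key_re ? $1 : undef;
}

for my $record (qw(123123456789 123000123456 123000000040)) {
    print key_of($record), " <- $record\n";
}
```

For the three sample records this gives the keys 123, 123 and 40, matching the one-liner output shown earlier.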

Hein
Hein van den Heuvel
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )


So I cannot read... You DID have the filter file attached.
And it confirms the suspicion that the keys are numeric.
You can actually solve this relatively nicely using the DCL version of associative arrays, that is, symbols whose names are built from the key:

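$ ! P1 = filter file with one key per line, P2 = data file
$open/read filter 'p1
$open/read data 'p2
$
$ ! Create one counter symbol KEY_xxx per filter line
$filter_loop:
$ read/end=data_loop/err=done filter record
$ key_'record = 0
$ goto filter_loop
$
$ ! Integer conversion drops the leading zeroes, then keep the first 3 digits
$data_loop:
$ read/end=report/err=done data record
$ key_integer = 'f$extract(3,9,record)
$ key = f$extract(0,3,"''key_integer'")
$!show symb key
$ ! Skip keys that have no counter symbol, meaning they are not in the filter
$ if f$type(key_'key).eqs."" then goto data_loop
$ key_'key = key_'key + 1
$ goto data_loop
$
$ ! Re-read the filter and print every counter that is greater than zero
$report:
$close filter ! rewind
$open/read filter 'p1
$report_loop:
$ read/end=done filter record
$ if key_'record.gt.0 then write sys$output key_'record, " ", record
$ goto report_loop
$done:
$close filter
$close data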

An alternative to re-reading the filter is a simple $ SHOW SYMBOL KEY_*.
Or you could generate key_001 to key_999 in a loop, again printing if the value is greater than 0.
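
To make the symbol trick above easier to follow, here is roughly the same logic as a Perl sketch: the hash is pre-seeded from the filter, the way the KEY_xxx symbols are, and the report follows filter order. The helper name count_by_filter is made up, and the fixed offset of 3 followed by a 9-digit field is taken from the sample data, so adapt it to the real record layout.

```perl
use strict;
use warnings;

# Hypothetical helper: count records per filter key, reporting in filter order.
sub count_by_filter {
    my ($filter_keys, $records) = @_;              # two array refs
    my %count = map { $_ => 0 } @$filter_keys;     # like KEY_xxx = 0 per filter line
    for my $record (@$records) {
        # ASSUMPTION: 3 leading characters, then a 9-digit field
        next unless $record =~ /^.{3}(\d{9})/;
        my $key = substr($1 + 0, 0, 3);            # numeric conversion drops leading zeroes
        $count{$key}++ if exists $count{$key};     # ignore keys not in the filter
    }
    return map { [ $count{$_}, $_ ] } grep { $count{$_} > 0 } @$filter_keys;
}

my @filter  = qw(444 123 999);
my @records = qw(123123456789 123000123456 123444444444 123111111111
                 123444444444 123044444444 123040404040 123000000040);
print "$_->[0] $_->[1]\n" for count_by_filter(\@filter, \@records);
```

With this made-up filter, the sample records give 3 for 444 and 2 for 123, and nothing for 999 because its count stays at zero.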


And here is the tweaked Perl solution, which takes the filter file as its first argument:

$ff = shift @ARGV;
open (FF, "<$ff") or die "Could not open filter file $ff.";
while (<FF>) {
    chomp;          # strip the line terminator
    $f{$_} = 1;     # mark this key as wanted
}
close FF;

while (<>) {
    $x{$1}++ if (/^.{3}0{0,8}(\d{1,3})/ && ($f{$1}))
}
foreach $k (sort keys %x) {
    print "$x{$k} $k\n";
}


Hein.
Craig A Berry
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )

Hein,

There is a slight problem in your Perl code, specifically the line that says:

$x{$1}++ if (/^.{3}0{0,8}(\d{1,3})/ && ($f{$1}))

What if the value of $f{$1} is a valid false value (for all practical purposes, zero) or exists but holds an undefined value? I think what you want is to check either for definedness or existence, not truth.

So $f{$1} becomes either exists($f{$1}) or defined($f{$1}). exists() tells you simply whether the hash contains an entry for the key you specify, and defined() tells you that it not only exists but contains a properly initialized value.
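
A small sketch of the difference between the three tests, using a made-up hash. The key names and the helper probe are only for illustration.

```perl
use strict;
use warnings;

# Made-up hash: a true value, a false-but-defined value, and an undefined value
my %f = (one => 1, zero => 0, undef_val => undef);

# Hypothetical helper: show how truth, defined() and exists() each see a key
sub probe {
    my ($h, $k) = @_;
    return join ' ',
        ( $h->{$k}         ? 'true'    : 'false'   ),
        ( defined $h->{$k} ? 'defined' : 'undef'   ),
        ( exists $h->{$k}  ? 'exists'  : 'missing' );
}

print "$_: ", probe(\%f, $_), "\n" for qw(one zero undef_val absent);
```

Only the key one passes the plain truth test. zero is false yet defined, undef_val exists without being defined, and absent fails all three.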
Hein van den Heuvel
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )

Craig,

I agree with you that this is a common Perl newbie pitfall. But it is explicitly not a problem in my suggested solution, as it controls the creation of the element: it either sets it to 1 (true) or does not create it at all, leaving it undefined (false).
But yeah, running into an integer 0 and concluding it was undefined, and vice versa, has tripped me up once (or twice :-).
It is the price of Perl trying to be helpful, trying to do the natural/right thing in most circumstances.

Prem... sorry for hijacking your thread(s).

Hein.
Craig A Berry
Honored Contributor

Re: VMS solution for UNIX ( UNIQ -c )

Doh! Obviously I didn't look closely enough at your first loop :-(.