Operating System - HP-UX

Re: Script help (Optimizing Disk-Usage/Allocation)

 
SOLVED
Chern Jian Leaw
Regular Advisor

Script help (Optimizing Disk-Usage/Allocation)

Hi,

I've attached a file in this message. The file contains 2 columns, denoting the name of the filesystem and its disk-usage (du).

I would like to calculate the disk-usage sizes of the filesystems. If the accumulated value is less than 18000000KB(18GB), place the names of those filesystems into separate files. The files represent the allocation of a disk to several filesystems not exceeding 18GB.

e.g:
/fs24/nwdv... 3092416
/fs26/nwdv... 13909787
/fs27/nwdv... 7899088
/fs34/nwdv... 6789123
/fs12/circuit... 189087

In the example above, /fs24 and /fs26 would be placed in file_DISK1, /fs27 and /fs34 in file_DISK2, and the following lists in file_DISKn (n=1...x).

Also, if there is sufficient space remaining on a disk after placing the current filesystems on it, I would like the smaller filesystem(s) placed on the first such disk, i.e. by appending the name of that filesystem to the file which already contains the other filesystems.
e.g. /fs12/circuit... could be appended to file_DISK1

I'm not sure whether I should do this in AWK or with some other shell scripting method.

Could someone please help me out?

Thank you.
12 REPLIES
Ceesjan van Hattum
Esteemed Contributor

Re: Script help (Optimizing Disk-Usage/Allocation)

Hi, if I have some time today, I'll try to write a nice script for you. But have you seen that /fs36/nwdv.fw.1 at 23148320 KB is too big for an 18GB disk? What should be done with this one?

Regards,
Ceesjan
Chern Jian Leaw
Regular Advisor

Re: Script help (Optimizing Disk-Usage/Allocation)

Ceesjan,
Thanks for your offer to show me the script.

For the case of /fs36/nwdv.fw.1, which has a disk usage of 23148320KB (23GB), I would like to have it placed on another disk, i.e. into another file called file_DISK_UNIQUE.

As for the rest, if their sum does not exceed 18000000KB(18GB), then place them into a file_DISKn (n=1....x)

Thanks.

Robin Wakefield
Honored Contributor
Solution

Re: Script help (Optimizing Disk-Usage/Allocation)

Hi,

The following perl script should do what you want. Change $loc to where you want the files stored, and change the perl location on line 1 to match your system.

Run it with your filename as the only argument, e.g:

yourscript.pl disk-size-file

============================================
#!/opt/perl5/bin/perl

my $infile=$ARGV[0];
my $loc="/tmp/file_DISK_";
my $eighteengig=18*1024*1024;
my ($i,$j,$disk,$size,$file,$suffix);
my (@sizes,@disks);

open(IN,$infile) or die "Cannot open $infile: $!\n";
while (<IN>) {
  chomp;
  next unless (m+^/+);          # skip lines that do not start with a slash
  ($disk,$size)=split;
  # first fit: use the first disk that still has room
  for ($j=0;$j<=$i;$j++){
    next if ( ($sizes[$j]+$size) > $eighteengig );
    $sizes[$j]+=$size;
    $disks[$j].="$disk|";
    last;
  }
  # no existing disk had room: start a new one
  if ( $j > $i ) {
    $i++;
    $sizes[$j]=$size;
    $disks[$j].="$disk|";
  }
}
close IN;
for ($j=0;$j<=$i;$j++){
  if ( $sizes[$j] > $eighteengig ) {
    $file=$loc."UNIQUE";
  } else {
    $suffix++;
    $file=$loc.$suffix;
  }
  open(FH,">> $file") or die "Cannot open $file: $!\n";
  foreach $disk ( split /\|/,$disks[$j] ) {
    print FH "$disk\n";
  }
  close FH;
}
============================================

Rgds, Robin.
Ceesjan van Hattum
Esteemed Contributor

Re: Script help (Optimizing Disk-Usage/Allocation)

Hi again,
Here is my script, but there are a few things you should know:
1. 18G = 18*1024*1024 = 18874368 KB, but I would not fill a disk beyond 90% of this.
2. The script fills the disks sequentially: if a disk is full, it moves on to the next. A good program would have a recursive backtracking mechanism to try all possible combinations.
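
For reference, the arithmetic in point 1 can be checked with a one-liner. This is only a minimal sketch: the 0.9 factor is the 90% suggested above, and the names headroom, full and safe are made up for illustration.

```shell
# Print the full 18 GB limit in KB and the 90% safety threshold.
headroom() {
  awk 'BEGIN {
    full = 18 * 1024 * 1024        # 18 GB expressed in KB
    safe = int(full * 0.9)         # 90% of that, rounded down
    printf "%d %d\n", full, safe
  }'
}
headroom
```

If the 90% margin is wanted, the second number could replace 18874368 as the threshold in the script.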

Here is the script and its output:

awk 'BEGIN{disknr=1;oldsum=0}
{
  fs=$1"\n"oldfs
  sum=oldsum+$2;
  if (sum>18874368) {
    sum=oldsum;
    fs=oldfs;
    if (sum>18874368) {
      printf "DISC_UNIQUE=%d\nSIZE: %d\nFILESYSTEMS:\n%s\n\n",
        disknr,sum,fs }
    else printf "DISC=%d\nSIZE: %d\nFILESYSTEMS:\n%s\n\n",disknr,
      sum,fs
    disknr=disknr+1;
    oldsum=0
    sum=$2
    fs=$1
  }
  oldsum=sum
  oldfs=fs
}END{
  sum=oldsum+$2;
  printf "DISC=%d\nSIZE: %d\nFILESYSTEMS:\n%s\n\n",disknr,sum,fs
}' input

The output:

DISC=1
SIZE: 14812480
FILESYSTEMS:
/fs34/nwdv.sfv.1
/fs37/nwdv.dfm.1
/fs34/nwdv.des.clk.1
/fs36/nwdv.des.clk.2
/fs34/cad.athena.linux_rsaix
/fs37/nwdv.dfm.2

DISC=2
SIZE: 18451956
FILESYSTEMS:
/fs34/nwdv.apv.fctdb.1
/fs34/nwdv.apv.bctl.1
/fs36/nwdv.des.mem.1
/fs36/nwdv.des.bctl.5
/fs36/nwdv.fw.2

DISC=3
SIZE: 18830712
FILESYSTEMS:
/fs12/wmt.av.1
/fs37/nwdv.dft.3
/fs37/nwdv.toolutils.dev.1
/fs34/nwdv.fcnet.1
/fs31/nwd.uav.2
/fs34/nwdv.shark.2
/fs34/nwdv.uav.2
/fs34/nwdv.des.bctl.3
/fs15/wmt.uav.2
/fs36/nwdv.dft.1
/fs12/wmt.dv.4
/fs12/wmt.uav.1
/fs16/wmt.dv.7
/fs34/nwdv.des.noise.2
/fs34/nwdv.sfv.3
/fs36/nwdv.apv.bctl.2
/fs37/nwdv.uav.6
/fs37/nwdv.cad.linux.1
/fs37/nwdv.dft.4
/fs34/nwdv.rvcr.4
/fs34/nwdv.rvcr.3
/fs34/nwdv.powrv.2
/fs34/nwdv.apv.ul1.2
/fs34/nwdv.apv.ul1.1
/fs34/nwdv.apv.fctdb.3
/fs34/nwdv.apv.fctdb.2

DISC=4
SIZE: 16367912
FILESYSTEMS:
/fs34/nwdv.rvcr.5
/fs28/nwd.des.clk.1
/fs36/nwdv.powerv.1
/fs36/nwdv.uav.3
/fs36/nwdv.da.maskpe.2
/fs24/wmt.des.noise.6
/fs36/nwdv.dft.2

DISC=5
SIZE: 16841608
FILESYSTEMS:
/fs36/nwdv.des.erc.1
/fs37/nwdv.des.noise.9
/fs12/wmt.common.1
/fs12/wmt.pcg.1
/stor/nwd_lay3

DISC=6
SIZE: 18466720
FILESYSTEMS:
/fs34/nwdv.rvcr.2
/fs34/nwdv.sfv.2
/fs34/nwdv.rtl.mem.1
/fs12/wmt.ucode.1

DISC=7
SIZE: 14505804
FILESYSTEMS:
/fs13/wmt.rtl.1
/fs36/nwdv.rtl.bus.1
/fs34/nwdv.rvcr.1

DISC=8
SIZE: 10303388
FILESYSTEMS:
/fs36/nwdv.apv.frz.6
/fs34/nwdv.powrv.1

DISC_UNIQUE=9
SIZE: 23148320
FILESYSTEMS:
/fs36/nwdv.fw.1

DISC=10
SIZE: 18441472
FILESYSTEMS:
/fs36/nwdv.cr.7
/fs36/nwdv.cr.8
/fs16/wmt.bav.1

DISC=11
SIZE: 14120864
FILESYSTEMS:
/fs36/nwdv.da.fubpv.1

------
Regards
Ceesjan

Chern Jian Leaw
Regular Advisor

Re: Script help (Optimizing Disk-Usage/Allocation)

Ceesjan & Robin,

Thanks so much for the effort. I really appreciate it.

Ceesjan,
Need to clarify some parts of the script:
awk 'BEGIN{disknr=1;oldsum=0}
{
fs=$1"\n"oldfs
sum=oldsum+$2;
if (sum>18874368) {
sum=oldsum;
fs=oldfs;
if (sum>18874368) { ....(1)
printf "DISC_UNIQUE=%d\nSIZE: %d\nFILESYSTEMS:\n%s\n\n",
disknr,sum,fs
}
else {
printf "DISC=%d\nSIZE: %d\nFILESYSTEMS:\n%s\n\n",disknr,
sum,fs ....(2)
}
disknr=disknr+1;
oldsum=0
sum=$2 ... (3)
fs=$1 ... (4)
}
oldsum=sum ...(5)
oldfs=fs ...(6)
}
END{
sum=oldsum+$2; ...(7)
printf "DISC=%d\nSIZE: %d\nFILESYSTEMS:\n%s\n\n",disknr,sum,fs ...(8)
}

From the lines marked (1)-(8), I understand that in (1) you're checking for the filesystem with size 23148320KB (23GB).
Could you please explain how the names of the filesystems are saved into the appropriate disk whenever sum is checked against the threshold?

i.e. which part of the script shows that whenever sum < threshold, the current filesystems are saved into the appropriate disk?

Thanks.
Ceesjan van Hattum
Esteemed Contributor

Re: Script help (Optimizing Disk-Usage/Allocation)

Hi, the script is a little tricky, and not perfect as I see now, but understandable:

First I tried to get the structure of 'sum', 'oldsum' and '$2' correct. This is just addition of numbers.
Just as you can add numbers, you can join strings: fs=$1 oldfs is simply a longer string. I put \n in between to make the output look nicer: fs=$1"\n"oldfs.
So really, sum and fs do the same thing everywhere: both accumulate at the same point and both are reset at the same point.

To understand the script, forget about fs for now and concentrate only on the variable sum.

In the body of the awk script, you see that sum always adds the new $2, whether or not the value gets too high.
If it was NOT too high, the script does not enter the if block, and oldsum=sum (as a memory variable).

If the sum gets too high (no matter whether unique or not), sum will be set to $2 and disknr will increase along with it.

So, if sum gets the value of $2, it can get a value of 23G. After the ifs, oldsum=sum, so oldsum will be 23G as well.
So on the next round, sum=23G+next record.
First if statement: too high, so sum=oldsum=23G.
In this case 'sum' is set back to oldsum, but it is still too high, hence the unique label.

In other words, the script always looks one record ahead, and only then evaluates whether or not the previous actions were correct.

I see now that the script can go wrong if the big fs is the first entry in the input list.

Once I had finished the algorithm, I just added fs, oldfs and $1 in the same places as sum, oldsum and $2.
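
To watch the running total described above, the sum logic can be isolated and made to print its state on every line. This is only a sketch under some assumptions: the UNIQUE check and the fs/oldfs strings are left out, the function name trace_sum and the limit variable are made up for illustration, and the sample lines reuse the five sizes from the original question with shortened names.

```shell
# Trace only the sum/oldsum bookkeeping: one output line per input line.
trace_sum() {
  awk -v limit=18874368 'BEGIN { disknr = 1; oldsum = 0 }
  {
    sum = oldsum + $2                 # tentatively add the new filesystem
    if (sum > limit) {                # it does not fit: move to the next disk
      disknr = disknr + 1
      sum = $2                        # the new disk starts with this filesystem
    }
    oldsum = sum                      # remember the running total for the next line
    printf "%s disk=%d sum=%d\n", $1, disknr, sum
  }'
}

printf '%s\n' '/fs24 3092416' '/fs26 13909787' '/fs27 7899088' \
              '/fs34 6789123' '/fs12 189087' | trace_sum
```

With this sample, /fs27 is the first line that pushes the sum over the limit, so the disk number changes there and the sum restarts from that filesystem's size.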

Regards,
Ceesjan
Chern Jian Leaw
Regular Advisor

Re: Script help (Optimizing Disk-Usage/Allocation)

Ceesjan,

Thanks for replying to my message. I think I need more clarification on which portion of the script "saves" or keeps track of the filesystems to be allocated to the proper disk, i.e. as in the output below:
DISC=1
SIZE: 14812480
FILESYSTEMS:
/fs34/nwdv.sfv.1
/fs37/nwdv.dfm.1
/fs34/nwdv.des.clk.1
/fs36/nwdv.des.clk.2
/fs34/cad.athena.linux_rsaix
/fs37/nwdv.dfm.2

DISC=2
SIZE: 18451956
FILESYSTEMS:
/fs34/nwdv.apv.fctdb.1
/fs34/nwdv.apv.bctl.1
/fs36/nwdv.des.mem.1
/fs36/nwdv.des.bctl.5
/fs36/nwdv.fw.2

It seems that oldfs is always overwritten by fs, whether sum<18874368 or sum>18874368. How does oldfs manage to retain the whole list of filesystems to be placed on the proper disk?

I'm still new to AWK and other scripting languages.

Could you help me out, please?

Thanks.



Chern Jian Leaw
Regular Advisor

Re: Script help (Optimizing Disk-Usage/Allocation)

Hi,

I was wondering if anyone could show me how to modify Ceesjan's script so that the filesystems are allocated in a way that fills every disk as fully as possible?

I'm trying to reduce the number of disks used.

Thanks.
Ceesjan van Hattum
Esteemed Contributor

Re: Script help (Optimizing Disk-Usage/Allocation)

Hi,
My script is not the best; it goes through the lines of the input file sequentially. It takes a disk; if the fs fits on the disk it continues, otherwise it takes the next empty disk.

In other words, the script doesn't really think and doesn't really memorize; it just adds up all fs's and starts again from 0 if the total exceeds 18G.
So there is NO guarantee that the output is optimal.

just a hint:
awk 'BEGIN{}{..body..}END{}' input > output
BEGIN is executed before reading input
END is executed after reading input
..body.. is executed for each line of input ($1, $2, $3...NF)
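
A tiny illustration of the three parts in the hint above. It is a sketch only; the function name show_phases and the two sample lines are made up.

```shell
# BEGIN runs once before any input, the middle block once per line, END once at the end.
show_phases() {
  awk '
    BEGIN { print "start" }
          { total += $2; print NR ": " $1 }
    END   { print "total=" total }'
}

printf '%s\n' '/a 10' '/b 20' | show_phases
```

This prints "start", then one numbered line per input line, then "total=30".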

To find the optimal disk config, you need to know all fs's first, before doing any calculation. My script starts calculating before it knows what the other fs's look like.

If you want the optimal config, then you should first read all input into an array and use an intelligent algorithm, not mine.
Or you can still use my script, but run it on all permutations of the input file: 63 lines makes 63! = 1.98*10^87 possible input orderings.
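
One cheap middle ground between a single sequential pass and trying every permutation is to sort the input by size, largest first, and then place each filesystem on the first disk that still has room ("first-fit decreasing"). This is a well-known heuristic and is not guaranteed to be optimal. The sketch below is not from this thread: the function name pack_ffd and the limit variable are assumptions, it prints "disk-number filesystem size" instead of writing files, and an oversized filesystem simply gets a disk number of its own rather than a UNIQUE file.

```shell
# First-fit decreasing: largest filesystems first, each onto the first disk with room.
pack_ffd() {
  sort -k2,2nr "$@" |
  awk -v limit=18874368 '
    /^\// {
      # look for the first existing disk that can still take this filesystem
      for (j = 1; j <= ndisks; j++)
        if (used[j] + $2 <= limit) break
      if (j > ndisks) ndisks = j      # nothing had room: open a new disk
      used[j] += $2
      print j, $1, $2
    }'
}

printf '%s\n' '/fs24 3092416' '/fs26 13909787' '/fs27 7899088' \
              '/fs34 6789123' '/fs12 189087' | pack_ffd
```

It can be run on a file as well, e.g. pack_ffd disk-size-file.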

Regards,
Ceesjan

Robin Wakefield
Honored Contributor

Re: Script help (Optimizing Disk-Usage/Allocation)

Hi,

fwiw, I have converted the perl script to awk:

===============================================
BEGIN {
  eighteengig=18*1024*1024
  loc="/tmp/file_DISK_"
}
/^\//{
  disk=$1;size=$2
  for (j=0;j<=i;j++) {
    if (sizes[j]+size > eighteengig ) {continue}
    sizes[j]+=size
    disks[j]=disks[j] disk "|"
    break
  }
  if (j>i) {
    i++
    sizes[j]=size
    disks[j]=disks[j] disk "|"
  }
}
END {
  for (j=0;j<=i;j++) {
    print "j=" j " i=" i
    if ( sizes[j] > eighteengig ) {
      file=loc"UNIQUE"
    } else {
      suffix++
      file=loc suffix
    }
    n=split(disks[j],darray,"|")
    print "n=" n
    # disks[j] ends in "|", so darray[n] is empty: stop before it
    for (m=1;m<n;m++) {
      print darray[m] >> file
    }
    close(file)
  }
}
==============================================

so you run it using:

awk -f scriptname disk-size-file

Rgds, Robin
Chern Jian Leaw
Regular Advisor

Re: Script help (Optimizing Disk-Usage/Allocation)

Robin,

Thanks for converting the script to an AWK script.

I was wondering if you could tell me where the variable 'i' is defined and what its value is? (refer to the script portion below):

disk=$1;size=$2
for (j=0;j<=i;j++) { ...(1)
if (sizes[j]+size > eighteengig ) {continue}
sizes[j]+=size
disks[j]=disks[j] disk "|"
break
}

And what does this symbol mean?:
/^\//

What is the purpose of these lines below?:
if (j>i) {
i++
sizes[j]=size
disks[j]=disks[j] disk "|"


Sorry for asking too much as I'm rather new to scripting.

Thanks.
Robin Wakefield
Honored Contributor

Re: Script help (Optimizing Disk-Usage/Allocation)

Hi,

1. If a numeric variable is undefined, it automatically defaults to a value of 0, which is what I want. A string will default to the empty string "".

2. /^\// simply means match an input line that begins with a slash. The outer "/" characters are the pattern match delimiters, the "^" means match the beginning of the line, and the "\/" following it is the escaped slash that I'm looking for.

3. If the previous for{} loop drops through without doing anything, it means that none of the existing array values can hold the new disk size, otherwise they'd go over the 18Gb limit. Therefore if this happens, j will be greater than i, so increment i (which will now equal the current value of j), and start a new array index, which will hold the disk size that is being processed.
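
Points 1 and 2 above are easy to try out on their own. A minimal sketch; the function names awk_defaults and slash_lines and the sample lines are made up for illustration.

```shell
# Point 1: variables that were never assigned act as 0 and as the empty string.
awk_defaults() {
  awk 'BEGIN {
    a = (i == 0)          # 1 if the unset variable i compares equal to 0
    b = (s == "")         # 1 if the unset variable s compares equal to ""
    print a, b, length(s)
  }'
}

# Point 2: /^\// selects only the lines whose first character is a slash.
slash_lines() {
  awk '/^\// { print $1 }'
}

awk_defaults
printf '%s\n' '/fs24 3092416' 'Filesystem kbytes' '/fs26 13909787' | slash_lines
```

The first call prints "1 1 0"; the second prints only /fs24 and /fs26, since the header-like line does not start with a slash.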

Hope that doesn't sound too complicated. Put some print statements in to see what it's doing!!
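
As a sketch of what such print statements might look like, here is the placement loop with one printf per input line and the file writing left out. The function name trace_first_fit, the limit variable and the output format are assumptions for illustration; the sample lines reuse the five sizes from the original question with shortened names.

```shell
# Show which slot (j) each filesystem lands in, and how i grows.
trace_first_fit() {
  awk -v limit=18874368 '
    /^\// {
      size = $2
      for (j = 0; j <= i; j++) {
        if (sizes[j] + size > limit) continue   # slot j is too full, try the next
        sizes[j] += size
        break
      }
      if (j > i) {                               # loop fell through: start a new slot
        i++
        sizes[j] = size
      }
      printf "%s -> slot %d (i=%d, used=%d)\n", $1, j, i, sizes[j]
    }'
}

printf '%s\n' '/fs24 3092416' '/fs26 13909787' '/fs27 7899088' \
              '/fs34 6789123' '/fs12 189087' | trace_first_fit
```

In this run /fs27 is the first line for which the loop falls through (j becomes 1 while i is still 0), so i is incremented and slot 1 is started; the small /fs12 later goes back into slot 0 because it still fits there.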

Rgds, Robin.