Operating System - HP-UX

Data on Age profile of files required please

 
Derek Brown
Frequent Advisor


Hi,

I wonder if anyone can help.

I posted a similar(ish) sort of question to this a few days ago and got some fantastic replies so am hoping for the same again if possible.

Please see the attached Excel spreadsheet. My boss is asking for this information about files across a whole server: how old the files are (based on last-modified date), grouped into a range of age bands, with the space used and the number of files calculated for each band.

I am a shell scripter. Previous replies on this type of topic have been fantastic, but they were in Perl, which I am not familiar with, so I can't use them to glean the logic this task requires!
Of course, any new replies that use Perl would still be most welcome.

Your advice would be most appreciated


thanks

Derek
5 REPLIES
Steven E. Protter
Exalted Contributor

Re: Data on Age profile of files required please

Shalom Derek,

The reason you got replies using Perl is that it's the best tool for the job. Perhaps.

I think if you look at the man page for find, the -ctime and -mtime options (for example, -mtime +365 for files modified more than 365 days ago) will let you produce a report with sorted output.
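
As a rough illustration of the -mtime approach, here is a minimal shell sketch. The function name count_band and its arguments are made up for this example, and it assumes filenames without embedded newlines and the usual 'ls -l' layout with the size in column 5. Ages are whole days as find counts them, so band edges are approximate.

```shell
#!/bin/sh
# count_band DIR MIN_DAYS [MAX_DAYS]   (hypothetical helper, not a standard tool)
# Prints "files kbytes" for regular files under DIR that were modified
# more than MIN_DAYS days ago and, if MAX_DAYS is given, not more than
# MAX_DAYS days ago. Pass "" as MIN_DAYS for no lower bound.
count_band() {
    dir=$1 min=$2 max=$3
    # Build the find argument list one test at a time.
    set -- "$dir" -xdev -type f
    [ -n "$min" ] && set -- "$@" -mtime +"$min"
    [ -n "$max" ] && set -- "$@" ! -mtime +"$max"
    find "$@" -exec ls -ld {} + |
        awk '{ n++; kb += $5 / 1024 } END { printf "%d %d\n", n, kb }'
}

# Example calls (the path is a placeholder):
#   count_band /path 3650         # older than 10 years
#   count_band /path 1825 3650    # 5 to 10 years
#   count_band /path "" 90        # up to about 3 months
```

Calling it once per band gives one "files kbytes" pair per column of the spreadsheet, at the cost of one pass over the filesystem per band.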

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
James R. Ferguson
Acclaimed Contributor

Re: Data on Age profile of files required please

Hi Derek:

No Perl; too bad, but this is still straightforward.

Consider:

# touch -amt 200701010000 /tmp/ref1
# touch -amt 200712312359 /tmp/ref2
# find /path -xdev -type f -newer /tmp/ref1 -a ! -newer /tmp/ref2 -exec ls -l {} \+|awk '{SZ+=$5};END{printf "# files = %d K-size = %d\n",NR,SZ/1024}'


The above counts the files found in '/path' that were modified during the year 2007 and sums their sizes in kilobytes.

Notice the use of '-xdev' so that we don't visit any mountpoints beneath '/path'.

Notice the use of the '+' terminator to '-exec' in 'find'. It passes many filenames to each 'ls' invocation, which greatly reduces the number of processes spawned and makes processing quite fast.
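
To cover several date ranges, the same reference-file technique can be wrapped in a function and called once per range. The sketch below is only one way to do it: the name report_year, the CSV layout and the temporary file names are assumptions for illustration, and it again relies on the size being column 5 of 'ls -l'.

```shell
#!/bin/sh
# report_year DIR YEAR   (hypothetical helper)
# Prints "YEAR,files,kbytes" for regular files under DIR whose
# modification time falls between the two reference timestamps.
report_year() {
    dir=$1 year=$2
    lo=${TMPDIR:-/tmp}/ref_lo.$$
    hi=${TMPDIR:-/tmp}/ref_hi.$$
    # Reference files mark the start and end of the year.
    touch -amt "${year}01010000" "$lo"
    touch -amt "${year}12312359" "$hi"
    find "$dir" -xdev -type f -newer "$lo" ! -newer "$hi" -exec ls -ld {} + |
        awk -v y="$year" '{ n++; sz += $5 }
                          END { printf "%s,%d,%d\n", y, n, sz / 1024 }'
    rm -f "$lo" "$hi"
}

# Example: one CSV line per year (the path is a placeholder).
#   for y in 2005 2006 2007; do report_year /path "$y"; done > report.csv
```

Keep the reference files outside the tree being scanned, otherwise the upper one would be counted in its own range.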

Regards!

...JRF...
Bill Hassell
Honored Contributor

Re: Data on Age profile of files required please

This is a bit off topic but is your boss trying to manage disk space? If so, it would be a big mistake to remove files based solely on age. Many critical HP-UX files have dates older than the installation date of your server and if removed, would cause major problems and failures. And the touch command allows anyone with proper permissions to change the modification date forward or backward. Finally, there are almost no HP-UX files with extensions. Files such as abc.exe or help.txt are a PC concept.

If this is a disk space management issue, you want to look at directories sorted by size. The du command will do this. Here's the technique:

du -kx /var | sort -rn

Run it for every mountpoint shown in bdf. But before you remove anything, be sure that the file(s) are not needed.
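
Running that for every mountpoint can be scripted along these lines. This is a sketch only: top_dirs is a made-up name, and the commented bdf line assumes that mount points contain no whitespace and that the mount point is the last field of each bdf data line.

```shell
#!/bin/sh
# top_dirs N MOUNTPOINT...   (hypothetical helper)
# For each mount point, list the N largest directories in kilobytes,
# staying on that filesystem (-x).
top_dirs() {
    n=$1
    shift
    for mp in "$@"; do
        echo "== $mp"
        du -kx "$mp" 2>/dev/null | sort -rn | head -n "$n"
    done
}

# Example on HP-UX, feeding in the mount points reported by bdf.
# NF >= 5 is meant to skip the header-free lines that hold only a long
# device name when bdf wraps an entry onto two lines (an assumption
# about bdf's layout):
#   top_dirs 10 $(bdf | awk 'NR > 1 && NF >= 5 { print $NF }')
```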


Bill Hassell, sysadmin
Hein van den Heuvel
Honored Contributor

Re: Data on Age profile of files required please

Hello again Derek.

See, I have to do my stuff in perl because I do not know any shell (well). Perl allows me best to use (nearly) the same solutions on Unix (any shell! :-), Windoze and OpenVMS.
I'm just stuck inside Unix with the VMS blues.

So here is another perl solution, in two flavours. First, to be fed by a pipe from find. This allows you to use the find options you know and love (xdev, depth, type,..). Second, standalone. Uses the perl File::Find module in its most basic incarnation. Many options available, none used. See: http://perldoc.perl.org/File/Find.html
For both, just redirect the output to report.csv.

Cheers,
Hein.


-------- pipe_find_by_date.pl -----
use strict;
use warnings;

my ($i, @range_by_days, @range_by_name, @counts, @sizes);
my $zones = 1;
#
# Construct ranges from simple list.
#
$range_by_name[0] = "More than";
foreach my $range (qw(10y 5y 1y 6m 3m 0m)) {
    $_ = $range; # $range is aliased to the constants and can not be chopped.
    $counts[$zones] = $sizes[$zones] = 0;
    if ('y' eq chop) {
        $range_by_name[$zones] = $_ . " years";
        $range_by_days[$zones++] = $_ * 365; # who is counting? (leapyears)
    } else {
        $range_by_name[$zones] = $_ . " mths";
        $range_by_days[$zones++] = $_ * 365/12; # give or take a day
    }
}

#
# Read the input feed (one filename per line), triage results.
#
while (<>) {
    chomp;
    my $age = -M;             # days since last modification
    next unless defined $age; # file vanished or cannot be stat'ed: skip it
    my $size = -s;
    my $i = 1;
    # Stop at the last zone so a file dated in the future cannot run off the end.
    $i++ while ( $i < $zones - 1 && $age < $range_by_days[$i] );
    # print STDERR "$range_by_days[$i], $range_by_name[$i], $age, $size, $_\n"
    $counts[$i]++;
    $sizes[$i] += $size;
}

#
# Finally report out, building print lines a column at a time.
#
my $header_line = 'Total Files';
my $size_line = 'Size (GB)';
my $count_line = 'Number of Files';

for ($i = 1; $i < $zones; $i++) {
    $header_line .= sprintf (",%s - %s", $range_by_name[$i - 1], $range_by_name[$i]);
    $size_line .= sprintf (",%.1f", $sizes[$i]/(2**30));
    $count_line .= ",$counts[$i]";
}
print "$header_line\n$size_line\n$count_line\n";

-------- find_by_date.pl -----

use strict;
use warnings;
use File::Find;

my ($i, @range_by_days, @range_by_name, @counts, @sizes);
my $files = 0;
my $zones = 1;

#
# Construct ranges from simple list.
#
$range_by_name[0] = "More than";
foreach my $range (qw(10y 5y 1y 6m 3m 0m)) {
    $_ = $range; # $range is aliased to the constants and can not be chopped.
    $counts[$zones] = $sizes[$zones] = 0;
    if ('y' eq chop) {
        $range_by_name[$zones] = $_ . " years";
        $range_by_days[$zones++] = $_ * 365; # who is counting? (leapyears)
    } else {
        $range_by_name[$zones] = $_ . " mths";
        $range_by_days[$zones++] = $_ * 365/12; # give or take a day
    }
}

#
# Action routine to triage and count.
#
sub found_file {
    return unless -f;         # count regular files only, not directories
    my $age = -M;             # days since last modification
    my $size = -s;
    my $i = 1;
    # Stop at the last zone so a file dated in the future cannot run off the end.
    $i++ while ( $i < $zones - 1 && $age < $range_by_days[$i] );
    # print STDERR "$range_by_days[$i], $range_by_name[$i], $age, $size, $_\n"
    # Progress report to STDERR every 10000 files.
    printf STDERR ("%d... %s\n", $files, join (',', @counts)) unless $files++ % 10000;
    $counts[$i]++;
    $sizes[$i] += $size;
}

#
# Trigger Action !
#
while (my $directory = shift) { find (\&found_file, $directory) };

#
# Finally report out, building print lines a column at a time.
#
my $header_line = 'Total Files';
my $size_line = 'Size (GB)';
my $count_line = 'Number of Files';

for ($i = 1; $i < $zones; $i++) {
    $header_line .= sprintf (",%s - %s", $range_by_name[$i - 1], $range_by_name[$i]);
    $size_line .= sprintf (",%.1f", $sizes[$i]/(2**30));
    $count_line .= ",$counts[$i]";
}
print "$header_line\n$size_line\n$count_line\n";
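
For shell scripters who want the triage logic without reading Perl, the bucketing step can be sketched in awk. This is an illustration, not a port of the scripts above: bucket_ages is a made-up name, the band labels are reworded, and the input format (one line per file, age in days then size in bytes) is an assumption. How those two columns get produced is left open.

```shell
#!/bin/sh
# bucket_ages   (hypothetical helper)
# Reads "age_in_days size_in_bytes" lines on stdin and prints one
# "band,files,size_GB" line per age band, oldest band first.
bucket_ages() {
    awk '
    BEGIN {
        # Lower bounds in days: 10y 5y 1y 6m 3m 0m, as in the Perl scripts.
        n = split("3650 1825 365 182.5 91.25 0", lim, " ")
        split("More than 10 years,5-10 years,1-5 years,6 mths - 1 year,3-6 mths,0-3 mths", name, ",")
    }
    {
        # Walk down the thresholds until the age fits; the last band
        # catches everything else, including future-dated files.
        i = 1
        while (i < n && $1 < lim[i]) i++
        count[i]++
        bytes[i] += $2
    }
    END {
        for (i = 1; i <= n; i++)
            printf "%s,%d,%.1f\n", name[i], count[i], bytes[i] / (1024 * 1024 * 1024)
    }'
}

# Example with hand-made input:
#   printf "%s\n" "4000 1073741824" "100 2048" | bucket_ages
```

The single pass over the input is the main design point: each file is looked at once and dropped into a band, rather than rescanning the filesystem for every band.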
Hein van den Heuvel
Honored Contributor

Re: Data on Age profile of files required please

I'm sorry to reply twice, but the dog ate my whitespace!
Forgot to click "Retain format(spacing). " for the reply.

Two scripts and example runs attached as .txt file.

Cheers,
Hein.