summarize bytes of files and make a control break ...
05-04-2012 09:11 AM
Hello,
I have a file that was created by Data Protector.
I want to sum the byte counts of the files, which are printed in the "size" column.
There is one special requirement. The files are listed like this:
/filesystem/directory/DIR/file
/filesystem/directory/MOVED201202/subdirectory/app/DIR/IF/geladen/...../file
I need a total only for the first 3 path levels, such as "/filesystem/directory/DIR" and "/filesystem/directory/MOVED201202", so I think I have to split each path on "/" and keep the first 3 components.
So the "control break" comes after the third component.
Is this possible with awk?
The output should be:
"/filesystem/directory/DIR" : 3000 bytes
"/filesystem/directory/MOVED201202" : xxxx bytes
The reason for the report: we need to create filesystems for these subdirectories, so I have to know their sizes.
Regards
The listing is in the attachment.
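The idea described in the question (split each path on "/" and group by the first three components) can be sketched roughly as follows. This is only an illustration, not the solution accepted in the thread. It assumes the column layout of the sample line quoted later in the thread (permissions in field 1, size in field 4, path in field 7) and paths without spaces. The function name `sum_by_prefix` and the three sample lines are made up.

```shell
# Sketch: total file sizes per third-level directory.
# Assumed layout (taken from the sample line in the thread):
#   $1 = permissions, $4 = size in bytes, $7 = absolute path without spaces.
sum_by_prefix() {
    awk '
    $1 ~ /^-/ {                        # regular files only
        n = split($7, part, "/")       # part[1] is empty because the path starts with "/"
        if (n < 5) next                # nothing below the third level: skip
        key = "/" part[2] "/" part[3] "/" part[4]
        total[key] += $4               # key is the control-break group
    }
    END {
        for (key in total)
            printf "\"%s\" : %.0f bytes\n", key, total[key]
    }' "$@" | sort
}

# Usage with three made-up lines instead of the real catalog file:
printf '%s\n' \
    '-rw------- prod edv 1000 04/10/12 03:13:10 /filesystem/directory/DIR/a' \
    '-rw------- prod edv 2000 04/10/12 03:13:10 /filesystem/directory/DIR/b' \
    '-rw------- prod edv 500 04/10/12 03:13:10 /filesystem/directory/MOVED201202/subdirectory/app/c' |
    sum_by_prefix
```

Because the totals are kept in an array keyed by the three-component prefix, the input does not have to be sorted, which a classic control break would require.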
05-05-2012 12:55 AM - edited 05-07-2012 07:13 PM
Re: summarize sizes of files and make a control break for a fix directory structure
Try this:
#!/usr/bin/ksh
# Adds up hierarchical (must be ordered) directory and file sizes.
# Outputs just directories (ends in "/") and sizes
awk '
BEGIN { getline; getline # skip first two lines
   # Add dummy directory entry for "/"
   directory["/"] = 0
   OFMT = "%.0f"
}
function add_entry() {
   for (dir in directory) {
      if (filename ~ dir) # substring match
         directory[dir] += size
   }
   # check to see if a directory, ends in "/"
   if (substr(filename, length(filename), 1) == "/")
      directory[filename] = size # initialize with directory size
}
{
   size = $4
   filename = $7
   add_entry()
}
# print out directory entries
END {
   for (dir in directory)
      printf "%12.0f %s\n", directory[dir], dir
}' input-file | sort -k2,2
I get:
410893371 /
410893371 /filesystem/directory/
337175785 /filesystem/directory/DIR/
73716466 /filesystem/directory/MOVED201202/
73691984 /filesystem/directory/MOVED201202/subdirectory/
73691888 /filesystem/directory/MOVED201202/subdirectory/app/
73691792 /filesystem/directory/MOVED201202/subdirectory/app/DIR/
73691696 /filesystem/directory/MOVED201202/subdirectory/app/DIR/IF/
73691600 /filesystem/directory/MOVED201202/subdirectory/app/DIR/IF/geladen/
96 /filesystem/directory/lost+found/
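One detail of the script above is worth a small illustration: `filename ~ dir` treats the directory name as a regular expression, so a name such as lost+found is read as a pattern in which "+" is a repetition operator. The sketch below compares that with a plain string prefix test using `index()`. The function name `compare_matches` and the file name `core` are made up for the demonstration. Swapping in `index()` is a suggestion, not something the thread's script does.

```shell
compare_matches() {
    awk 'BEGIN {
        dir  = "/filesystem/directory/lost+found/"
        file = "/filesystem/directory/lost+found/core"   # hypothetical file name
        # Used as a pattern, "t+" means one or more "t", so the literal "+" is not matched.
        printf "regex match: %d\n", (file ~ dir)
        # index() compares plain strings; a result of 1 means dir is a prefix of file.
        printf "prefix match: %d\n", (index(file, dir) == 1)
    }'
}

compare_matches
# regex match: 0
# prefix match: 1
```

A prefix test also avoids counting a file under a directory whose name merely appears somewhere in the middle of the file's path.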
05-06-2012 01:45 AM
Re: summarize sizes of files and make a control break for a fix directory structure
Hello Dennis,
Great awk script, but the output differs from the target values:
Files of Directories : /filesystem/directory/DIR Target value (size): 336998633 awk value : 337175785
Files of Directories : /filesystem/directory/MOVED201202 Target value (size): 73709842 awk value : 73716466
The target values were calculated in MS Excel. Please rename catalog_xls.txt to catalog.xls, then you can see the differences.
Kind regards, Tom
PS: Why is it not allowed to upload files in xls format?
05-06-2012 03:20 AM - edited 05-06-2012 03:22 AM
Re: summarize sizes of files and make a control break for a fix directory structure
>but only the output is different from the target value:
Because I add in the size of the directories too.
05-06-2012 08:30 AM
Re: summarize sizes of files and make a control break for a fix directory structure
Hello,
Can you explain this part?
- First you initialize the array:
   # Add dummy directory entry for "/"
   directory["/"] = 0
- What does "for (dir in directory)" mean? What is the content of the array?
   first entry : directory["/"] = 0
   next entry : ???
   function add_entry() {
      for (dir in directory)
      {
         if (filename ~ dir) # substring match
            directory[dir] += size
I tried to extend your awk script, but my totals come out somewhat lower than yours.
Is this the right way?
My extension:
#!/usr/bin/ksh
# Adds up hierarchical (must be ordered) directory and file sizes.
# Outputs just directories (ends in "/") and sizes
awk '
BEGIN { getline; getline # skip first two lines
   # Add dummy directory entry for "/"
   directory["/"] = 0
   OFMT = "%.0f"
}
function add_entry() {
   for (dir in directory)
   {
      if (filename ~ dir) # substring match
      {
         if (substr(typ, 1 , 1) == "-")
            directory[dir] += size
      }
   }
   # check to see if a directory, ends in "/"
   if (substr(filename, length(filename), 1) == "/")
   {
      if (substr(typ, 1 , 1) == "d" )
      {
         directory[filename] -= size
      }
   }
}
{
   typ = $1
   size = $4
   filename = $7
   add_entry()
}
# print out directory entries
END {
   for (dir in directory)
      printf "%12.0f %s\n", directory[dir], dir
}' file | sort -k2,2
05-06-2012 09:10 PM
Solution
>- what means for dir in directory
It goes through all of the entries in the array (in no particular order).
> what is the content of the array
> first entry: directory["/"] = 0
> next entry: ???
Since the order is arbitrary, there is no specific "next". This is a hash table, not an ordered map.
> if (substr(typ, 1 , 1) == "-")
> directory[dir] += size
Why bother with the leading "-"? Better to just look for the trailing "/".
Also, the proper check for the opposite of "d" is (!= "d") and not (== "-").
>if (substr(typ, 1 , 1) == "d" )
Again, no need to check both ways.
> directory[filename] -= size
Why are you subtracting? If you don't want to count them, initialize it to 0.
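The point about iteration order can be seen in a few lines. The sketch below fills an array with three keys and prints them with `for (key in size)`. The order in which awk visits the keys is not defined, which is why the output is piped through `sort`, as the scripts in this thread also do. The function name `list_keys` and the array name `size` are chosen only for this illustration.

```shell
list_keys() {
    awk 'BEGIN {
        size["/"] = 0
        size["/filesystem/directory/DIR/"] = 0
        size["/filesystem/directory/MOVED201202/"] = 0
        for (key in size)         # visiting order is not defined by awk
            print key
    }' | sort                     # impose a stable order afterwards
}

list_keys
# /
# /filesystem/directory/DIR/
# /filesystem/directory/MOVED201202/
```

Without the `sort`, the same three lines appear, but their order may differ between awk implementations.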
05-07-2012 12:37 AM - edited 05-07-2012 01:40 AM
Re: summarize sizes of files and make a control break for a fix directory structure
> Why are you subtracting? If you don't want to count them, initialize it to 0.
I changed directory[filename] -= size to directory[filename] = 0
and now I get the right output:
Files of Directories : /filesystem/directory/DIR Target value (size): 336998633 awk value : 336998633
Files of Directories : /filesystem/directory/MOVED201202 Target value (size): 73709842 awk value : 73709842
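For reference, a files-only variant along these lines might look like the sketch below. It is not the exact script from the thread: the function name `sum_files_per_dir` and the array name `total` are made up, and it uses `index()` for a literal prefix test instead of `~`. It assumes the same layout as before (two header lines, size in field 4, path in field 7, directory lines ending in "/" and listed before their files).

```shell
sum_files_per_dir() {
    awk '
    BEGIN { total["/"] = 0 }                   # dummy entry for "/", as in the thread
    NR <= 2 { next }                           # skip the two header lines
    {
        size = $4; name = $7
        if (substr(name, length(name)) == "/") {
            total[name] = 0                    # directory: start at 0, ignore its own size
            next
        }
        for (dir in total)                     # file: add to every directory above it
            if (index(name, dir) == 1)
                total[dir] += size
    }
    END {
        for (dir in total)
            printf "%12.0f %s\n", total[dir], dir
    }' "$@" | sort -k2,2
}

# Usage (input-file stands for the Data Protector listing):
#   sum_files_per_dir input-file
```

Starting each directory entry at 0 is what removes the directories' own sizes from the totals.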
Some last questions:
>> what means for dir in directory
>> Goes through all of the entries in the array (in random order)
This I understand.
>> what is the content of the array
>> Since they are random, there is no specific "next". This is a hash, not a map.
What I don't understand:
For example, when awk reads this line:
-rw------- prod edv 2636 04/10/12 03:13:10 /filesystem/directory/DIR/ABRKONZ_AUFWO_201137_IF_20120410031303.dmp.Z
then add_entry() is called,
and this path is not in the directory array.
How do I put it into the array?
for (dir in directory) <= is the value not in the array at this point?
{
I don't understand how the "directories" are stored in the array.
Regards
05-07-2012 11:09 AM - edited 05-07-2012 07:12 PM
Re: summarize sizes of files and make a control break for a fix directory structure
>how I put it into the array?
This puts it in the array:
directory[filename] = size # add directory size
I only put directory entries into the array. For each file, I add its size to every directory entry that matches its path.
>for (dir in directory) <= here it isn't in the array the value?
This just iterates over the list of keys in the array.
>I don't understand, how the "directories" are stored in the array?
The array stores key/value pairs in the hash: directory[filename] = size
filename is the key, directory[filename] is the corresponding value.
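To make the key creation visible, the sketch below prints how many keys the array holds after each input line. A key appears only when a directory line (path ending in "/") is read; file lines just add to keys that already exist. The function name `trace_keys`, the array name `total`, the `index()` prefix test and the three sample lines are assumptions made for this illustration. The sample input has no header lines.

```shell
trace_keys() {
    awk '
    {
        size = $4; name = $7
        if (substr(name, length(name)) == "/")
            total[name] = 0              # directory line: this assignment creates the key
        else
            for (dir in total)           # file line: only existing keys are updated
                if (index(name, dir) == 1)
                    total[dir] += size
        n = 0
        for (dir in total) n++
        printf "line %d: %d key(s)\n", NR, n
    }' "$@"
}

printf '%s\n' \
    'drwxr-xr-x prod edv 8192 04/10/12 03:13:10 /filesystem/directory/DIR/' \
    '-rw------- prod edv 2636 04/10/12 03:13:10 /filesystem/directory/DIR/a.dmp.Z' \
    'drwxr-xr-x prod edv 96 04/10/12 03:13:10 /filesystem/directory/MOVED201202/' |
    trace_keys
# line 1: 1 key(s)
# line 2: 1 key(s)
# line 3: 2 key(s)
```

The file on line 2 never becomes a key itself; its size is added to the entry for /filesystem/directory/DIR/.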