
Scripting Query

 
Duffs
Regular Advisor


Hi,

I am trying to find out which IP address appears most frequently in my access_log file. The file is a few thousand lines long, so it is not practical to check it manually. I have tried printing the first column of access_log, sorting it and simply eyeballing the result, but with so many different addresses there must be a more reliable way of doing it?

i.e.
# cat access_log | awk '{print $1}' | sort -rn > /tmp/access.txt

This leads on to my next question: if I were looking for the most frequent entry in a file, not necessarily an IP address, how could I find it if the file wasn't delimited by columns and the entry could be any alphanumeric value?

Regards,
D.
5 REPLIES
Goran Koruga
Honored Contributor

Re: Scripting Query

Hello.

Write a trivial awk or perl script using hash arrays.

Regards,
Goran
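
A minimal sketch of what such an awk script might look like, assuming the address is the first whitespace-separated field of each line (as in the original attempt). The function name count_first_field is hypothetical and used only for illustration; it is not from Goran's reply.

```shell
# Hypothetical helper: count how often each first field occurs in a file.
count_first_field() {
    # counts[] is an awk associative ("hash") array keyed by the first field.
    awk '{ counts[$1]++ }
         END { for (ip in counts) print counts[ip], ip }' "$1" |
        sort -rn    # highest count first
}

# Possible usage: show only the most frequent address.
# count_first_field access_log | head -n 1
```

The array is filled in a single pass over the file, so only the final list of counts needs sorting rather than every line of the log.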
Dennis Handly
Acclaimed Contributor
Solution

Re: Scripting Query

>I am trying to find out the IP address that has got written most frequently

Try:
awk '{print $1}' access_log | sort | uniq -c | sort -rn > /tmp/access.txt

>if it wasn't delimited by columns and could be of any alphanumerical value?

Are you trying to find the most frequently occurring "word" in a file?
You could use tr(1) to convert your separators to a newline then use the above sort/uniq/sort pipeline. (Removing blank lines first.)
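
As a rough sketch of the tr approach described above, assuming a "word" means a run of letters and digits: the function name word_freq is hypothetical, and the choice of separators is an assumption you would adjust for your own file.

```shell
# Hypothetical helper: list words in a file by frequency, most frequent first.
word_freq() {
    tr -cs '[:alnum:]' '[\n*]' < "$1" |   # each run of non-alphanumerics becomes one newline
        grep -v '^$' |                     # drop the blank line a leading separator can leave
        sort | uniq -c | sort -rn          # same counting pipeline as above
}

# Possible usage: the ten most frequent words.
# word_freq somefile | head -n 10
```

Note that with this separator set an address such as 10.0.0.1 is split into its individual numbers, because the dots count as separators. To split on whitespace only, the first stage could instead be tr -s '[:space:]' '[\n*]'.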
Matt Palmer_2
Respected Contributor

Re: Scripting Query

Hi,

You could use 'webalizer' to parse the access_log for you, as it does all the sorting on your behalf, and then use curl to pull back the stats page.

regards

Matt
Duffs
Regular Advisor

Re: Scripting Query

Hi,

Dennis, yes, I am trying to find the most frequently used word in a file. Are there any alternatives to 'tr', or is this the only way of doing it? What would the command look like?

R,
D.
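
One possible alternative to tr, sketched here under the assumption that words are separated by whitespace: awk can loop over every field on each line and count them in an associative array. The function name top_word is hypothetical and not from the thread.

```shell
# Hypothetical helper: print the count and text of the most frequent
# whitespace-separated word in a file.
top_word() {
    awk '{ for (i = 1; i <= NF; i++) counts[$i]++ }   # count every field on every line
         END {
             for (w in counts)
                 if (counts[w] > max) { max = counts[w]; best = w }
             if (max > 0) print max, best              # print nothing for an empty file
         }' "$1"
}

# Possible usage:
# top_word somefile
```

If two words tie for the highest count, which one is printed depends on awk's array traversal order, so the sort/uniq pipeline is the better choice when a full ranking is wanted.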
Duffs
Regular Advisor

Re: Scripting Query

Hi,

Thanks for the feedback! Spot on, Dennis, with the tr command for the non-delimited file as well.

Using 'tr' I replaced the spaces with newlines and redirected the output to a new file, then ran the sort/uniq/sort pipeline on it as suggested, and it produces the desired result.

Regards,
D.