Showing posts with label sort. Show all posts

Wednesday, April 22, 2009

Dansguardian access.log summarizing, counting, unique

I have a DansGuardian access.log file on a SmoothWall box. I'd like to get a list of the unique domains in use, each with a sample IP address to check on.

This, my first effort, is good as far as it goes, which is to simply alphabetize the domains and give an IP address for *someone* who has accessed it:

awk "{ split (\$5,a,\"/\"); print \$4 \"\t\" a[3]; }" access.log | sort -k2 -u


Of course, if I needed a date or time, I could add it in the print statement.
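As a rough sketch of that, assuming the same field layout the command above relies on (date in $1, time in $2, client IP in $4, URL in $5), the print statement can carry the date and time along. The sample lines and the function name domains_with_time are made up for illustration; they are not taken from a real log.

```shell
# Hypothetical helper: print date, time, client IP, and host, tab-separated.
# Assumes $1 = date, $2 = time, $4 = client IP, $5 = URL.
domains_with_time() {
    awk '{ split($5, a, "/"); print $1 "\t" $2 "\t" $4 "\t" a[3] }' "$@"
}

# Made-up sample lines standing in for access.log.
printf '%s\n' \
    '2009.4.22 10:15:01 - 192.168.1.5 http://example.com/index.html GET 1234' \
    '2009.4.22 10:16:30 - 192.168.1.9 http://example.org/ GET 512' |
    domains_with_time
```

Against a real file it would be called as domains_with_time access.log, then piped through sort as before.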


But now I think to myself, what about seeing how popular a domain (the host part of the URL) is?

awk "{ split (\$5,a,\"/\"); print \$4 \"\t\" a[3]; }" access.log | sort -k2 | awk '{a[$2] = $0; b[$2]++ } END {for(i in a){ print a[i] "\t" b[i]};}' | sort -k2


This gives an IP address that has accessed the domain, and how many times that domain has been accessed. It DOES NOT mean that the IP address has accessed that domain that many times. If I wanted to do that ...
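The same count can probably be had in a single awk pass, without the first sort. This is only a sketch under the same field assumptions ($4 is the client IP, $5 the URL); domain_hits is a made-up name, and the IP shown is simply the last one seen for each domain.

```shell
# Hypothetical helper: one line per domain with a sample IP and a hit count.
domain_hits() {
    awk '{
        split($5, a, "/")      # a[3] is the host part of the URL
        ip[a[3]] = $4          # remember the most recent IP for this host
        hits[a[3]]++           # count every request to this host
    }
    END { for (d in hits) print ip[d] "\t" d "\t" hits[d] }' "$@" |
        sort -t "$(printf '\t')" -k2,2   # alphabetize by domain (field 2)
}
```

Called as domain_hits access.log, it prints IP, domain, and count separated by tabs, ordered by domain.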


awk "{ split (\$5,a,\"/\"); print \$4 \"\t\" a[3]; }" access.log | sort | awk '{a[$0] = $0; b[$0]++ } END {for(i in a){ print a[i] "\t" b[i]};}' | sort


Further, you can use the above to see who "hogs" the web:
awk "{ split (\$5,a,\"/\"); print \$4 \"\t\" a[3]; }" access.log | sort | awk '{ a[$0] = $0; b[$0]++ } END {for(i in a){ print a[i] "\t" b[i]};}' | sort -t "$(printf '\t')" -k3,3nr

The -t option needs a real tab character, which the printf supplies here. You could also type one between plain quotes with ctrl-v, then Tab (in vi or at the shell prompt). This puts the biggest numbers on top, so piping through more or head would be ideal.
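If the question is total requests per client rather than per client-and-domain pair, a variation along these lines might do. It is a sketch, not a tested tool: top_clients is a made-up name, and it assumes the client IP is still in $4.

```shell
# Hypothetical helper: show the N busiest client IPs with their request counts.
# Usage: top_clients N [logfile ...]   (reads stdin if no file is given)
top_clients() {
    n=$1; shift
    awk '{ hits[$4]++ } END { for (ip in hits) print ip "\t" hits[ip] }' "$@" |
        sort -t "$(printf '\t')" -k2,2nr |   # numeric, biggest count first
        head -n "$n"
}
```

For example, top_clients 10 access.log would list the ten clients with the most requests.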

I would argue that for questions like these, the one-liners above are faster than most any log analysis program, and they also work well in conjunction with one.

Sunday, March 8, 2009

Sort GMAIL FREE with mail2web.com

You'll kick yourself how easy this is: www.mail2web.com
Login: youremail@gmail.com
Pass: yourgmailpassword
It lets you sort by From, To, Size, or Subject without hassle. Plus you can delete messages.

ETA: On second thought, maybe it's not so special. It's incredibly slow. Well, at least it's an online option. I got it working, but it stopped responding when I started sorting. I tried IMAP4, and it is moving sluggishly.
