Friday, August 10, 2012

Parse Apache Logs by Date Range

Parsing Apache logs by date or by date range can be fairly simple with a bit of awk scripting.

We use AWK to compare date fields in order to retrieve specific rows.

The date format differs between access logs and error logs, so each needs a slightly different command.

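Note that the date field is contained within a single column in the access_log file, typically column #4, so we can compare against that one column. The comparison is a plain string comparison, so it is only reliable while the bounds and the log entries share the same month and year (for example, a range within one day); month names such as Aug and Sep do not sort chronologically as strings. Also note that a bound such as "[09/Aug/2012:15:59:" sorts before every entry logged during minute 15:59, so the range below ends at 15:58:59.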

AWK Date Range for access logs:

$ awk '$4>"[09/Aug/2012:15:00:" && $4<"[09/Aug/2012:15:59:"' ./access_log | less
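For ranges that cross a month boundary, one option is to convert each timestamp into a sortable key before comparing. The following is a minimal sketch, not a definitive implementation: the function name access_range and the YYYYMMDDHHMMSS key format are assumptions made for illustration, and it assumes the timestamp sits in column 4 in the form [09/Aug/2012:15:30:12. Time zone offsets are ignored.

```shell
# Hypothetical helper: print access_log lines with from <= timestamp < to.
# Arguments: from and to as YYYYMMDDHHMMSS keys, then an optional log path.
access_range() {
    awk -v from="$1" -v to="$2" '
    BEGIN {
        # Map month abbreviations to zero-padded numbers (Jan -> 01, ...).
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
        # Column 4 looks like [09/Aug/2012:15:30:12, so splitting on / and :
        # gives "[09", "Aug", "2012", "15", "30", "12".
        split($4, t, "[/:]")
        # Build a year-month-day-time key; substr drops the leading bracket.
        key = t[3] month[t[2]] substr(t[1], 2) t[4] t[5] t[6]
        if (key >= from && key < to) print
    }' "${3:-./access_log}"
}
```

For example, to view the entries from 23:00 on 31 July up to (but not including) 01:00 on 1 August:

$ access_range 20120731230000 20120801010000 ./access_log | less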

The date field in the error log is spread across separate columns. Example: [Thu Aug 09 15:30:...  That in itself is four columns, and they must be combined in order to be compared effectively. To do this, we concatenate those four columns into a variable, ts, inside the awk program and then use that variable for the comparison. As with the access log, this is a string comparison, so the range should stay within a single day. See below:

AWK Date Range for error logs:

$ awk '{ ts = $1 " " $2 " " $3 " " $4 } ts>="[Thu Aug 09 15:30:00" && ts<"[Thu Aug 09 15:59:00"' ./error_log | less
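The same sortable-key idea can be applied to error logs when a range spans more than one day. This is only a sketch under stated assumptions: the function name error_range and the YYYYMMDDHHMMSS key format are hypothetical, and it assumes each entry starts with a timestamp of the form [Thu Aug 09 15:30:12 2012], so that the year is in column 5.

```shell
# Hypothetical helper: print error_log lines with from <= timestamp < to.
# Arguments: from and to as YYYYMMDDHHMMSS keys, then an optional log path.
error_range() {
    awk -v from="$1" -v to="$2" '
    BEGIN {
        # Map month abbreviations to zero-padded numbers (Jan -> 01, ...).
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
        # Column 4 holds the time as HH:MM:SS; strip the colons from a copy.
        hms = $4
        gsub(/:/, "", hms)
        # Column 5 is the year followed by a closing bracket; keep 4 digits.
        key = substr($5, 1, 4) month[$2] $3 substr(hms, 1, 6)
        if (key >= from && key < to) print
    }' "${3:-./error_log}"
}
```

Lines that do not start with a timestamp in this form produce a key that falls outside any valid range and are skipped. Example usage:

$ error_range 20120809153000 20120810090000 ./error_log | less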