Data Logging Tools
August 12, 2009
In the real world, data logging is the process, usually automatic, of capturing and recording a sequence of values for later processing and analysis by computer. For example, the level in a water-storage tank might be automatically logged every hour over a seven-day period, so that a computer can produce an analysis of water use. This monitoring is carried out through sensors or similar instruments, connected to the computer via an interface.
The computer logging the data samples the readings at regular time intervals. The raw data are analyzed either continuously (displayed on a changing screen display or as a graph on a plotter) or at the end of the logging period.
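The sampling loop described above can be sketched in a few lines of Python. This is only an illustration: `log_samples`, the `read_sensor` callback and the replayed tank levels are hypothetical stand-ins for a real sensor interface, and the interval is set to zero so the example finishes immediately.

```python
import time
from datetime import datetime, timezone

def log_samples(read_sensor, interval_s, count, sleep=time.sleep):
    """Call read_sensor() `count` times, waiting `interval_s` seconds between reads."""
    samples = []
    for i in range(count):
        # Store each reading together with the time it was taken.
        samples.append((datetime.now(timezone.utc), read_sensor()))
        if i < count - 1:
            sleep(interval_s)
    return samples

# Hypothetical sensor: replays fixed tank levels (in metres) instead of real hardware.
fake_levels = iter([4.20, 4.15, 4.05, 3.90])
samples = log_samples(lambda: next(fake_levels), interval_s=0, count=4)

# Analysis at the end of the logging period: total drop in level.
levels = [level for _, level in samples]
water_used = levels[0] - levels[-1]
print(f"{len(samples)} samples, level dropped by {water_used:.2f} m")
```

For the hourly, seven-day example in the text, the call would use `interval_s=3600` and `count=168`.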
In computerized data logging, a computer program may automatically record events in a certain scope in order to provide an audit trail that can be used to diagnose problems.
Examples of physical systems that have logging subsystems include process control systems and the black box recorders installed in aircraft.
Many operating systems and countless computer programs include some form of logging subsystem. Some operating systems provide a syslog service (described in RFC 3164), which allows the filtering and recording of log messages to be performed by a separate dedicated subsystem, rather than placing the onus on each application to provide its own ad hoc logging system.
In many cases, the logs are cryptic and hard to understand; they need to be subjected to log analysis in order to make sense of them. It can be useful to combine log file entries from multiple sources. This approach, in combination with statistical analysis, may yield correlations between seemingly unrelated events on different servers. Other solutions employ network-wide querying and reporting.
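As a rough sketch of combining entries from multiple sources, the Python below merges two logs into one timeline ordered by timestamp. The line layout (an ISO timestamp, a space, then the message), the server names and the messages are all assumptions made for the example; real logs would need their own parser.

```python
import heapq
from datetime import datetime

def parse_line(source, line):
    """Split 'ISO-timestamp message' into (timestamp, source, message)."""
    stamp, message = line.split(" ", 1)
    return datetime.fromisoformat(stamp), source, message

def merge_logs(logs):
    """Merge per-source line lists (each already in time order) into one timeline."""
    streams = [
        [parse_line(source, line) for line in lines]
        for source, lines in logs.items()
    ]
    # heapq.merge interleaves already-sorted inputs without re-sorting everything.
    return list(heapq.merge(*streams))

# Invented sample data for two hypothetical servers.
logs = {
    "web1": [
        "2009-08-12T10:00:01 GET /index.html 200",
        "2009-08-12T10:00:05 GET /cart 500",
    ],
    "db1": [
        "2009-08-12T10:00:04 connection pool exhausted",
    ],
}
timeline = merge_logs(logs)
for when, source, message in timeline:
    print(when.isoformat(), source, message)
```

Reading the merged timeline, the database message lands just before the web server error, which is the kind of cross-server correlation the paragraph describes.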
A server log is a log file (or several files) automatically created and maintained by a server to record the activity it performs.
A typical example is a web server log, which maintains a history of page requests. The W3C maintains a standard format for web server log files, but other proprietary formats exist. More recent entries are typically appended to the end of the file. Information about the request, including client IP address, request date/time, page requested, HTTP code, bytes served, user agent, and referrer, is typically added. These data can be combined into a single file, or separated into distinct logs, such as an access log, error log, or referrer log. However, server logs typically do not collect user-specific information.
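To make the fields listed above concrete, here is a minimal Python sketch that pulls them out of a single log line. It assumes the "combined" layout commonly written by Apache, not the W3C format, and the sample line, the pattern and the `parse_entry` helper are illustrative only.

```python
import re

# Assumed layout: ip ident user [time] "method page protocol" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<page>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_entry(line):
    """Return a dict of request fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Invented sample line using documentation-only addresses and hosts.
sample = (
    '192.0.2.10 - - [12/Aug/2009:10:15:32 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/start.html" "Mozilla/5.0"'
)
entry = parse_entry(sample)
print(entry["ip"], entry["page"], entry["status"], entry["bytes"])
```

Lines that do not fit the assumed layout return `None`, so a caller can count or skip them instead of failing.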
These files are usually not accessible to general Internet users, only to the webmaster or another administrator. A statistical analysis of the server log may be used to examine traffic patterns by time of day, day of week, referrer, or user agent. Efficient web site administration, adequate hosting resources and the fine tuning of sales efforts can be aided by analysis of the web server logs. Marketing departments of any organization that owns a website should be trained to understand these powerful tools.
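A simple version of this kind of analysis, counting requests by hour of day and by day of week, might look like the sketch below. The timestamp format is an assumption (the style used in Apache access logs), and the sample timestamps are invented.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp style, e.g. "12/Aug/2009:10:15:32 +0000".
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def traffic_by(timestamps, key):
    """Count requests grouped by key(datetime), e.g. hour of day or weekday."""
    counts = Counter()
    for stamp in timestamps:
        counts[key(datetime.strptime(stamp, TIME_FORMAT))] += 1
    return counts

timestamps = [
    "12/Aug/2009:10:15:32 +0000",
    "12/Aug/2009:10:47:03 +0000",
    "12/Aug/2009:23:05:11 +0000",
]
by_hour = traffic_by(timestamps, key=lambda when: when.hour)
# weekday() numbers days from Monday = 0 to Sunday = 6.
by_weekday = traffic_by(timestamps, key=lambda when: when.weekday())
print(by_hour.most_common(1))
```

The same function can group by referrer or user agent if it is given those fields instead of timestamps; only the key changes.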
Syslog is a standard for forwarding log messages in an IP network. The term “syslog” is often used for both the syslog protocol itself and the application or library that sends syslog messages.
The syslog protocol is a client/server-type protocol: the syslog sender sends a small textual message (1024 bytes or less under RFC 3164) to the syslog receiver. The receiver is commonly called “syslogd”, “syslog daemon” or “syslog server”. Syslog messages can be sent via UDP and/or TCP. The data is sent in cleartext; encryption is not part of the syslog protocol itself, but an SSL wrapper such as Stunnel, sslio or sslwrap can be used to add a layer of encryption through SSL/TLS.
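To show what such a message looks like on the wire, the sketch below builds one in the RFC 3164 style, where the leading priority number is the facility code times 8 plus the severity code. The host name, tag and message text are invented, and `format_syslog` and `send_syslog` are hypothetical helper names; the send function is defined but not called, since it needs a listening receiver.

```python
import socket
from datetime import datetime

FACILITY_USER = 1      # "user-level messages" in RFC 3164
SEVERITY_WARNING = 4   # "warning conditions" in RFC 3164

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def format_syslog(facility, severity, hostname, tag, text, when):
    """Build '<PRI>Mmm dd hh:mm:ss host tag: text' in the RFC 3164 style."""
    pri = facility * 8 + severity
    # The day of month is padded with a space, not a zero.
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {hostname} {tag}: {text}"

def send_syslog(message, host, port=514):
    """Send one message over UDP; 514 is the conventional syslog port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii", "replace"), (host, port))

message = format_syslog(
    FACILITY_USER, SEVERITY_WARNING,
    hostname="web1", tag="backup", text="disk usage at 91%",
    when=datetime(2009, 8, 12, 10, 15, 32),
)
print(message)
# To transmit it (receiver address is an assumption):
# send_syslog(message, "127.0.0.1")
```

In everyday Python code the standard library's `logging.handlers.SysLogHandler` does this work; the manual version is here only to expose the message layout.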
Syslog is typically used for computer system management and security auditing. While it has a number of shortcomings, syslog is supported by a wide variety of devices and receivers across multiple platforms. Because of this, syslog can be used to integrate log data from many different types of systems into a central repository.
Syslog is now standardized within the Syslog working group of the IETF.
MultiTail is a program for monitoring multiple log files, in the fashion of the original tail program. The original tail presents the last few lines of a single log file, optionally providing a real-time display of the growing file.
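The two behaviours of tail described here, showing the last few lines and then following the file as it grows, can be approximated in Python as below. This is a simplified sketch and not how tail or MultiTail are implemented; `last_lines` and `follow` are names chosen for the example.

```python
import io
import os
import time
from collections import deque

def last_lines(handle, n=10):
    """Return the final n lines of an open text file."""
    # A bounded deque keeps only the most recent n lines while reading.
    return list(deque(handle, maxlen=n))

def follow(handle, poll_s=1.0):
    """Yield lines appended to the file after this call; runs until interrupted."""
    handle.seek(0, os.SEEK_END)
    while True:
        line = handle.readline()
        if line:
            yield line
        else:
            # Nothing new yet: wait before polling again.
            time.sleep(poll_s)

# In-memory stand-in for a log file.
sample = io.StringIO("boot\nlogin alice\nlogin bob\nlogout alice\n")
recent = last_lines(sample, n=2)
print(recent)
```

With a real file, `for line in follow(open(path)): print(line, end="")` would keep printing new entries, where `path` is whichever log is being watched.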
MultiTail started as an attempt to provide a program that would display two log files in a split screen. Originally it was a copy of wtail. The difference was that wtail started reading at the start of the file by itself, while MultiTail invoked one or more tail processes, which naturally display only the last few lines. The author enjoys adding features and twists to the software he writes, so filtering support soon followed. After that came coloring, horizontal splitting and much more. These additions have made it a configurable tool that can monitor not only log files but also the output of other commands, such as Tcpdump.
MultiTail splits the terminal window or the console of a Unix system into two or more subwindows into which it can merge log files and command outputs. It can also display (like the original tail) in a single window.
MultiTail runs on all major Unix platforms (AIX, *BSD, HP-UX, IRIX, Linux, Mac OS X, SCO OpenServer, Solaris, Tru64) and also on Cygwin 1.5.19-4; Cygwin allows running Unix programs on Microsoft Windows.
Listing of Log analysis tools:
1. 123 Log Analyzer – Commercial Analyzer
2. Absolute Log analyzer – Commercial product with a Windows program
3. Access Watch – Complete site monitor.
4. Advanced Log Analyzer – Commercial product available as a CGI or Windows
5. AlterWind Log Analyzer – Free “lite” version available
6. Analog – Calls itself the most popular log analyzer in the world
7. AWFFull – Intended to improve upon Webalizer
8. AWStats – Up-to-date free, open-source analyzer
9. BrowserCounter – A cute little Perl script that gives you a detailed list of your users’ browsers.
10. Deep log analyzer – Windows-based commercial product
Web log analysis software (also called a web log analyzer) is software that parses a log file from a web server (like Apache) and, based on the values contained in the log file, derives indicators about who visits the web server, when and how. There are two types of log analyzers:
Post parsing reporting – Most log analyzers fall into this category. The log files are parsed and all the reports are generated after that – usually on a scheduled basis. This can put great strain on a computer as the parsing and reporting are done in one go. This is usually done during a quiet period and the reports are often more than a couple of hours old.
Real-time, on-demand Reporting (sometimes called “Live reporting”) – SurfStats and WebTrends have products in this category. The log files are parsed to a database in the background. A report is only generated when requested. This type of analyzer is usually more suited for many users as it places less strain on a server.
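The second approach, parsing into a database in the background and building a report only when asked, can be sketched with Python's built-in sqlite3 module. The table layout, the function names and the sample hits are assumptions made for illustration and do not describe how SurfStats or WebTrends work.

```python
import sqlite3

def open_store():
    """Create an in-memory database holding one row per request."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE hits (ip TEXT, page TEXT, status INTEGER)")
    return db

def ingest(db, entries):
    """Background step: store already-parsed log entries."""
    db.executemany("INSERT INTO hits VALUES (?, ?, ?)", entries)
    db.commit()

def top_pages(db, limit=5):
    """On-demand step: the report is computed only when requested."""
    return db.execute(
        "SELECT page, COUNT(*) AS n FROM hits "
        "GROUP BY page ORDER BY n DESC, page LIMIT ?",
        (limit,),
    ).fetchall()

db = open_store()
ingest(db, [
    ("192.0.2.10", "/index.html", 200),
    ("192.0.2.11", "/index.html", 200),
    ("192.0.2.10", "/about.html", 200),
])
report = top_pages(db)
print(report)
```

Because the parsing cost is paid as entries arrive, each report is a single query over data that is already stored.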
Here is the list of indicators reported by most web log analyzers:
1. Number of visits and number of unique visitors
2. Visit duration and last visits
3. Authenticated users, and last authenticated visits
4. Days of week and rush hours
5. Domains/countries of host’s visitors
6. Hosts list
7. Most viewed, entry and exit pages
8. File types
9. Operating systems used
10. Browsers used
Some of the log analyzers (like ClickTracks, SurfStats, WebTrends, etc.) also report on who’s on the site, conversion tracking and page navigation.
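As an illustration of the first indicator in the list, the sketch below counts visits and unique visitors from a series of requests. It assumes that a visitor is identified by IP address and that a new visit starts after 30 minutes of inactivity; both are simplifying conventions chosen for the example, not rules taken from any particular analyzer.

```python
from datetime import datetime, timedelta

def count_visits(hits, timeout=timedelta(minutes=30)):
    """hits: iterable of (ip, datetime). Returns (visits, unique_visitors)."""
    last_seen = {}
    visits = 0
    for ip, when in sorted(hits, key=lambda hit: hit[1]):
        previous = last_seen.get(ip)
        # First request from this address, or a gap longer than the timeout.
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when
    return visits, len(last_seen)

# Invented requests from two documentation-only addresses.
hits = [
    ("192.0.2.10", datetime(2009, 8, 12, 10, 0)),
    ("192.0.2.11", datetime(2009, 8, 12, 10, 5)),
    ("192.0.2.10", datetime(2009, 8, 12, 10, 10)),
    ("192.0.2.10", datetime(2009, 8, 12, 11, 0)),
]
visits, unique_visitors = count_visits(hits)
print(visits, unique_visitors)
```

Here the first address accounts for two visits, because its last request comes 50 minutes after the previous one, and the second address accounts for one.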