Today’s Web hosting data centers deploy
multiple servers, running many operating systems. With the growing number of
online businesses, it's important to know how many customers reach your
websites. Beyond just the number of hits, it is imperative to understand
customer behavior and market trends, which calls for Web analytics. This article
looks at the top 10 analysis tools for website access, selected for their
popularity, functionality and ease of use. They are, essentially, must-have
tools in every network administrator’s toolbox.
Websites directly catering to a business
are always complex for the business owner, as well as the technology support
team. Owners want to know things like how many hits are generated over a period
of time, and also which product pages are being accessed more frequently than
others. They need this information to correlate traffic, directly or
indirectly, with the sales and profit figures. Owners are also
interested in customer trends. For example, they may want to find out whether Web
users are accessing a particular set of products just because those
products are discounted.
From the technology support standpoint,
administrators want to ensure the reliability and stability of their websites.
If the Web hits are increasing, they would want to know what impact this can
have on CPUs and memory usage, as well as on the network throughput. Similarly,
it would be important for them to know if and when the servers' hardware needs
upgrades, or when to add more Web servers into the pool. Another requirement
could be to troubleshoot website-related problems by looking at the HTTP status
code field (for instance, a 404 means that a requested page does not exist,
often because links on the website point to it, causing a bad user experience).
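The 404 check mentioned above can be sketched in a few lines of Python. This is a minimal illustration, assuming the logs use the Apache-style Common Log Format; the regular expression, the function name count_not_found and the sample lines are made up for this example and are not taken from any of the tools discussed here.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [time] "METHOD path protocol" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def count_not_found(lines):
    """Return a Counter mapping each requested path to its number of 404 responses."""
    missing = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            missing[match.group("path")] += 1
    return missing

# Hypothetical log lines, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2013:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2013:13:55:40 +0000] "GET /old-page.html HTTP/1.1" 404 512',
    '198.51.100.7 - - [10/Oct/2013:13:56:02 +0000] "GET /old-page.html HTTP/1.1" 404 512',
]

if __name__ == "__main__":
    # List the missing pages, most frequently requested first.
    for path, hits in count_not_found(SAMPLE).most_common():
        print(path, hits)
```

Pages that show up at the top of such a list are the first candidates for a redirect or a fix to the links pointing at them.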
Web servers create detailed and verbose Web
logs in the form of text files. All fields are important for analytics;
however, Table 1 lists fields that are crucial for analyzing website usage and
trends.
Web analyzers parse the details of those
text files to carry out an analysis. For example, by sorting based on the
source IP address, we can find out how many hits were generated by a particular
Web client; or by intelligently sorting through the Web page file names, we can
know which pages are hit the most. Based on the browser and OS details recorded
in the user-agent field, it is easy to know the number of Windows machines
running IE or Firefox, Mac users running Safari, etc. As you can see, all this
information is extremely useful for tuning the website to improve the users’
experience, thus increasing traffic and leading to better business.
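As a rough sketch of what such a parser does, the Python example below counts hits per client address, per page and per browser family. It assumes the Apache-style Combined Log Format; the names COMBINED, browser_family and summarize are hypothetical, and the browser detection is deliberately crude compared with what a real analyzer does.

```python
import re
from collections import Counter

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def browser_family(agent):
    """Very rough user-agent classification, for illustration only."""
    # Order matters: Chrome's user-agent string also contains "Safari".
    for marker, name in (("MSIE", "IE"), ("Firefox", "Firefox"),
                         ("Chrome", "Chrome"), ("Safari", "Safari")):
        if marker in agent:
            return name
    return "Other"

def summarize(lines):
    """Return three Counters: hits per client, per page and per browser family."""
    by_host, by_path, by_browser = Counter(), Counter(), Counter()
    for line in lines:
        m = COMBINED.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        by_host[m.group("host")] += 1
        by_path[m.group("path")] += 1
        by_browser[browser_family(m.group("agent"))] += 1
    return by_host, by_path, by_browser

# Hypothetical log lines, for illustration only.
FIREFOX = "Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0"
MSIE = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)"
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2013:13:55:36 +0000] "GET /products/a.html HTTP/1.1" 200 2326 "-" "%s"' % FIREFOX,
    '203.0.113.5 - - [10/Oct/2013:13:56:01 +0000] "GET /products/b.html HTTP/1.1" 200 1200 "-" "%s"' % FIREFOX,
    '198.51.100.7 - - [10/Oct/2013:13:57:12 +0000] "GET /products/a.html HTTP/1.1" 200 2326 "-" "%s"' % MSIE,
]

if __name__ == "__main__":
    hosts, paths, browsers = summarize(SAMPLE)
    print(hosts.most_common(1), paths.most_common(1), browsers.most_common(1))
```

Using a Counter keeps the tallying and the "most hit" ranking in one data structure; the tools below add report generation, charting and much more robust parsing on top of the same basic idea.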
Top 10 Web analyzer tools
Given below is our list of top 10 tools for
Web analysis for mid to large IT Web infrastructures. We selected these tools
based on their popularity, deployment base, and ease of installation,
configuration and use. The list contains a few tools that can perform
on-the-fly Web analysis, which can help in troubleshooting website code-related
problems.
AWStats:
Though this is one of the first-generation tools, it is still widely used. Written
in Perl, it works well on multiple platforms. A great feature of AWStats is
that it supports the log formats of virtually all popular Web servers, from
Microsoft IIS and Apache to O’Reilly Web servers. It is capable of
creating customizable views, including bar graphs and pie charts, thus offering
a clear insight into Web traffic statistics. AWStats is meant for small to
medium infrastructures, where log files are not too heavy to process. This tool
is managed and updated at http://awstats.sourceforge.net.
Webalizer:
Unlike various GUI-based tools, Webalizer is a purely command-line
utility, which makes it popular among Linux and UNIX administrators. It has its
own small configuration language, which can be used to decide how the tool
should read and parse the log files, and the fields in them. For example,
configuring its IgnoreSite option with an internal IP address range can
help get rid of internal Web traffic, and focus only on external hits. Thanks to
its extensive command-line switches, it can be used in a scheduled job to perform
the daily administrative tasks of looking into Web logs or automatically
creating useful reports. This tool can be downloaded from http://www.webalizer.org.
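The idea of excluding internal traffic before analysis can be illustrated in Python, independently of Webalizer. This sketch is not Webalizer's implementation or configuration syntax; the network ranges, the names is_internal and external_only, and the sample lines are assumptions chosen for the example.

```python
import ipaddress

# Assumed internal ranges; adjust to the addresses actually used on the network.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_internal(host):
    """Return True if host is an IP address inside one of the internal ranges."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a resolved hostname rather than an IP address
    return any(addr in net for net in INTERNAL_NETWORKS)

def external_only(lines):
    """Keep only log lines whose first field (the client) is not internal."""
    return [line for line in lines if not is_internal(line.split(" ", 1)[0])]

# Hypothetical log lines, for illustration only.
SAMPLE = [
    '192.168.1.20 - - [10/Oct/2013:09:00:01 +0000] "GET /admin.html HTTP/1.1" 200 900',
    '203.0.113.5 - - [10/Oct/2013:09:00:05 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.4.0.8 - - [10/Oct/2013:09:00:09 +0000] "GET /index.html HTTP/1.1" 200 2326',
]

if __name__ == "__main__":
    for line in external_only(SAMPLE):
        print(line)
```

A filter like this could run from a scheduled job ahead of any report generation, so that staff browsing the site does not inflate the hit counts.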
Piwik: When
many log files contain huge amounts of information, parsing them becomes
cumbersome and time-consuming. This calls for a faster log parsing tool, and Piwik
solves the problem. Besides the typical Web analysis, Piwik comes with a
set of plug-ins to enhance the reporting styles. For example, its GeoIP plug-in
can be utilized to map source IP addresses in the logs to a particular country,
state or city. While supporting multiple platforms, it has its own Python-based
command-line interface to get the most out of its reports. Today, many Web hosting
facilities use Piwik, and also provide its customizable Web user interface to
their customers as an offering. This tool can be found at http://piwik.org.
OpenWebAnalytics: Written in PHP and using MySQL as the back-end, this utility comes
in handy especially when administrators want to process the
logs of multiple websites together. OpenWebAnalytics is capable of processing
really large logs, and can optionally read them directly from a database
too. Unlike many other professional tools, this open source tool can
provide a click-stream report, in which each user's clicks on a Web page are
listed with their date and time. This helps code troubleshooters to know
exactly what the Web user did, so that they can repeat those steps to
replicate the problem. It can also create a heat-map type of report,
segregating the website into most-hit and least-hit pages, shown in the form of
color gradients for easy understanding. This tool is available at http://www.openwebanalytics.com.
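To give a feel for what a click-stream report contains, the Python sketch below rebuilds the ordered list of pages each client requested from plain access-log lines. It is an illustration only and not how OpenWebAnalytics works internally; it assumes Apache-style timestamps, treats each client IP address as one user (a simplification), and uses the hypothetical function name click_streams.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed layout: host ident user [time] "METHOD path ..."
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)'
)

def click_streams(lines):
    """Map each client address to the pages it requested, in time order."""
    visits = defaultdict(list)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        visits[m.group("host")].append((when, m.group("path")))
    # Sort each client's requests by timestamp and keep only the paths.
    return {host: [path for _, path in sorted(hits)]
            for host, hits in visits.items()}

# Hypothetical log lines, deliberately out of order, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2013:13:57:10 +0000] "GET /checkout.html HTTP/1.1" 500 310',
    '203.0.113.5 - - [10/Oct/2013:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2013:13:56:00 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2013:13:56:20 +0000] "GET /cart.html HTTP/1.1" 200 1800',
]

if __name__ == "__main__":
    for host, pages in click_streams(SAMPLE).items():
        print(host, " -> ".join(pages))
```

Reading the sequence for one client shows the steps that led up to a failing request, which is the information a troubleshooter needs to replay the problem.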