Evaluation of Web Site Analysis Tools

Mark Bondurant

June 29, 2000

 

 

The heart of any e-commerce business is its customers and sales.  Just as in any brick-and-mortar retail business, tracking the needs and habits of customers is of paramount importance.  Tools that help us track how our systems are used, and what problems users encounter when accessing them, are vital to providing services that fit our users’ needs, which is the primary goal of any e-commerce business.

 

At the heart of any web site analysis tool are the metrics it uses to evaluate web site activity.  Does it collect the data that will allow us to answer the questions we need answered?  Most importantly, will it not only allow us to answer the questions we have right now, but also give us the flexibility to answer questions we don’t yet know how to ask?  We need to know when customers have difficulty using the services that we provide.

 

In this study I reviewed 21 different web site analysis tools.  All but seven were immediately eliminated because they didn’t allow us access to our own data.  A popular configuration involved installing ISAPI filters on our web servers or Java applets in our pages, which transmit data to an outside company to be stored in its database.  These companies would then return the information to us in the form of pre-packaged reports.  If we wanted our information in a form not provided, we would have to pay them to write the report for us.

 

Although these solutions tended to be the cheapest and the quickest to implement, I feel that the loss of control over our own data was an unacceptable compromise.  Since writing reports was the main source of revenue for these companies, they could not practically supply the flexibility to write ad-hoc reports or the ability to link in non-collected data such as user information.

 

The second most popular scenario was the use of a proprietary database.  This again denied us access to our own data except through pre-packaged reports.  Any reporting or query needs would again have to be contracted out to the company we purchased the software from.


 

Product       Accessible Database
------------  -------------------
WebTrends     Through Reports
Site Server   Yes
Funnel Web    Through Reports
Bella Coola   Remote
FastStats
HitList       Yes
Lateral Line  Remote
Coremetrics   Remote
EasyStat      Remote
Hitometer     Remote
NetBrief      Remote
LiveStats     Proprietary
SuperStats    Remote
NetStats      Remote
iLux          Yes
Urchin        Remote
NetTracker    Yes
Summary       Remote
SurfAid       Proprietary
WebManage     Yes
HitBox        Remote

 

Table 1.  This shows outsourcing businesses dropped from the list.


Although the remaining seven companies stored their data in a manner that we could control and use in the future, the next concern was the accessibility and flexibility of the data stored.  The group could be broken into two competing philosophies.  The first stored all logging data in a database from which we could write queries to produce reports.  The second only stored data in the database that was of immediate use, relying instead on quick scans of the log files to add data to the database when needed.

 

I chose to eliminate the log file scanners.  As our systems grow, so will our log files, and the performance of our reports will decline accordingly.  Unless we were very careful in how we stored our data in the database, we would also have problems with flexibility.  I think the tendency would be to create a set of tables for each report rather than sharing data between reports.  In the end, the file scanners would use more resources than the products that load all the data up front.  Again, as with the outsourcers, these products offer a quick solution up front that would cost us down the line.

 

Product      Supports Load      URL     Accessible            Database Compatible  Database Accessible
             Balanced Clusters  Lookup  Database              with SQL 7.0         Through ADO
-----------  -----------------  ------  --------------------  -------------------  -------------------
WebTrends    Yes                Yes     Only Through Reports  Yes                  Yes
Site Server  Yes                Yes     Yes                   Yes                  Yes
Funnel Web   Yes                Yes     Only Through Reports  No                   Yes
HitList      Yes                Yes     Yes                   Yes                  Yes
iLux         Yes                Yes     Yes                   Yes                  Yes
NetTracker   Yes                Yes     Yes                   Yes                  Yes
WebManage    Yes                Yes     Yes                   Yes                  Yes

 

Table 2.  Eliminating the log scanners.

 

The remaining candidates all support our work environment.  All of them can store their data in a SQL Server 7.0 database, an advantage in that it would allow us to link the analysis data with any of our other databases, such as subscriber profiles.  It also means we won’t have to purchase any new licenses or develop any new expertise.  They all support our load balancer configuration and are capable of performing URL lookups.  URL lookups are useful for identifying strangers who show up at our sites.  So each of the five remaining products produces accessible, flexible data.
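
As a rough illustration of what linking analysis data to subscriber profiles could look like, the sketch below joins a table of page hits to a table of subscribers and counts server errors per subscription plan.  Everything in it is an assumption made for the example: the table and column names are hypothetical, the rows are invented, and an in-memory SQLite database stands in for SQL Server 7.0 only so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema and sample rows; none of this comes from the products reviewed.
conn = sqlite3.connect(":memory:")  # stand-in for the shared SQL Server database
conn.executescript("""
    CREATE TABLE page_hits (user_id INTEGER, url TEXT, status INTEGER);
    CREATE TABLE subscribers (user_id INTEGER PRIMARY KEY, name TEXT, plan TEXT);
""")
conn.executemany("INSERT INTO page_hits VALUES (?, ?, ?)", [
    (1, "/reports/daily.asp", 200),
    (1, "/reports/weekly.asp", 500),
    (2, "/reports/daily.asp", 200),
    (2, "/login.asp", 200),
    (2, "/reports/weekly.asp", 500),
])
conn.executemany("INSERT INTO subscribers VALUES (?, ?, ?)", [
    (1, "Alice", "basic"),
    (2, "Bob", "premium"),
])


def errors_by_plan(connection):
    """Count server errors (status >= 500) per subscriber plan."""
    rows = connection.execute("""
        SELECT s.plan, COUNT(*) AS errors
        FROM page_hits AS h
        JOIN subscribers AS s ON s.user_id = h.user_id
        WHERE h.status >= 500
        GROUP BY s.plan
        ORDER BY s.plan
    """)
    return rows.fetchall()


result = errors_by_plan(conn)
print(result)
```

The point of the sketch is only that an ad-hoc question of this kind becomes a single query once both data sets live in a database we control.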

 

The next and most complicated metric chosen was the quality of the data.  Again the pack could be broken into competing philosophies.  Most of the group scanned the IIS log files for their data.  One product, HitList, chose to use sniffer technology to watch the data stream entering and exiting IIS.  WebManage added ISAPI filters to extend the information provided by the IIS logs.  Both log scanning and sniffer technologies have their own advantages and disadvantages.

 

What kind of information are we looking for?  We need to know who did what when.  We also need to know how long it took them to reach fulfillment and what kind of problems they encountered on the way.  The IIS logs contain detailed information about who did what when but don’t contain timing information.  All of these products make use of these logs and produce more or less the same who, what, when information.  Only two of the above products deliver timing information.
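
To make the "who did what when" data concrete, here is a minimal sketch of reading log lines in the W3C Extended Log File Format, which IIS can write.  The sample lines are invented for illustration, and the function name `parse_w3c_log` is hypothetical; the field names shown (`date`, `time`, `c-ip`, `cs-method`, `cs-uri-stem`, `sc-status`) are standard fields in that format, but the fields actually present depend on how logging is configured.

```python
def parse_w3c_log(lines):
    """Yield one dict per request from W3C Extended Log File Format lines."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines start with '#'; '#Fields:' names the columns that follow.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip lines that do not match the declared fields
        yield dict(zip(fields, values))


# Invented sample data, not taken from a real server.
sample = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2000-06-29 14:03:11 192.0.2.10 GET /default.asp 200",
    "2000-06-29 14:03:15 192.0.2.10 GET /missing.asp 404",
]

records = list(parse_w3c_log(sample))
for r in records:
    # who (client address), what (requested object), when (date and time)
    print(r["c-ip"], r["cs-uri-stem"], r["date"], r["time"], r["sc-status"])
```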

 

WebManage inserts an ISAPI filter between IIS and the TCP/IP ports.  It stores IIS’s response time to user requests.  HitList, being a packet sniffer, has the added advantage that it sits outside the server and can measure the server’s response time as a whole.  This is a much more realistic measure of our users’ experience because other activity within the server can delay IIS’s responses.  Both products log page errors; however, HitList also logs clients hitting the stop and refresh buttons on their browsers.  These are good signs of user frustration.  HitList can also log abnormal load times for objects such as large pictures, reports, and most of all Java applets.

 

Providing timing information can come at a price.  WebManage’s technique of active sampling will slow our web plant down, while HitList’s method of passive sampling is unobtrusive.  HitList, however, requires a separate server outside the web plant on which to run.  This can be an old PC, but it still requires extra overhead from support staff.

 

Product:          Site Server
Sampling Method:  ISAPI Filter/Service
Cost:             $2.6K/Box + $1.3K/Box for 5 extra site licenses
Pros:             Directly couples to IIS, user profiles linked to security
Cons:             No error or timing information, 24 hour latency

Product:          HitList
Sampling Method:  Dedicated Server, Passive Sampling
Cost:             $10K/$4K (really 3 box cluster would be $30K)
Pros:             Links to customer profiles, true server response time, passive sampling, wide range of reports, active support in setup, no latency
Cons:             Runs on separate server

Product:          iLux
Sampling Method:  ISAPI Filter/Service
Cost:             $60K
Pros:             Strong user profiling, pop-up queries
Cons:             No error or timing information

Product:          NetTracker
Sampling Method:  Service
Cost:             $8K (plus)
Pros:             (none listed)
Cons:             No error or timing information

Product:          WebManage
Sampling Method:  ISAPI Filter/Service
Cost:             $5K (plus)
Pros:             Cheaper, reports IIS response time, 30 minute latency
Cons:             Not server response time, active sampling

 

Table 3.  Special features and cost are displayed along with sampling method.

 

At this point I eliminated NetTracker as lacking any features that made it stand out.  WebManage could do everything it did and more, for less money.

 

Other products, such as iLux and Site Server, also provide features such as pop-up user query forms and built-in user profiles.  Site Server also links strong site security with user profiling.  This may be important if we wish to pass any hurdles the Department of Defense may place before us.

 

The next factor I considered was information latency.  IIS log scanners generally report on yesterday’s information.  When IIS closes a log, the log scanner scans the data into the database.  For error and timing information this is not acceptable.  We need to know about problems while they are occurring.  WebManage creates its own mini logs that are scanned into the database every 30 minutes.  That is good enough for a post-mortem, but not fast enough to alert us when things are going wrong.  HitList has the advantage of having a server all to itself and can keep the database current to within minutes.  It can also email out alerts or beep operators when there are problems.
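
The value of low latency can be sketched with a simple sliding-window check: if the share of failed requests in the last few minutes crosses a threshold, raise an alert.  This is not how HitList or any other product reviewed here works internally; the class name `ErrorRateMonitor`, the window length, the threshold, and the sample data are all assumptions chosen for illustration.

```python
from collections import deque


class ErrorRateMonitor:
    """Track recent requests and flag when the error rate crosses a threshold."""

    def __init__(self, window_seconds=300, threshold=0.2, min_requests=5):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.min_requests = min_requests
        self.events = deque()  # (timestamp, is_error) pairs, oldest first

    def record(self, timestamp, status):
        """Add one request, then drop requests that fell out of the window."""
        self.events.append((timestamp, status >= 500))
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()

    def should_alert(self):
        """Return True when enough requests were seen and too many failed."""
        total = len(self.events)
        if total < self.min_requests:
            return False
        errors = sum(1 for _, is_error in self.events if is_error)
        return errors / total >= self.threshold


# Invented traffic: four good requests, then four server errors.
monitor = ErrorRateMonitor(window_seconds=300, threshold=0.5, min_requests=4)
for t in range(4):
    monitor.record(t, 200)
quiet = monitor.should_alert()

for t in (10, 20, 30, 40):
    monitor.record(t, 500)
noisy = monitor.should_alert()

print(quiet, noisy)
```

A check like this is only useful if the data reaches it within minutes; fed from a log that is loaded once a day or every 30 minutes, the alert would arrive after the problem had already been felt by users.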

 

Next I considered service.  The more expensive products, HitList and iLux, come with a high level of personal service.  Service includes installation help, training, help desk, and consulting services.  HitList’s added abilities make its installation difficult, and I think installation help would be especially useful.  iLux’s strength is in its marketing experience, and the company wants to consult heavily in this area and in the area of user personalization.  It offers added capabilities such as marketing campaign response statistics reports and user preference trends.  Site Server comes with no support, but Microsoft Consulting Services can be purchased at a flat rate of $200/hour.  All of these, except for Microsoft Consulting, are factored into the prices mentioned above.

 

Lastly, what good is information if we can’t look at it?  All these products come with the same set of basic reports.  These include hits per object, site load by time of day, most popular and least popular pages, etc.  iLux by far comes with the most extensive reporting portfolio, most of it revolving around marketing.  All of these products have an ad-hoc reporting capability that allows you to produce your own custom reports without having to resort to coding.  All can produce reports in Excel or HTML format.  All have the capability to schedule reports and send them through email.
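
For reference, the basic reports named above are simple aggregations over the hit data.  The sketch below computes hits per object and site load by hour of day from a handful of invented records; the function names and the record layout (a timestamp string plus a URL) are assumptions for the example, not the format used by any of the products.

```python
from collections import Counter

# Invented sample hits: (timestamp "YYYY-MM-DD HH:MM:SS", requested object)
hits = [
    ("2000-06-29 09:15:02", "/default.asp"),
    ("2000-06-29 09:47:30", "/reports/daily.asp"),
    ("2000-06-29 14:03:11", "/default.asp"),
    ("2000-06-29 14:20:45", "/default.asp"),
    ("2000-06-29 14:59:59", "/help.asp"),
]


def hits_per_object(records):
    """Count how many times each object was requested."""
    return Counter(url for _, url in records)


def load_by_hour(records):
    """Count requests per hour of day, taken from the timestamp string."""
    return Counter(int(timestamp[11:13]) for timestamp, _ in records)


per_object = hits_per_object(hits)
per_hour = load_by_hour(hits)

print(per_object.most_common(1))  # the most popular page and its hit count
print(sorted(per_hour.items()))   # (hour, hits) pairs in hour order
```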

 

My preferences lean towards HitList.  It’s the only product that delivers reliable performance and problem data.  I like the alerts when things are going wrong.  I also like the information about object load times.