If you’re building a system that needs to track affiliate sales, you’ll need to integrate some form of analytics into your software. Your affiliates will want to see how many total visitors hit their link, total uniques, how they got there (search terms or referrer URL) and if they made a product purchase.
There are a few ways of doing web analytics: writing tracking code right into your application (the affiliate landing page), using JavaScript, or looking at web logs. I’ll be focusing on the JavaScript version. I won’t go into web log processing here. The logs contain interesting information, but they aren’t real-time enough for our use. They are still a powerful way to cross-check the other methods, or to gather information on spider visits (frequency, times of day, etc.)
Our tracking code will be fairly simple:
- Grab any information from the URL (GET parameters) and server data (user agent, remote IP, referer [sic] URL)
- Record the information in a database
- Continue processing the page
Decide on a Tracking Method
If you’re embedding the tracking code directly into your application, it’s a matter of adding some code to your controller and creating a model (and associated tables) to store the visitor data. The reporting backend will work exactly the same. The pros here are you don’t have to deal with JavaScript and/or cross-browser problems, and there may be a performance benefit since there are fewer HTTP requests being made to your server. The cons are that any time you want to change the tracking code, you need to change the controller, and you lose the ability to use the same tracking code on different sites, or sites that aren’t yours. Typically you set up one application (and domain) for doing analytics and reporting, and you have multiple websites. If you only have one website, and don’t mind running your analytics and reporting there, I’d recommend embedding the tracking code in your controller.
Using JavaScript to record visitor information is relatively simple. We need to write a controller to handle the requests to record visitor information, and a model to do the actual recording. The client side is a small JavaScript snippet, which will extract some variables and make a GET request to our controller. We won’t be using any AJAX here, since we need to deploy this code to multiple sites and have only one analytics site (i.e. we run the code on www.domain1.com but have our analytics requests hitting analytics.anotherdomain.com). That makes every tracking call a cross-domain request, and although we want to allow it in this case, the browser’s same-origin policy won’t let an AJAX call through by default! Requesting an image from another domain is allowed, so that is what we’ll do. Pros of this method are the ability to deploy to multiple sites and consolidate analytics/reporting to one server, and the ability to change tracking code without re-deploying your application. Cons are JavaScript browser incompatibility and increased complexity and load due to many (small) requests.
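To make the image-request idea concrete, here is a minimal sketch of how a tracking URL could be assembled and fired. The helper names buildPixelUrl and sendHit and the analytics.example.com host are placeholders of mine, not part of the code later in this post.

```javascript
// Hypothetical helper: turn a plain object into a tracking-pixel URL.
// Every key and value is encoded exactly once.
function buildPixelUrl(endpoint, params) {
  var pairs = [];
  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      pairs.push(encodeURIComponent(key) + '=' + encodeURIComponent(params[key]));
    }
  }
  return endpoint + '?' + pairs.join('&');
}

// Hypothetical helper: fire-and-forget. The browser fetches the image and
// the analytics server logs the request; the response is never displayed.
function sendHit(endpoint, params) {
  var url = buildPixelUrl(endpoint, params);
  if (typeof Image !== 'undefined') {
    new Image().src = url; // only available in browsers
  }
  return url;
}

var example = buildPixelUrl('http://analytics.example.com/click.gif', {
  ru: 'http://www.domain1.com/?a=1',
  u: 1
});
```

Because the request is just an image load, it works the same way no matter which domain the page itself is served from.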
My Analytics Solution
We’ll be writing a controller and model using the Kohana PHP framework, and the client-side JavaScript without a framework, since all it does is generate a request for a 1×1 pixel GIF. This is the same way Google Analytics and Mint do it. So, on to the code.
Web Analytics Model
Our model will store time, IP, request and referer [sic] URL information. Here is the MySQL table:
CREATE TABLE IF NOT EXISTS Hits (
    id            INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
    recorded_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, -- Time the record was created
    ip            INTEGER UNSIGNED NOT NULL,                    -- IPv4 address as an integer
    ua            VARCHAR( 200 ) NOT NULL DEFAULT '-',          -- User agent
    request       VARCHAR( 200 ) NOT NULL DEFAULT '/',          -- Requested URL
    referer       VARCHAR( 200 ) NOT NULL DEFAULT '-',          -- Referer
    is_unique     BOOLEAN NOT NULL DEFAULT TRUE                 -- First visit from this browser?
);
Hopefully there isn’t anything unclear there. I’ve created fields to record the time at which the hit happened, the IP (stored as an integer for compactness, which assumes IPv4 addresses), the user agent string, the original request, the referer URL and whether this is a unique visit or not (i.e. whether this person has already visited our site).
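If the integer form of an IP address is unfamiliar, the following sketch shows the arithmetic behind it. It is written in JavaScript purely for illustration, and the function names ipToInt and intToIp are my own.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
// Returns null for anything that is not a valid IPv4 address.
function ipToInt(ip) {
  var octets = ip.split('.');
  if (octets.length !== 4) return null;
  var value = 0;
  for (var i = 0; i < 4; i++) {
    if (!/^\d{1,3}$/.test(octets[i])) return null;
    var n = Number(octets[i]);
    if (n > 255) return null;
    // Multiply instead of using <<, which would overflow into negative
    // numbers for addresses above 127.255.255.255.
    value = value * 256 + n;
  }
  return value;
}

// Convert the integer back to dotted-quad form for display.
function intToIp(value) {
  return [
    Math.floor(value / 16777216) % 256,
    Math.floor(value / 65536) % 256,
    Math.floor(value / 256) % 256,
    value % 256
  ].join('.');
}

var asInt = ipToInt('192.168.1.10'); // 3232235786
var asText = intToIp(asInt);         // '192.168.1.10'
```

MySQL’s INET_ATON() and INET_NTOA() functions perform the same conversion on the database side, which is handy when writing reports.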
The model is equally simple:
class Hit_Model extends Model {

    function __construct() {
        parent::__construct();
    }

    /**
     * Records a single hit. Returns the ID of the row, or false if the
     * IP address could not be parsed.
     **/
    function record_click( $ip, $ua, $request, $referer, $is_unique=1 ) {
        $ret = false;
        $row = array();

        // Convert IP to integer
        $ip = $this->_ip_to_integer( $ip );

        // Either 0 or 1
        $is_unique = ( $is_unique > 0 ) ? 1 : 0;

        if( $ip > 0 ) {
            $row[ 'ip' ] = (int)$ip;
            $row[ 'ua' ] = $ua;
            $row[ 'request' ] = $request;
            $row[ 'referer' ] = $referer;
            $row[ 'is_unique' ] = $is_unique;
            $ret = $this->_create_if_not_exists( 'Hits', $row );
        }
        return $ret;
    }

    /**
     * Converts a dotted-quad IPv4 address to an integer. Returns 0 if the
     * address does not have four parts (an IPv6 address, for example).
     **/
    function _ip_to_integer( $ip ) {
        // explode() replaces split(), which was deprecated in PHP 5.3
        // and removed in PHP 7
        $octets = explode( '.', $ip );
        if( count( $octets ) != 4 ) {
            return 0;
        }
        return (int)( $octets[ 3 ] + $octets[ 2 ]*256 +
            $octets[ 1 ]*256*256 + $octets[ 0 ]*256*256*256 );
    }

    /**
     * Inserts the row if it's new and returns the ID, or just returns the
     * ID if it already exists. The table must have a column called 'id'
     * that is the INTEGER AUTO_INCREMENT PRIMARY KEY style.
     **/
    function _create_if_not_exists( $table, $row ) {
        // Try to insert - if the row already exists the insert is ignored
        // and we get an ID of zero. The Hits table has no unique key other
        // than id, so there every hit is stored as a new row.
        $columns = join( ',', array_keys( $row ) );
        $placeholders = join( ',', array_fill( 0, count( $row ), '?' ) );
        $q = $this->db->query( "INSERT IGNORE INTO $table ($columns) ".
            "VALUES ($placeholders)", array_values( $row ) );
        $ret = $q->insert_id();

        // Nothing was inserted, so look up the existing row instead
        if( $ret == 0 ) {
            $q = $this->db->getwhere( $table, $row );
            if( $q->count() > 0 ) {
                $result = $q->result_array( false );
                $ret = $result[ 0 ][ 'id' ];
            }
        }
        return $ret;
    }
}
The model class is pretty straightforward. Since Kohana doesn’t support “INSERT IGNORE”, I had to roll my own version. The model only handles inserts – actual reporting and such are left out.
Web Analytics Controller
The controller only does one thing – validate and record the data passed to it, then return a 1×1 pixel GIF:
class Hit_Controller extends Controller {

    // Raw bytes of a 1x1 pixel GIF
    private $gif_data = "\x47\x49\x46\x38\x39\x61\x01\x00\x01".
        "\x00\x80\xFF\x00\xFF\xFF\xFF\x00\x00".
        "\x00\x2C\x00\x00\x00\x00\x01\x00\x01".
        "\x00\x00\x02\x02\x44\x01\x00\x3B\x00";

    function __construct() {
        parent::__construct();
    }

    /**
     * Basically grab all the parameters, record in the database and return
     * some content.
     **/
    function index() {
        if( isset( $_GET[ 'ru' ] ) ) {
            $h_model = new Hit_Model();
            $ip = $this->input->server( 'REMOTE_ADDR' );
            $ua = $this->input->server( 'HTTP_USER_AGENT' );
            $h_model->record_click( $ip,
                $ua,
                $this->_get_elem( $_GET, 'ru' ),   // Request URL
                $this->_get_elem( $_GET, 'rf' ),   // Referer
                $this->_get_elem( $_GET, 'u' ) );  // Unique flag
        }

        // Return a 1x1 pixel gif
        header( 'Content-Type: image/gif' );
        echo( $this->gif_data );
    }

    /**
     * Returns $a[ $k ], or an empty string if the key is not set.
     **/
    function _get_elem( $a, $k ) {
        $ret = '';
        if( isset( $a[ $k ] ) ) {
            $ret = $a[ $k ];
        }
        return $ret;
    }
}
The only validation we do here is to check that the request URL was passed (the ru variable in the GET string). The referer URL arrives separately in rf, and the uniqueness flag in u.
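If you want to be stricter, the checks could look something like the sketch below. It is written in JavaScript only to show the logic (the real controller is PHP), and normalizeHit, clip and MAX_LEN are names I made up. The 200-character limit mirrors the VARCHAR( 200 ) columns in the table above.

```javascript
var MAX_LEN = 200; // assumed limit, matching the VARCHAR( 200 ) columns

// Return the value cut down to MAX_LEN, or the fallback when it is
// missing, empty or not a string.
function clip(value, fallback) {
  if (typeof value !== 'string' || value === '') return fallback;
  return value.slice(0, MAX_LEN);
}

// Returns a cleaned-up hit, or null when the request URL (ru) is missing.
function normalizeHit(query) {
  var request = clip(query.ru, '');
  if (request === '') return null;
  return {
    request: request,
    referer: clip(query.rf, '-'),       // '-' means "no referer"
    is_unique: query.u === '1' ? 1 : 0  // anything but '1' counts as a repeat
  };
}

var hit = normalizeHit({ ru: 'http://www.domain1.com/', u: '1' });
```

Rejecting or trimming oversized values before they reach the model keeps junk requests from filling the table with truncated rows.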
Client-side JavaScript
The JavaScript that acts as our view (although nothing is displayed) and executes in the user’s browser is quite simple. It marshals the required parameters, then munges them into a request for a GIF. In order to tell the difference between a unique visitor and a repeat pageview, we set a cookie upon first visit, which is then checked upon subsequent pageviews. Here’s our JavaScript:
function track() {
    var days = 7; // Number of days to keep cookie alive
    var ru = document.location.href;
    var rf = document.referrer;

    if( ru.length > 0 ) {
        // If there's a query string, grab it (without the leading '?') and
        // stick all the parameters on the end.
        var rest = document.location.search.substring( 1 );

        // Encode each value exactly once
        ru = urlencode( ru );
        rf = ( rf == '' ) ? '-' : urlencode( rf );

        var clicked_time = Math.round( new Date().getTime()/1000 );

        // Build data.
        var d = 'rf=' + rf + '&ru=' + ru;
        if( rest.length > 0 ) {
            d += '&' + rest;
        }
        d += '&ct=' + clicked_time;

        // If the cookie already exists, this isn't a unique hit
        var unique = 1;
        var old_cookie = readCookie( 'analytics_unique' );
        if( old_cookie != null && old_cookie != "" ) {
            unique = 0;
        }

        // Set cookie.
        setCookie( 'analytics_unique', 'visited', days, '/' ); // For uniqueness
        d += '&u=' + unique;

        // Now request the 1x1 pixel gif to record the click.
        (new Image()).src = 'http://your.analytics.site.com/click.gif?' + d;
    }
    return true;
}

function setCookie( name, value, days, path ) {
    var date = new Date();
    date.setTime( date.getTime() + ( days*24*60*60*1000 ) );
    var expires = "; expires=" + date.toUTCString();
    document.cookie = name + '=' + value + expires + '; path=' + path;
}

function readCookie( cookieName ) {
    if( cookieName == "" ) return "";
    // Compare whole cookie names so that e.g. 'my_analytics_unique' is not
    // mistaken for 'analytics_unique'
    var cookies = document.cookie.split( ';' );
    for( var i = 0; i < cookies.length; i++ ) {
        var c = cookies[ i ].replace( /^\s+/, '' );
        if( c.indexOf( cookieName + '=' ) == 0 ) {
            return decodeURIComponent( c.substring( cookieName.length + 1 ) );
        }
    }
    return "";
}

function deleteCookie( cookieName ) {
    if( readCookie( cookieName ) ) {
        // An expiry date in the past tells the browser to drop the cookie
        setCookie( cookieName, '', -1, '/' );
    }
}

function urlencode( str ) {
    // encodeURIComponent escapes every reserved character (+, /, @, &, ?)
    // and, unlike escape(), encodes non-ASCII text as UTF-8
    return encodeURIComponent( str );
}
There are a few convenience methods for reading/writing cookies and encoding the data so things don’t get screwed up when we request the image. The final piece is to add a rewrite rule so our controller gets hit with any requests to click.gif:
RewriteEngine on
RewriteBase /
RewriteRule ^click\.gif$ /hit [L]
The above maps any request for click.gif to our hit controller, which we know returns a 1×1 pixel gif. RewriteRule patterns only match the URL path, never the query string, so there is nothing to capture: mod_rewrite passes the original GET parameters through to /hit unchanged.
Extensions
You could extend the above to include more information about the user’s browser such as platform, Java-enabled, Flash version, JavaScript version or screen resolution. With some post-processing you’d be able to do geolocation on the user’s IP, and strip out keywords from search engines or PPC campaign variables. If you added a little more information to the uniqueness cookie, you’d be able to record bounce rate and time on page.
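As a rough illustration of the keyword idea, here is a sketch that pulls search terms out of a referrer URL. The function name extractSearchTerms and the SEARCH_PARAMS table are my own assumptions, the list of engines is far from complete, and many search engines no longer include the query in the referrer at all, so treat the result as best-effort.

```javascript
// Assumed mapping: search engine name -> query parameter holding the terms
var SEARCH_PARAMS = { google: 'q', bing: 'q', yahoo: 'p' };

// Returns an array of lower-cased search terms, or null if the referrer
// is not a recognised search engine URL or carries no query.
function extractSearchTerms(referrer) {
  var url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return null; // '-' or any other value that is not a URL
  }
  var labels = url.hostname.split('.');
  for (var engine in SEARCH_PARAMS) {
    // Match the engine name as a whole label of the host name
    if (labels.indexOf(engine) !== -1) {
      var terms = url.searchParams.get(SEARCH_PARAMS[engine]);
      return terms ? terms.toLowerCase().split(/\s+/).filter(Boolean) : null;
    }
  }
  return null;
}

var terms = extractSearchTerms('http://www.google.com/search?q=Affiliate+tracking');
```

Running this as a post-processing step over the stored referer column means the tracking request itself stays small and fast.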
I’ve completely glossed over how the data should be presented to the users (your affiliates.) Most affiliate systems show total clicks, uniques and sales grouped by date, time of day or campaign ID. Of course, the main benefit of writing your own engine from scratch is you can offer affiliates things that other programs don’t show them such as referrer URL, search terms, PPC campaign variables and geographic location.
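To give a flavour of the reporting side, here is a small sketch that groups hits into per-day totals of clicks and uniques. The function summarizeByDay is hypothetical, and it assumes each hit is an object shaped like a row of the Hits table, with recorded_time as a 'YYYY-MM-DD HH:MM:SS' string. Sales are not covered, since nothing in this post records them.

```javascript
// Group hits by calendar day and count total clicks and uniques.
function summarizeByDay(hits) {
  var days = {};
  hits.forEach(function (hit) {
    var day = hit.recorded_time.slice(0, 10); // keep the 'YYYY-MM-DD' part
    if (!days[day]) {
      days[day] = { clicks: 0, uniques: 0 };
    }
    days[day].clicks += 1;
    if (hit.is_unique) {
      days[day].uniques += 1;
    }
  });
  return days;
}

var report = summarizeByDay([
  { recorded_time: '2009-03-01 09:15:00', is_unique: 1 },
  { recorded_time: '2009-03-01 09:17:30', is_unique: 0 },
  { recorded_time: '2009-03-02 14:02:11', is_unique: 1 }
]);
```

In practice you would probably let the database do this with a GROUP BY DATE(recorded_time) query, but the shape of the result is the same.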


