Archive for the ‘Building Sites’ Category

Affiliate Links With No IDs

Saturday, December 6th, 2008

I sent my previous article on avoiding link juice loss with affiliates to a friend, and he pointed me toward linkconnector.com, which offers something they call “Naked Links”. These are basically just links from the affiliate site directly back to the merchant site, with no subdomains or GET parameters. So I started thinking about how to do this, since it seems like an awesome idea. You end up with truly clean links to your merchant index page or even deeper, without any redirections at all.

One way of adding this functionality to your affiliate system (you being the merchant) would be to get your affiliates to register their sites in their affiliate accounts. You can get affiliates to prove that they own a site by having them upload a file with specific HTML content or add a CNAME record to their DNS (a la Google Apps for enterprise). Then, when a request comes in, your server-side script looks at the HTTP_REFERER (yes, spelled incorrectly, just like in the RFC) and sees which affiliate (if any) should get the credit. The script then simply stores the affiliate ID in a cookie and serves the content with a 200. Affiliate tracking with no IDs or funky links.

This method won’t work out of the box when the affiliate is doing forum or article marketing, unless the affiliate registers the specific URL and adds an HTML comment or something similar to verify it. Even that won’t work in all cases: on forum threads and blog comment pages, where several affiliates can post, the credit would go to whoever posted and registered the link first.
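The referrer check described above could look something like the following minimal sketch. The registry contents, the hostnames and the function name `affiliate_for_referer` are all made up for illustration; a real system would load verified sites from the affiliate database.

```python
from urllib.parse import urlsplit

# Hypothetical registry: verified affiliate hostnames -> affiliate IDs.
REGISTERED_SITES = {
    "www.bigsteves-footcare.example": "bigsteve",
    "reviews.example": "affiliate42",
}

def affiliate_for_referer(referer, registry=REGISTERED_SITES):
    """Return the affiliate ID credited for this Referer header, or None."""
    if not referer:
        return None  # header missing or stripped by the browser
    try:
        host = urlsplit(referer).hostname  # lowercased, port removed
    except ValueError:
        return None  # malformed URL
    if host is None:
        return None
    return registry.get(host)
```

If the function returns an ID, the script would store it in a cookie and serve the page as usual. Keep in mind that the Referer header is optional and easy to forge, so this is only as trustworthy as the browser sending it.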

If you wanted to do something like what LinkConnector is doing, the merchant site behaves as a middleman: for each incoming visit it sends a request to you, asking which affiliate ID (if any) goes with the referrer URL. Pretty simple, although fault tolerance should be high on the priority list. The merchant site should not cause a denial-of-service attack against itself by opening too many remote connections.
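One way to keep the merchant site from hammering the remote service is to cache answers and stop calling out for a while after repeated failures. The sketch below is only an illustration of that idea under my own assumptions: the class name `ReferrerLookup`, its parameters and the `fetch` callable (which would wrap the real remote request, with a short timeout) are all hypothetical.

```python
import time

class ReferrerLookup:
    """Cache remote affiliate lookups and back off when the remote side fails."""

    def __init__(self, fetch, ttl_seconds=300, max_failures=3,
                 cooldown_seconds=60, clock=time.monotonic):
        self._fetch = fetch            # callable: host -> affiliate ID or None
        self._ttl = ttl_seconds
        self._max_failures = max_failures
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._cache = {}               # host -> (affiliate_id, expires_at)
        self._failures = 0
        self._open_until = 0.0         # no remote calls before this time

    def affiliate_for(self, referer_host):
        now = self._clock()
        cached = self._cache.get(referer_host)
        if cached is not None and cached[1] > now:
            return cached[0]
        if now < self._open_until:
            return None                # backing off: serve the page untracked
        try:
            affiliate_id = self._fetch(referer_host)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._open_until = now + self._cooldown
                self._failures = 0
            return None
        self._failures = 0
        self._cache[referer_host] = (affiliate_id, now + self._ttl)
        return affiliate_id
```

The important design choice is that a failed or skipped lookup returns None, so the visitor still gets the page and only the tracking is lost.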

Avoiding Link Juice Loss With Affiliates

Thursday, December 4th, 2008

This post is geared more toward those building or working with larger systems that use affiliate links, but I’m astounded how often I see large sites throw away link juice that their affiliates are giving them. In all honesty, building a simple affiliate system isn’t all that hard, and basically amounts to dropping a cookie based on the parameters of a given link. The parameters are typically an affiliate ID with some optional tracking variables to allow your affiliates to track specific campaigns and traffic sources (PPC, organic, banner). The parameters can be embedded however you want, although typically they are in the query string of a GET request, such as:


http://www.somesitesellingstuff.com/affiliate?affid=bigsteve&campaign_id=first&media_id=ppc&product=footsoak

The contrived example above shows affiliate ‘bigsteve’ with a link from his ‘first’ ‘ppc’ campaign for a ‘footsoak’ product. Real links are typically uglier, but you get the picture. The parameters can also be encoded in the hostname itself. Clickbank encodes the product name and affiliate ID as subdomains, and adds a single 8-character tracking code, to form a “hoplink”:


http://bigsteve.footsoak.hop.clickbank.net/?tid=firstppc
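Either style of link can be pulled apart in a few lines. The following is a rough sketch, not Clickbank’s or anyone else’s actual code: the function names are made up, and the label order (affiliate first, then product) is simply taken from the example hoplink above.

```python
from urllib.parse import urlsplit, parse_qs

def parse_query_link(url):
    """Extract tracking parameters from a query-string style affiliate link."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per parameter; keep the first value of each.
    return {name: values[0] for name, values in query.items()}

def parse_hostname_link(url, suffix=".hop.clickbank.net"):
    """Extract parameters encoded as subdomains, as in the hoplink example."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(suffix):
        return None
    labels = host[: -len(suffix)].split(".")
    if len(labels) != 2:
        return None
    affiliate, product = labels  # order assumed from the example link
    tid = parse_qs(parts.query).get("tid", [None])[0]
    return {"affid": affiliate, "product": product, "tid": tid}
```

Whatever comes out of either function is what would go into the cookie.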

The affiliate script simply strips out the appropriate variables, drops a cookie containing the variables and then returns the appropriate content. Pretty straightforward, except for that last part: delivering the content. On most sites, they return the content with an HTTP 200, which isn’t a good idea – it’s returning the same or really similar content for several different URLs. Don’t forget, most affiliates drop their link directly on their sites, forums and articles, rather than bouncing through a redirect. This effectively dilutes a bunch of free link juice!

The solution is to change your affiliate script slightly so instead of setting the cookie and serving the content, it redirects (301 please) to a logical page. In fact, the whole operation can be done using Apache mod_rewrite! A lot of this depends on what you’re selling and your site structure, but here’s an example that should make good sense:

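RewriteEngine on
# The query string is not part of REQUEST_URI, so match it separately.
RewriteCond %{REQUEST_URI} ^/affiliate$
RewriteCond %{QUERY_STRING} ^productid=(\d+)&affid=(\d+)$
# %1 and %2 are the RewriteCond captures; the trailing ? drops the old query string.
RewriteRule ^ http://www.somerandomsite.com/product_detail/%1? [L,R=301,CO=affid:%2:.somerandomsite.com:20160]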

The above rewrite rule determines if it’s a request for a product page with an attached affiliate ID. If so, it will 301 the user to the appropriate product detail page and set the cookie for the affiliate ID (with a 14 day expiry, in minutes = 20160) at the same time. If you need to get more complicated, I’d recommend a script. The above rewrite will break if the parameters aren’t in the right order, and it does no error checking. You can technically redirect to wherever you want, but give the user continuity with an overlay. For example, you might want the link juice to go to your index page but still show the product page to the person who clicked the link. The simplest way of doing that is to set a cookie for the ‘real’ page to display, redirect (301!) to the index page, and then have a snippet of code in the index page that does a JavaScript overlay. Super easy to do with jQuery’s BlockUI plugin.
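If you do go the script route, the logic might look roughly like this sketch, which accepts the parameters in any order and validates them before redirecting. The function name and the choice to return a 404 for bad input are my own assumptions, and how the returned status and headers get sent depends on your framework.

```python
import re
from urllib.parse import urlsplit, parse_qs

COOKIE_MAX_AGE = 14 * 24 * 60 * 60  # 14 days, in seconds (Max-Age uses seconds)
NUMERIC = re.compile(r"[0-9]+")

def affiliate_redirect(url):
    """Return (status, headers) for a request to the affiliate URL."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)          # parameter order does not matter
    product = params.get("productid", [""])[0]
    affid = params.get("affid", [""])[0]
    if parts.path != "/affiliate" or not NUMERIC.fullmatch(product):
        return 404, {}
    headers = {
        "Location": f"http://www.somerandomsite.com/product_detail/{product}",
    }
    # Only set the cookie when the affiliate ID looks valid; redirect either way.
    if NUMERIC.fullmatch(affid):
        headers["Set-Cookie"] = (
            f"affid={affid}; Domain=.somerandomsite.com; "
            f"Max-Age={COOKIE_MAX_AGE}; Path=/"
        )
    return 301, headers
```

A link with a missing or junk affiliate ID still gets redirected to the product page, so the link juice is kept even when there is nobody to credit.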

If you’re an affiliate and you’re interested in keeping your link juice instead of passing it on, that is possible too. Set up a link on your own site that will always do a 302 redirect to your affiliate link. This can be done very easily using mod_rewrite (example below.) Now if you want to drop a link somewhere, simply use your new link. Not only does it hide your affiliate link from the wandering eye, but it’s shorter to type, and looks more “friendly”. Here’s an example that rewrites any links to http://www.myaffiliatesite.com/footsoak-review/ to the appropriate affiliate link. Just add a line to your .htaccess file and edit accordingly:

RewriteEngine On
RewriteCond %{REQUEST_URI} ^/footsoak-review/$
RewriteRule ^(.*)$ http://www.somerandomsite.com/affiliate?product=footsoak&affid=12345 [L,R=302]

Just remember, if you’re the merchant you want all the link juice pointed at the same page. Focus it with a 301 redirect. If you’re the affiliate, you want to stop the link juice at your own site (any site you own) by using a 302 redirect.

If you need help or have questions, please contact me.

Content Generation With N-Grams

Friday, November 7th, 2008

Although this is an outdated method, I thought I would post some content generation code I wrote a while ago. It’s outdated because Google possesses the n-gram data (more on those later) and the algorithms to detect content generated in this fashion. It’s a cool method for text generation, but I haven’t found much in the way of available source code for it. Sure, there is code to take some text and generate n-grams (there’s a Perl module for it!), but no sample code to run the n-grams in “reverse” to generate statistically-equivalent text.

The steps for generating statistically-equivalent text to some document are as follows:

  1. Generate a database of n-grams from source document(s) that are similar in nature to what you want to generate. If you want to generate content about male pattern baldness, use articles and content about male pattern baldness. You must record how often each n-gram appears in your source text.
  2. For each n-gram, create a new record that has the first n-1 characters as the key, and the last character and how often it occurred as the value. For example, if the 4-gram “then” occurred 15 times in your source text, your new database entry would have the key “the” with the value ( “n”, 15 ).
  3. Group the same keys together, from step 2. This new database is what you will use to generate the content. For example, step 1 gave you the following 4-grams: ( “then” => 10, “ther” => 20, “thes” => 30 ). Grouping the results from step 2 would give you ( “the” => ( ( “n”, 10 ), ( “r”, 20 ), ( “s”, 30 ) ) ).
  4. Now to generate text, simply start with a random key from your database at step 3, and use the occurrence values as weights to a random number generator to decide which character (n, r, or s in the example above) should be chosen. Then use the last n-1 characters of the text generated so far as the next key into your dictionary at step 3 and lather, rinse, repeat until you have enough text.
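The four steps above can be sketched in a few lines of Python. This is an illustration of the idea only, not the gendict.pl and gentext.pl scripts; the function names are made up, and it assumes the source text is at least n characters long.

```python
import random
from collections import Counter, defaultdict

def build_dictionary(text, n):
    """Steps 1-3: count n-grams, then group last characters by (n-1)-char key."""
    ngrams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    table = defaultdict(dict)
    for gram, count in ngrams.items():
        table[gram[:-1]][gram[-1]] = count
    return dict(table)

def generate(table, length, rng=random):
    """Step 4: walk the table, picking each next character by its weight."""
    out = rng.choice(sorted(table))      # random starting key
    key_len = len(out)
    while len(out) < length:
        followers = table.get(out[-key_len:])
        if not followers:
            # Dead end (key only seen at the very end of the source): restart.
            out += rng.choice(sorted(table))
            continue
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out += rng.choices(chars, weights=weights)[0]
    return out[:length]
```

Something like `generate(build_dictionary(open("source.txt").read(), 8), 1000)` would then produce 1,000 characters of text.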

Here is the source code to generate content. To generate 1,000 characters of text, put all your source content into a file (we’ll call it source.txt), and do the following:

$ gendict.pl 8 source.txt > s_dict.txt
$ gentext.pl s_dict.txt 1000

Obviously you can play around with the ‘n’ parameter (I chose 8 as a starting point.) If you go too small, you’ll end up generating garbage words, and if you go too big, you’ll generate large portions of your source text, but it will make more sense.

I used character-level n-grams in this code, but word-level n-grams would also work well, given a large enough body of source text. This is similar to the Dissociated Press algorithm except we do a pre-processing step and build an n-gram database first. This n-gram database can be used for other things, such as duplicate content detection, generated content detection and source author recognition to detect cheaters and people using essay writing services.

Modifying the code to stick the generated text into a MySQL database, and then generating an RSS feed from that, would allow you to use a technique like Affiliate Marketing through RSS Feeds easily. The key to this method is giving it enough source content, and playing around with the size of the n-grams.

Here is some sample text I generated using this document as source with n=8:

baldness, use articles and content detection, generated in this code, but
word-level n-gram appears in your source text, so your new database of
n-grams in thi. It's a cool method, I thought I would work well, for a
large source to generaly-equivalent text.