I love Arne Brachhold’s Google XML Sitemaps plugin for WordPress, and I use it on every WordPress install I do. On larger blogs, or blogs where you’re using automated content generators (e.g. posting content in an automated way through XML-RPC), the default build mode will slow down your blog because it rebuilds the entire XML sitemap from scratch every time you create or update a post or page.
The plugin supports a second build mode, which builds the sitemap via a GET request. For sites that have a lot of posts or do automated posting, this is a great option. You can schedule the XML sitemap updates to happen at specific times of the day with a simple script that uses an HTTP GET request to refresh them. This speeds up posting, especially for sites that use automatically generated content.
The simple PHP script below can be scheduled via cron to update your sitemaps and send you an email when it is done. Just update the $admin_email variable to where you want the email to go, and the $sitemap_link variable to whatever the XML Sitemaps plugin tells you when you change the build mode.
Note that you may need to change the link to include wp-admin, especially if you’re on WordPress MU, where the link the plugin gives doesn’t work (i.e. change http://myblog.com/?sm_command… to http://myblog.com/wp-admin/?sm_command…).
Here’s the script:
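<?php
// Where the status email should go.
$admin_email = 'info@myblog.com';
// Build URL shown by the XML Sitemaps plugin after you change the build mode.
$sitemap_link = 'http://myblog.com/?sm_command=build&sm_key=90210';

// Fetch the contents of a URI; file_get_contents() returns false on failure.
function getURIContents( $uri ) {
    return file_get_contents( $uri );
}

// Trigger the sitemap build. Returns '' on success, or an error description.
function generateSitemap( $link ) {
    $result = getURIContents( $link );
    // A failed or empty response must not be reported as a success.
    if ( $result === false || $result === '' ) {
        return 'No response from the sitemap build URL.';
    }
    // This script treats a response containing "DONE" as a successful build.
    if ( !preg_match( '/DONE/', $result ) ) {
        return $result;
    }
    return '';
}

$result = generateSitemap( $sitemap_link );
if ( $result != '' ) {
    mail( $admin_email, 'Sitemap Generator failed',
        "Sitemap generation failed: $result" );
} else {
    mail( $admin_email, 'Sitemap Generator Complete',
        "Completed sitemap generation." );
}
?>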
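To run the script on a schedule, a crontab entry along these lines should do it. This is only a sketch: the PHP binary location (/usr/bin/php) and the script path (/path/to/update-sitemap.php) are assumptions, so adjust both for your server. This example runs the script once a day at 3:00 am.

```crontab
# minute hour day-of-month month day-of-week  command
# Assumed paths: change the PHP binary and script location to match your server.
0 3 * * * /usr/bin/php /path/to/update-sitemap.php
```

Pick a time when your automated posting is quiet, so the rebuild picks up everything posted since the last run.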
If you get errors about file_get_contents not being able to load a remote URI, just replace the getURIContents function above with this version, which uses libcurl, and you should be OK:
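function getURIContents( $uri ) {
    $ch = curl_init();
    curl_setopt( $ch, CURLOPT_URL, $uri );
    // Return the response body instead of printing it.
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
    // Give up if the request takes longer than 10 seconds.
    curl_setopt( $ch, CURLOPT_TIMEOUT, 10 );
    curl_setopt( $ch, CURLOPT_USERAGENT, 'IE 6 - Mozilla/4.0' );
    $ret = curl_exec( $ch );
    // curl_exec() returns false on failure; normalize that to an empty string.
    if ( $ret === false || curl_errno( $ch ) ) {
        $ret = '';
    }
    // Always release the handle, whether or not the request succeeded.
    curl_close( $ch );
    return $ret;
}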
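If you’d rather not swap functions by hand, here is a rough sketch of a version that picks between the two approaches at runtime. The names pickTransport and getURIContentsAuto are made up for this example (they aren’t part of the plugin or of PHP), and it assumes that checking the allow_url_fopen setting and the presence of curl_init is enough to decide which method will work on your host.

```php
<?php
// Hypothetical helper: decide which transport to use.
// Takes plain booleans so the decision is easy to check on its own.
function pickTransport( $allowUrlFopen, $hasCurl ) {
    if ( $allowUrlFopen ) {
        return 'fopen';
    }
    if ( $hasCurl ) {
        return 'curl';
    }
    return 'none';
}

// Hypothetical replacement for getURIContents(): returns the response body,
// or '' if the request failed or no transport is available.
function getURIContentsAuto( $uri ) {
    $transport = pickTransport(
        // allow_url_fopen must be on for file_get_contents() to open URLs.
        filter_var( ini_get( 'allow_url_fopen' ), FILTER_VALIDATE_BOOLEAN ),
        function_exists( 'curl_init' )
    );

    if ( $transport === 'fopen' ) {
        $body = file_get_contents( $uri );
        return ( $body === false ) ? '' : $body;
    }

    if ( $transport === 'curl' ) {
        $ch = curl_init( $uri );
        curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
        curl_setopt( $ch, CURLOPT_TIMEOUT, 10 );
        $body = curl_exec( $ch );
        return ( $body === false ) ? '' : $body;
    }

    // Neither method is available on this install.
    return '';
}
```

To try it, call getURIContentsAuto( $link ) inside generateSitemap in place of getURIContents( $link ). Note that an empty string here means the request failed, so make sure your calling code treats an empty response as a failure rather than a success.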