Need to fetch the HTML from a URL and store it in Drupal's cache? Well then, you could do something like this:
function tf_crawl_url($url) {
  // Return the cached HTML for this URL if we have it; otherwise fetch the
  // page with cURL and cache the result. Returns FALSE if the fetch fails.
  // The key is prefixed to avoid collisions with other cache entries and
  // hashed to stay within the 255-character limit on cache IDs.
  $cache_key = 'tf_crawl_url:' . md5($url);
  $cache = cache_get($cache_key);
  if ($cache) {
    drupal_set_message(t('Grabbed @url from cache.', array('@url' => $url)));
    return $cache->data;
  }

  $curl_handle = curl_init();
  curl_setopt($curl_handle, CURLOPT_URL, $url);
  curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
  curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
  $html = curl_exec($curl_handle);
  curl_close($curl_handle);

  // curl_exec() returns FALSE on failure; don't store that in the cache.
  if ($html !== FALSE) {
    cache_set($cache_key, $html, 'cache');
    drupal_set_message(t('Curled @url and placed in cache.', array('@url' => $url)));
  }
  return $html;
}
Example usage:
$html = tf_crawl_url("http://www.drupal.org");
print $html;
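One refinement worth considering is to pull the cURL call into its own helper that reports failure explicitly, so the caller can decide not to cache a bad response. The sketch below is only an illustration: the name tf_fetch_html and its $timeout parameter are made up for this example and are not part of Drupal, and treating anything other than HTTP 200 as a failure is an assumption you may want to relax.

```php
<?php
// Hypothetical helper (not part of Drupal): fetch a URL with cURL and
// return the response body, or FALSE if the request failed or the server
// answered with a status other than 200.
function tf_fetch_html($url, $timeout = 5) {
  $curl_handle = curl_init();
  curl_setopt_array($curl_handle, array(
    CURLOPT_URL => $url,
    CURLOPT_CONNECTTIMEOUT => 2,      // seconds allowed to open the connection
    CURLOPT_TIMEOUT => $timeout,      // seconds allowed for the whole request
    CURLOPT_RETURNTRANSFER => TRUE,   // return the body instead of printing it
  ));
  $html = curl_exec($curl_handle);
  $status = curl_getinfo($curl_handle, CURLINFO_HTTP_CODE);
  curl_close($curl_handle);

  if ($html === FALSE || $status != 200) {
    return FALSE;
  }
  return $html;
}
```

Inside tf_crawl_url you could then call tf_fetch_html($url) in place of the inline cURL calls and only call cache_set() when the result is not FALSE.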