Varnish Cache Set Up and Usage

Varnish Cache Set Up and Usage With CentOS

Varnish is an HTTP accelerator that can be installed on any Linux distro. It serves your website's files from its cache when they are already stored there. When they are not, Apache, MySQL and PHP handle the request as usual and the result is delivered to the browser.

Although Varnish seems like the ultimate caching system, becoming familiar with its configuration, logging and commands can be quite a commitment, especially if you run multiple dynamic websites and applications like Joomla, WordPress, Magento and custom PHP / MySQL scripts. However, aside from getting around those details, it kicks butt. You could see that slow-loading Joomla or Drupal site load as if it were plain HTML / CSS static files on a dedicated server. Once you see how well it performs, you will probably never want to go back, unless you have some web applications that you absolutely don't want to cache.

Installing Varnish on CentOS

root# yum install varnish


Configuring Varnish

After you have installed Varnish, there are two main configuration files that you will edit: '/etc/varnish/default.vcl' and '/etc/sysconfig/varnish'. The former will take up most of your time, while the latter is rarely edited. The file '/etc/sysconfig/varnish' is where you choose whether Varnish caches to memory or to disk, set the cache size, set the port Varnish listens on, and set the Varnish admin port. Many of the default settings are usable as they are, but you will probably want to make adjustments. A lot of tips for Varnish can be found here https://www.varnish-cache.org/docs/trunk/tutorial/putting_varnish_on_port_80.html.
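As a rough illustration of the settings just described, '/etc/sysconfig/varnish' is a file of shell variables that the init script reads. The sketch below assumes variable names similar to the stock file shipped with the CentOS package, and every value in it (port 80, the 256M size, the storage path) is an example only; compare it against your own copy before changing anything.

```shell
# Sketch of the kind of settings found in /etc/sysconfig/varnish.
# Variable names and values are assumptions; check your own file.
VARNISH_VCL_CONF=/etc/varnish/default.vcl
VARNISH_LISTEN_PORT=80                   # port Varnish listens on
VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1
VARNISH_ADMIN_LISTEN_PORT=6082           # Varnish admin port
VARNISH_SECRET_FILE=/etc/varnish/secret
VARNISH_STORAGE_SIZE=256M                # cache size

# Cache to memory:
VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"
# Or cache to disk instead:
# VARNISH_STORAGE="file,/var/lib/varnish/varnish_storage.bin,${VARNISH_STORAGE_SIZE}"

# Options handed to varnishd (-a listen, -f VCL file, -T admin, -S secret, -s storage)
DAEMON_OPTS="-a :${VARNISH_LISTEN_PORT} \
             -f ${VARNISH_VCL_CONF} \
             -T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
             -S ${VARNISH_SECRET_FILE} \
             -s ${VARNISH_STORAGE}"

echo "$DAEMON_OPTS"
```

Switching between memory and disk caching is then a matter of which VARNISH_STORAGE line is active.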

The other file, 'default.vcl', can end up getting very long and specialized. When you use the default.vcl file, you will want to make entries for every IP address that is used on your server. You will also make very specific caching rules. Some examples are shown below.

The first major change is to move Apache off its default port 80 to another free port (8080, for example), so that Varnish can listen on port 80 and pass uncached requests on to Apache.
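A minimal sketch of that port change follows. The helper name set_apache_port and the port 8080 are made up for illustration, and the demo runs on a scratch file rather than the live Apache configuration. It only handles a plain "Listen 80" line; any NameVirtualHost or VirtualHost entries that mention port 80 would need the same treatment by hand.

```shell
#!/bin/bash

# Hypothetical helper: rewrite a plain "Listen 80" line to use a new port.
set_apache_port() {
    local conf="$1" port="$2" tmp
    tmp="$(mktemp)"
    # Keep the "Listen " prefix (\1) and swap only the port number
    sed "s/^\([[:space:]]*Listen[[:space:]][[:space:]]*\)80[[:space:]]*\$/\1${port}/" "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Demo on a scratch copy, not on the real httpd.conf
demo="$(mktemp)"
printf 'ServerName example.com\nListen 80\n' > "$demo"
set_apache_port "$demo" 8080
cat "$demo"
```

After editing the real file, Apache has to be restarted before the backend port in default.vcl will answer.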

IP Address Setup

The backends and ACLs will vary from server to server. An example is shown below. Another example can be found at https://gist.github.com/jeremyjbowers/1542949

backend default {
  .host = "ip#1";
  .port = "<apache new port number>";
}
backend default2 {
  .host = "ip#2";
  .port = "<apache new port number>";
}
backend default3 {
  .host = "ip#3";
  .port = "<apache new port number>";
}


acl afirst_ip_quad_block {
  "ip#1";
}
acl asecond_ip_quad_block {
  "ip#2";
}
acl athird_ip_quad_block {
  "ip#3";
}
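With many IP addresses, typing one backend stanza per address gets repetitive. As a sketch only, a small shell function can print the stanzas for you to paste into default.vcl. The function name gen_backends, the port 8080 and the 192.0.2.x addresses are all placeholders; the backend names follow the default, default2, default3 pattern used above.

```shell
#!/bin/bash

# Hypothetical helper: print one VCL backend stanza per IP address.
# Usage: gen_backends <apache_port> <ip> [<ip> ...]
gen_backends() {
    local port="$1"; shift
    local n=1 name ip
    for ip in "$@"; do
        # First backend is "default", the rest are default2, default3, ...
        if [ "$n" -eq 1 ]; then name="default"; else name="default${n}"; fi
        printf 'backend %s {\n  .host = "%s";\n  .port = "%s";\n}\n' "$name" "$ip" "$port"
        n=$((n + 1))
    done
}

# Example addresses from the documentation range, not real servers
gen_backends 8080 192.0.2.10 192.0.2.11 192.0.2.12
```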

Customizations

The example below lets PHP read the visitor's IP address from $_SERVER['HTTP_X_FORWARDED_FOR']. Without a proxy in front, PHP reads it from $_SERVER['REMOTE_ADDR'], but behind Varnish that variable holds the address of Varnish itself.

As a general rule, all conditions are set in the sub vcl_recv {} and sub vcl_fetch {} blocks. You will make most of your changes there. A good routine for changing the default.vcl file is to back it up first, then apply your changes with the '/etc/init.d/varnish reload' command or restart Varnish with '/etc/init.d/varnish restart'. Keep in mind that a restart also empties the cache.
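The back-up step can be scripted. The sketch below uses a made-up helper name, backup_vcl, and runs against a scratch file so nothing under /etc/varnish is touched. The commented lines at the end show how it might fit together on a real server; 'varnishd -C -f' compiles the VCL and exits, which makes it a convenient syntax check before a reload.

```shell
#!/bin/bash

# Hypothetical helper: copy a VCL file to a timestamped backup
# and print the path of the backup.
backup_vcl() {
    local vcl="$1"
    local backup="${vcl}.$(date +%Y%m%d%H%M%S).bak"
    cp -p "$vcl" "$backup" && echo "$backup"
}

# Demo on a scratch file
scratch="$(mktemp)"
echo 'backend default { .host = "127.0.0.1"; .port = "8080"; }' > "$scratch"
saved="$(backup_vcl "$scratch")"
echo "backup written to $saved"

# On a real server the sequence would be roughly:
#   backup_vcl /etc/varnish/default.vcl
#   varnishd -C -f /etc/varnish/default.vcl >/dev/null   # compile check only
#   /etc/init.d/varnish reload
```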

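sub vcl_recv {
  # Replace any client-supplied X-Forwarded-For header with the real client IP
  remove req.http.X-Forwarded-For;
  set req.http.X-Forwarded-For = client.ip;

  # Requests that arrive on the first IP go to the first backend
  if (server.ip ~ afirst_ip_quad_block) {
    set req.backend = default;
    #return(lookup);
  }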


Unset Cookies and Cache All Static Files With the Given Extensions

  if (req.http.host ~ "^(www\.)?example\.com" && req.url ~ "\.(jpg|jpeg|png|gif|css|js)$") {
    # Without a cookie the request can be served from the cache
    unset req.http.cookie;
    set req.backend = default2;
  }


Do Not Cache Certain Pages

  elsif (req.http.host ~ "^(www\.)?example\.com" &&
         (req.url ~ "^/member\.php" ||
          req.url ~ "^/signup\.php" ||
          req.url ~ "^/newsletter\.php")) {
    # Send these pages straight to the backend, bypassing the cache
    return (pass);
    #remove req.http.cookie;
  }
}

Testing Varnish

With Varnish, you can set custom headers that show whether a request was a cache hit or a miss. A good example can be found at https://www.varnish-cache.org/trac/wiki/VCLExampleHitMissHeader.

After you have added this block, you can inspect the response headers with Firebug in Firefox or with the Network tab of the Chrome developer tools. The new headers show whether the page was a hit or a miss and how many times it has been served from the cache.
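The same check can be made from the command line. This sketch assumes the hit/miss header is named X-Cache, as in the linked wiki example; if you chose another name, adjust the comparison. The function name cache_status is made up, and the demo feeds it a canned response instead of a live request.

```shell
#!/bin/bash

# Hypothetical helper: read HTTP response headers on stdin and print
# the value of the X-Cache header, or "unknown" if it is absent.
cache_status() {
    awk -F': *' '
        tolower($1) == "x-cache" { gsub(/\r/, "", $2); print $2; found = 1 }
        END { if (!found) print "unknown" }'
}

# Demo with canned headers
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nX-Cache: HIT\r\n\r\n' | cache_status
```

Against a live site it could be fed from curl, for example: curl -sI http://example.com/ | cache_status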

Rebooting Varnish Script

The script below can be set as a cron job that checks at your desired interval whether Varnish is running and starts it again if it is not.

#!/bin/bash

# Check if varnishd is running; start it if it is not
if pgrep -x varnishd >/dev/null 2>&1
then
    echo "Running"
else
    #echo "Stopped"
    /etc/init.d/varnish start
fi


Monitoring Varnish

You can check the activity of Varnish using the 'varnishstat', 'varnishncsa' and 'varnishlog' tools. They show you visitor logs and how well Varnish is caching. You can also have 'varnishncsa' write its output to your own custom log file. The command for that is varnishncsa -a -w /var/log/varnish/access.log. You can check the logrotate settings in '/etc/logrotate.d/varnish' to make sure the log is rotated.

If you want to make sure Varnish stays up and running, you can use a service like Monit or Munin to monitor its process and make sure it is brought back up if it somehow stops running.
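To turn the raw counters into a single figure, the cache_hit and cache_miss counters from 'varnishstat -1' can be combined into a hit ratio. The sketch below is an illustration only: the function name hit_ratio is made up, the numbers in the demo are invented, and the pattern accepts the counter names both with and without a "MAIN." prefix since the naming differs between Varnish versions.

```shell
#!/bin/bash

# Hypothetical helper: read "varnishstat -1" style output on stdin
# and print the cache hit ratio as a percentage.
hit_ratio() {
    awk '
        $1 ~ /(^|\.)cache_hit$/  { hit = $2 }
        $1 ~ /(^|\.)cache_miss$/ { miss = $2 }
        END {
            total = hit + miss
            if (total == 0) { print "n/a"; exit }
            printf "%.1f%%\n", 100 * hit / total
        }'
}

# Demo with invented counter values
hit_ratio <<'EOF'
cache_hit                 900         1.00 Cache hits
cache_hitpass               5         0.00 Cache hits for pass
cache_miss                100         0.10 Cache misses
EOF
```

On a server the input would come from the real tool: varnishstat -1 | hit_ratio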

Working With Varnish

If you have many websites and domains, you will often edit files and want the changes to show up right away. To do this with Varnish, you will need to flush the cache for a specific URL or site while you are conducting web maintenance.

The first line below clears the entire cache for example.com. The second line removes only the '.jpg' files from the cache.

root# varnishadm "ban req.http.host ~ example.com"
root# varnishadm "ban req.http.host ~ example.com && req.url ~ .jpg"
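If you flush caches often, a small wrapper saves retyping the expression. This is a sketch under assumptions: the function name ban_cmd is made up, and it only builds the same ban string used in the two commands above without running anything, so you can look at the result before handing it to varnishadm.

```shell
#!/bin/bash

# Hypothetical helper: build a ban command string for a host and an
# optional URL pattern. It prints the string; it does not run varnishadm.
ban_cmd() {
    local host="$1" url_regex="$2"
    local expr="req.http.host ~ ${host}"
    if [ -n "$url_regex" ]; then
        expr="${expr} && req.url ~ ${url_regex}"
    fi
    printf 'ban %s\n' "$expr"
}

ban_cmd example.com
ban_cmd example.com .jpg
```

To apply a ban, pass the output on, for example: varnishadm "$(ban_cmd example.com .jpg)"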