Wordpress on Hiphop / Nginx / Varnish

I was recently asked to investigate speeding up one of the WordPress sites of a fairly large government organization in Britain.  A large part of my investigation focused on the server stack because I felt that we could get more out of the hardware that had been provisioned for us.

I decided to set up a stack on my development machine to see how it would work and whether it was feasible.  I settled on nginx with HHVM (HipHop) and a Varnish frontend cache.  I realize that nginx would be just fine as both the cache and the server, but in this particular case it would not be possible to replace Apache with nginx on the live server.  I also wanted to experiment with ESI (Edge Side Includes), which looked better documented in Varnish than in nginx.

Installing HHVM is very easy:

 wget -O - http://dl.hhvm.com/conf/hhvm.gpg.key | sudo apt-key add -
 echo deb http://dl.hhvm.com/ubuntu saucy main | sudo tee /etc/apt/sources.list.d/hhvm.list
 sudo apt-get update
 sudo apt-get install hhvm

Installing nginx is also very easy:

 sudo apt-get update  
 sudo apt-get install nginx  

Instead of manually configuring nginx to use HHVM, I used the install script that ships with it (found at /usr/share/hhvm/install_fastcgi.sh).  The HHVM GitHub page documents the manual setup in case you don't want to use the packaged install script.  Note that the install script sets up both Apache and nginx.

There is a tool that will migrate your Apache config to nginx.  I used it to get a demonstration config file, which I then edited after reading the nginx configuration documentation.

My test nginx config file (/etc/nginx/sites-enabled/default) looks like the snippet below.

 # Read http://codex.wordpress.org/Nginx  
 #    http://wiki.nginx.org/Pitfalls  
 #    http://wiki.nginx.org/QuickStart  
 #    http://www.queryadmin.com/854/secure-wordpress-nginx/  
 #    http://tautt.com/best-nginx-configuration-for-security/  
 #  
 #    Generate your key with: openssl dhparam -out /etc/nginx/ssl/dhparam.pem 2048  
 #    Generate certificate: sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/nginx/ssl/nginx.key -out /etc/nginx/ssl/nginx.crt  
 server_tokens off;  
 add_header X-Frame-Options SAMEORIGIN;  
 add_header X-Content-Type-Options nosniff;  
 add_header X-XSS-Protection "1; mode=block";  
 add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://ssl.google-analytics.com https://assets.zendesk.com https://connect.facebook.net; img-src 'self' https://ssl.google-analytics.com https://s-static.ak.facebook.com https://assets.zendesk.com; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com https://assets.zendesk.com; font-src 'self' https://themes.googleusercontent.com; frame-src https://assets.zendesk.com https://www.facebook.com https://s-static.ak.facebook.com https://tautt.zendesk.com; object-src 'none'";  
 server {  
   server_name www.example.com;  
   listen 8080;  
   root /home/web/sites/default/html/;  
   index index.php;  
   access_log /home/web/sites/default/logs/access.log combined;  
   error_log /home/web/sites/default/logs/error.log warn;  
   include /home/web/sites/default/html/nginx.conf;  
   location / {  
     # include the "?$args" part so non-default permalinks don't break when using a query string
     try_files /wp-content/w3tc/pgcache/$cache_uri/_index.html $uri $uri/ /index.php?$args ;  
   }  
   location /wp-admin/ {  
     return 301 https://$server_name$request_uri;  
   }  
   location /mystery-login {  
     return 301 https://$server_name$request_uri;  
   }  
   # Prevent any potentially-executable files in the uploads directory from being executed  
   location ~* /uploads/ {  
     location ~ \.php {return 403;}  
   }  
   # Do not log favicon.ico requests  
   location = /favicon.ico {  
     log_not_found off;  
     access_log off;  
   }  
   # Do not log robots.txt requests  
   location = /robots.txt {  
     allow all;  
     log_not_found off;  
     access_log off;  
   }  
   location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {  
     expires max;  
     log_not_found off;  
   }  
   include global/w3tc.conf;  
   # Common deny or internal locations, to help prevent access to not-public areas  
   location ~* wp-admin/includes { deny all; }  
   location ~* wp-includes/theme-compat/ { deny all; }  
   location ~* wp-includes/js/tinymce/langs/.*\.php { deny all; }  
   location /wp-content/ { internal; }  
   location /wp-includes/ { internal; }  
   location ~* wp-config.php { deny all; }  
   # Rewrite rules for Wordpress SEO by Yoast  
   rewrite ^/sitemap_index\.xml$ /index.php?sitemap=1 last;  
   rewrite ^/([^/]+?)-sitemap([0-9]+)?\.xml$ /index.php?sitemap=$1&sitemap_n=$2;  
   # Add trailing slash to */wp-admin requests  
   rewrite /wp-admin$ $scheme://$host$uri/ permanent;  
   # Redirect 403 errors to 404 error to fool attackers  
   error_page 403 = 404;  
   # Deny all attempts to access hidden files such as .htaccess, .htpasswd, .DS_Store (Mac).  
   # Keep logging the requests to parse later (or to pass to firewall utilities such as fail2ban)  
   location ~ /\. {  
     deny all;  
   }  
   location ~ \.php$ {  
     fastcgi_split_path_info ^(.+?\.php)(/.*)$;  
     if (!-f $document_root$fastcgi_script_name) {  
       return 404;  
     }  
     fastcgi_keep_conn on;  
     fastcgi_pass  127.0.0.1:9000;  
     fastcgi_index index.php;  
     fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;  
     include fastcgi_params;  
   }  
 }  
 server {  
   server_name example.com;  
   listen 8080;  
   return 301 $scheme://www.example.com$request_uri;  
 }  
 server {  
   server_name www.example.com;  
   listen 443 ssl;  
   root /home/web/sites/default/html/;  
   index index.php;  
   access_log /home/web/sites/default/logs/access_ssl.log combined;  
   error_log /home/web/sites/default/logs/error_ssl.log warn;  
   # enable session resumption to improve https performance  
   # http://vincent.bernat.im/en/blog/2011-ssl-session-reuse-rfc5077.html  
   ssl_session_cache shared:SSL:50m;  
   ssl_session_timeout 5m;  
   # Diffie-Hellman parameter for DHE ciphersuites, recommended 2048 bits  
   ssl_dhparam /etc/nginx/ssl/dhparam.pem;  
   # enables server-side protection from BEAST attacks  
   # http://blog.ivanristic.com/2013/09/is-beast-still-a-threat.html  
   ssl_prefer_server_ciphers on;  
   # disable SSLv3 (enabled by default since nginx 0.8.19) since it's less secure than TLS http://en.wikipedia.org/wiki/Secure_Sockets_Layer#SSL_3.0
   ssl_protocols TLSv1 TLSv1.1 TLSv1.2;  
   # ciphers chosen for forward secrecy and compatibility  
   # http://blog.ivanristic.com/2013/08/configuring-apache-nginx-and-openssl-for-forward-secrecy.html  
   ssl_ciphers 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:ECDHE-RSA-RC4-SHA:ECDHE-ECDSA-RC4-SHA:RC4-SHA:HIGH:!aNULL:!eNULL:!EXPORT:!DES:!3DES:!MD5:!PSK';  
   # enable ocsp stapling (mechanism by which a site can convey certificate revocation information to visitors in a privacy-preserving, scalable manner)  
   # http://blog.mozilla.org/security/2013/07/29/ocsp-stapling-in-firefox/  
   resolver 8.8.8.8;  
   ssl_stapling off;    # can turn on if cert allows  
   ssl_trusted_certificate /etc/nginx/ssl/nginx.crt;  
   # config to enable HSTS(HTTP Strict Transport Security) https://developer.mozilla.org/en-US/docs/Security/HTTP_Strict_Transport_Security  
   # to avoid ssl stripping https://en.wikipedia.org/wiki/SSL_stripping#SSL_stripping  
   add_header Strict-Transport-Security "max-age=31536000; includeSubdomains;";  
   ssl_certificate /etc/nginx/ssl/nginx.crt;  
   ssl_certificate_key /etc/nginx/ssl/nginx.key;  
   location / {  
     # include the "?$args" part so non-default permalinks don't break when using a query string
     try_files /wp-content/w3tc/pgcache/$cache_uri/_index.html $uri $uri/ /index.php?$args ;  
   }  
   # Add trailing slash to */wp-admin requests  
   rewrite /wp-admin$ $scheme://$host$uri/ permanent;  
   include /home/web/sites/default/html/nginx.conf;  
   rewrite ^(/)?mystery-login/?$ /wp-login.php?$query_string break;  
   location ~ \.php$ {  
     fastcgi_split_path_info ^(.+?\.php)(/.*)$;  
     if (!-f $document_root$fastcgi_script_name) {  
       return 404;  
     }  
     fastcgi_keep_conn on;  
     fastcgi_pass  127.0.0.1:9000;  
     fastcgi_index index.php;  
     fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;  
     include fastcgi_params;  
   }  
 }  

Take note of the port_in_redirect directive: setting "port_in_redirect off;" in the server block stops nginx from adding its listen port to the redirects it generates.  It can help when Varnish or nginx includes 8080 in the URL when doing a redirect (this can happen if you access a URL without a trailing slash).  If you're still getting port 8080 after trying this, double check your WordPress site config to make sure the site URL does not include 8080.
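
A quick way to see whether a redirect leaks the backend port is to look at the Location header.  The helper below is only a sketch: the function name redirect_location is my own, and it assumes curl, tr and awk are installed.

```shell
# Print the Location header that a URL redirects to (prints nothing if there is none).
# The name redirect_location is hypothetical; it just wraps curl.
redirect_location() {
  # -s: silent, -I: fetch headers only; strip carriage returns before matching
  curl -sI "$1" | tr -d '\r' | awk 'tolower($1) == "location:" { print $2 }'
}
```

Calling it with a directory URL that lacks the trailing slash (for example redirect_location http://www.example.com/some-directory, where the path is a placeholder) prints the redirect target, so a stray :8080 is easy to spot.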

Varnish is included in the Ubuntu packages, but the Varnish site recommends using the packages supplied by varnish-cache.org instead.  They list the steps required to set it up and I'm not going to reproduce them here because they might change, so rather go to their site.

To configure Varnish on Ubuntu you need to edit /etc/default/varnish (for example with nano).  On RHEL this file is /etc/sysconfig/varnish.

There are a number of options provided.  The easiest way to get running is to pick alternative 2 by commenting out the other options.

Just make sure to change the port to 80 as below:

 DAEMON_OPTS="-a :80 \  
        -T localhost:6082 \  
        -f /etc/varnish/default.vcl \  
        -S /etc/varnish/secret \  
        -s malloc,256m"  

At this point Varnish will listen for incoming web requests on port 80 and all we need to do is wire it up to nginx.  To do so, edit /etc/varnish/default.vcl.

I stitched my Varnish configuration together from a number of sources and will run through it piece by piece here.

Firstly we tell Varnish where to find nginx and set up an access control list (ACL) that identifies the local machine; it is used further down to decide who may purge the cache.

 backend default {  
  .host = "localhost";  
  .port = "8080";  
 }  
 acl purge {  
  "127.0.0.1";  
  "localhost";  
 }  
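
The purge ACL matters once the PURGE handling in vcl_recv (further down) is in place: only addresses in the ACL may clear cached pages.  As a rough sketch, a purge could be sent from the server itself with curl.  The function name purge_url is my own invention, and it assumes Varnish is listening on port 80 of the local machine as configured above.

```shell
# Ask the local Varnish to drop cached copies of a URL and print the HTTP status it answers with.
# $1 = site host name, $2 = URL path. The name purge_url is hypothetical.
purge_url() {
  # -o /dev/null discards the body; -w prints only the status code
  curl -s -o /dev/null -w '%{http_code}\n' -X PURGE -H "Host: $1" "http://127.0.0.1$2"
}
```

With the VCL in this post, purge_url www.example.com /about/ should answer 200 from an address in the ACL and 405 from anywhere else.  Because the VCL passes the path to a ban as a regular expression, it also matches any longer URL that contains that path.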

After that we add to the various hooks that Varnish provides.  The code below will likely need to be modified for your site.  I got it from a variety of sources and there might even be some unnecessary duplication.

 sub vcl_recv {  
   # only using one backend  
   set req.backend = default;  
   # only cache example.com and optionally the www subdomain  
   if (req.http.host !~ "^(www\.)?example\.com") {
      return(pass);  
   }  
   # remove cookie from static content and always return cached version  
   if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {  
     unset req.http.cookie;  
     return(lookup);  
   }  
   # allow for purge option but only from the site we allow  
   if (req.request == "PURGE") {  
    if (!client.ip ~ purge) {  
     error 405 "Not allowed.";  
    }  
    ban("req.url ~ "+req.url+" && req.http.host == "+req.http.host);  
    error 200 "OK";  
   }  
   # set standard proxied ip header for getting original remote address  
   set req.http.X-Forwarded-For = client.ip;  
   # logged in users must always pass  
   if( req.url ~ "^/wp-(login|admin)" || req.http.Cookie ~ "wordpress_logged_in_" ){  
     return (pass);  
   }  
   # don't cache search results  
   if( req.url ~ "\?s=" ){  
   #  return (pass);  
   }  
   # always pass through posted requests and those with basic auth  
   if ( req.request == "POST" || req.http.Authorization ) {  
      return (pass);  
   }  
   # remove cookies from everything other than admin areas so we can cache content  
   if (!(req.url ~ "wp-(login|admin)")) {  
     unset req.http.cookie;  
   }  
   # else ok to fetch a cached page  
   return (lookup);  
 }  
 sub vcl_fetch {  
   # remove some headers we never want to see  
   unset beresp.http.Server;  
   unset beresp.http.X-Powered-By;  
   unset beresp.http.X-Pingback;  
   set beresp.do_esi = true; /* Do ESI processing */  
   set beresp.ttl = 24h;  
   # don't cache response to posted requests or those with basic auth  
   if ( req.request == "POST" || req.http.Authorization ) {  
      return (hit_for_pass);  
   }  
   # only cache status ok  
   if ( beresp.status != 200 ) {  
     return (hit_for_pass);  
   }  
   # remove cookies from static content  
   if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {  
    unset beresp.http.set-cookie;  
   }  
   # Drop any cookies Wordpress tries to send back to the client.  
   if (!(req.url ~ "wp-(login|admin)")) {  
     unset beresp.http.set-cookie;  
   }  
   # else ok to cache the response  
   return (deliver);  
 }  
 sub vcl_deliver {  
   if (obj.hits > 0) {  
     set resp.http.X-Cache = "HIT";  
   }  
   else {  
     set resp.http.X-Cache = "MISS";  
   }  
   unset resp.http.Via;  
   unset resp.http.X-Varnish;  
 }  
 sub vcl_hit {  
  if (req.request == "PURGE") {  
   purge;  
   error 200 "OK";  
  }  
 }  
 sub vcl_miss {  
  if (req.request == "PURGE") {  
   purge;  
   error 404 "Not cached";  
  }  
 }  
 sub vcl_hash {  
   hash_data( req.url );  
   if ( req.http.host ) {  
     hash_data( regsub( req.http.host, "^([^\.]+\.)+([a-z]+)$", "\1\2" ) );  
   } else {  
     hash_data( server.ip );  
   }  
   # ensure separate cache for mobile clients (WPTouch workaround)  
   if( req.http.User-Agent ~ "(iPod|iPhone|incognito|webmate|dream|CUPCAKE|WebOS|blackberry9\d\d\d)" ){  
     hash_data("touch");  
   }  
   return (hash);  
 }  
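
Since vcl_deliver adds an X-Cache header, it is easy to check from the command line whether a page came out of the cache.  This is just a sketch: the function name cache_status is hypothetical and it assumes curl, tr and awk are available.

```shell
# Print the X-Cache value (HIT or MISS) that vcl_deliver adds to a response.
# The name cache_status is hypothetical; it just wraps curl.
cache_status() {
  # -s: silent, -I: fetch headers only; strip carriage returns before matching
  curl -sI "$1" | tr -d '\r' | awk 'tolower($1) == "x-cache:" { print $2 }'
}
```

Running cache_status http://www.example.com/ twice in a row would typically print MISS and then HIT if the page is cacheable.  Pages for logged-in users and anything under wp-admin should keep reporting MISS because they are passed straight to the backend.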

For further reading I recommend:
