25 January 2010

PHP email class

I was looking for a PHP SMTP email class and stumbled across this PHP email class by Manuel Lemos. PHPclasses.org is rapidly becoming my first port of call when I'm looking for a class to fill some general function in my code. Of all the scripting sites out there, and there are hundreds, this is possibly one of the few that I would recommend to people.

One of the reasons that I liked the script was the debugging feature that showed the server responses as they happened. I've been busy setting up firewalls, SMTP relays, and IIS (hate hate hate) so it was useful to be able to get debug info from my client software.

In any case the class worked flawlessly the first time, it's free to use, and my client is happy.

19 January 2010

Squid proxy server

My job title is "PHP developer" but because I'm the only person in the office familiar with Linux I get roped into administering the LAMP stack and other systems roles.

Yesterday I was asked to investigate methods of monitoring individual bandwidth use. I've installed Squid proxy server so that all traffic is getting routed through my pet Linux box that I keep spare just for such occasions.

Right, now I'm asked to install software to filter out sites not related to work (like OkCupid... uhoh). So I find a program that slots into Squid and install it.

Bang! Facebook and all other sites mysteriously get replaced with a kitten (from icanhazcheeseburger.com) chewing network cables captioned as "Ohnoes!!!1 kitty is eating my megahurtz!". Putting me in charge results in silliness. I hope they do it less often.
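In case anyone wants the same setup: the kitten page is nothing fancier than a Squid ACL plus a deny_info override. A rough sketch (the domains and the intranet URL here are examples, not my actual config):

```
# /etc/squid/squid.conf (excerpt) - example domains and paths only
acl blocked_sites dstdomain .facebook.com .okcupid.com

# Serve a custom page instead of Squid's default access-denied error
deny_info http://intranet.example/kitty.html blocked_sites

http_access deny blocked_sites
http_access allow localnet
```

Reload Squid after editing and anyone behind the proxy who hits a blocked domain gets the kitten instead.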

Anyway, despite my best efforts to make the access denied picture cute and adorable my users still hate me. It's tough being a webslave.

A little Linux server does an admirable job as a proxy server, local intranet server, firewall, and virus scanner. My pet Linux box was resurrected from our office "graveyard" because it was old, dilapidated, and useless as a development server. Our developers use Microsoft products, which require modern computers. I installed Ubuntu onto this box that Microsoft says is useless, and now it's running a production website and a local intranet, handling our proxy, caching our DNS, scanning traffic with Clam, and filtering sites, and our users have commented that our internet is faster than it used to be. I didn't have to pay for any of the software or buy new hardware. Crazy monkey man would be sorely disappointed.

PS: If icanhazcheeseburger.com asks I'll happily remove their image. I hope that linking and crediting is good enough.

13 January 2010

Prevent XSS attacks

XSS (Cross Site Scripting) is a surprisingly easy weakness to exploit on unprepared websites. At its highest level, an XSS attack involves injecting code into a webpage and having a user execute it on the grounds that they trust the website you have hijacked.

There are a great many vectors for an XSS attack to come through, but for the most part applying a few simple safety precautions will greatly improve your site security.

XSS attacks can be split into one of three categories:

  1. Stored XSS attacks - are those where the attacker stores malicious code in your database, forum, comment section or elsewhere on your site. The victim receives the code when they request that particular content from your website.

  2. Reflected XSS attacks - are those where the malicious code is reflected off the server and sent to the victim as part of search results, emails, error messages, etc. This can be set up by tricking the victim into clicking a specially crafted link (or filling in a malicious form) that generates the appropriate response from the insecure server.

  3. DOM XSS attacks - are those where the payload is delivered on the client side by manipulating the script once it has been sent by the server.

It is essential for web developers to sanitize their data so that it does not open the door to these simple attacks.

Cardinal rule: Don't insert data you can't trust

Don't put received data into scripts, tags, HTML comments, attribute names, or tag names. Especially don't run Javascript that you receive.

If your language has built-in functions to sanitize strings (e.g. PHP has filter_var) then use them. Be cautious of functions that are designed to work with HTML entities; some XSS attacks can work around them.

This applies to HTML elements, Javascript, CSS, attributes, names, and everywhere else that you use received data.
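As a concrete, deliberately minimal PHP sketch of the above (the helper names are my own; a real application should encode for each specific context it inserts data into):

```php
<?php
// Encode untrusted data before inserting it into an HTML body or attribute.
// ENT_QUOTES also escapes single quotes, which matters inside attributes.
function e($untrusted)
{
    return htmlspecialchars($untrusted, ENT_QUOTES, 'UTF-8');
}

// Where a strict format exists, validate instead of sanitizing.
function clean_email($untrusted)
{
    $email = filter_var($untrusted, FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}

// A classic stored/reflected payload is neutralized on output:
echo e('<script>alert("xss")</script>');
// -> &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

The point is to encode at output time, every time, rather than trusting that the data was cleaned when it was stored.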

Useful XSS prevention resources

Acunetix provides a free version of their vulnerability scanner that does a good job of detecting XSS vulnerabilities.

The Open Web Application Security Project (OWASP) has an extensive wiki on website security.

These guys provide us with users who click on links they shouldn't. Without them we wouldn't have a job.

Trevor Sewell is a UK developer who has kindly provided a PHP XSS class that sanitizes POST data.

11 January 2010

Google Guideline - How spiders view your site

In its "Guidelines for Webmasters" document Google notes that "search engine spiders see your site much as Lynx would".

A web spider is a program that searches through the internet for content (see here for more definitions.)

Lynx is a web browser that was used in the good old days of the internet, before we had fancy things like mice, graphics, or sliced bread. Put very simply, Lynx is a bare-bones web browser that supports a minimal set of features. You can download a free copy from this website. Lynx has uses other than SEO (such as pinging a webpage in a crontab), but for SEO it is mainly used for usability and visibility testing.

If you don't feel like installing new software there are a number of online spider emulators that will try to show you how a spider views your website. One that I found is available here.

Now that we have the means to see how Google spiders view our website we can have a look at what implications the guideline has for our site.

Firstly we need to realize that search spiders "crawl" through your site by following links. Obviously if a spider is unable to read a link then it won't find the page the link points to.
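You can play spider yourself. Here is a hedged PHP sketch (not how Googlebot actually works, just the principle) that pulls out the links a crawler could follow from a page:

```php
<?php
// Extract the links a spider could follow from a chunk of HTML.
// DOMDocument tolerates sloppy markup, much like real crawlers do.
function extract_links($html)
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings on imperfect HTML
    $links = array();
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return $links;
}

$html = '<p>Welcome</p><a href="/about.html">About</a>'
      . '<a href="/contact.html">Contact</a>';
print_r(extract_links($html));
```

Notice that a link generated only by Javascript would simply not appear in this output, which is exactly the "invisible link" problem described below.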

Certain technologies can make links invisible to spiders. Google can now index text from Flash files and supports common Javascript methods. They don't currently support Microsoft Silverlight, so you should avoid using it (it's probably a good idea to steer away from Microsoft proprietary formats anyway, no matter how loudly crazy monkey man screams "developers!" and sweats in his blue shirt).

Google maintains an easy-to-read list of technologies that it supports. You can find it online here.

View your site in a spider emulator or Lynx and make sure that you can navigate through the links. If you can't then there is a good chance that Google can't either.

One way to nudge spiders along is to provide a sitemap. This also helps your human readers. Remember that Google does not like you to have more than 100 links on a page so if you have a large site try to identify key pages rather than providing an exhaustive list.

Some people argue that if you need a sitemap then your navigation system is flawed. Think about it - if your user can't get to content quickly through your navigation system then how good is your site at providing meaningful content? Personally I like to balance this out and provide sitemaps as an additional "bonus" while still ensuring that all my content is within 2 clicks of the landing page.

07 January 2010

Getting Google to notice you

Keep it simple

I've read so many articles by SEO experts outlining how to get a high position on search engines. After ranking a website at number 1, and keeping it there for well over a year now, I can offer some solid advice.

The truth is that getting a good, sustainable ranking is a relatively simple affair. However, SEO experts want to make it sound as complicated as possible. How else will they be able to charge you their consultancy fee?

Before you continue reading my blog read this link:
Google Guidelines for Webmasters.

If you adhere to those guidelines you will get ranked.

Stating the bleeding obvious

Question: How does Google make money?
Answer: Primarily by selling advertising.

Question: How does Google make money from advertising?
Answer: By getting lots of people to look at it and click through to their clients.

Question: How does Google get lots of people to look at their adverts?
Answer: By having a good service that people want to use (the search engine).

Question: How does Google get people to click to visit their clients?
Answer: By showing adverts that are relevant to the user.

Helping Google for fun and profit

Too many SEOs spend time trying to manipulate Google. You really should be spending your time and effort helping Google make money. If you follow my logic above you will create sites that are useful and valuable to users. This makes it easier for Google to provide its users with meaningful search results, which in turn makes it easier for Google to make money.

The mantra "content is king" is a direct consequence of following this logic. If you provide lots of valuable content then Google will see that your site is useful to users and will promote it in its search rankings.

The more money Google can make from indexing your site, the higher it will rank. Google is not a charity, that's why the owners have private jets and an airport to park them on.

If you make their life easy they will love you. If you try to trick them they will punish you. I always assume that the people at Google are cleverer than me. Matt Cutts's academic record alone is enough to convince me not to try to be clever.

Moving onwards with SEO

I'll spend some time examining the various Google guidelines over some postings in the future. To a newcomer they may seem pretty daunting.

I'll also cover some topics such as internal linking, sales funnels, and calls to action.

These are not necessarily related to SEO but are directly related to the performance of your website. That said, I have found that a well-structured internal linking campaign improved my Google ranking for certain "champion pages", which I then also pointed my pay per click (PPC) campaigns at. This dropped my cost per click and improved my organic results.

04 January 2010

5 practical ways to reduce spam

Internet "spam" is the term coined for unsolicited emails that companies and individuals send out en masse. Because it is very cheap to send an email, spammers will send out millions of messages; even if just one or two people purchase their product the spammer still makes a profit. Spam ranges from the merely annoying to outright scams, and some of it is dangerous. Consider as an example offers to purchase medication without seeing a doctor or obtaining a prescription. Spammers send these messages out indiscriminately, including to children.

Spam messages eat up a large amount of internet bandwidth which leads to service degradation for legitimate users. It has become such a problem worldwide that many countries are adopting legislation to combat it.

Spam Statistics

Currently the world's worst offending country is the United States, followed by China and the Russian Federation. America has adopted the CAN-SPAM Act, which is aimed at reducing spam, but it has yet to have a significant effect on the amount of spam coming out of America.

There are many companies that deal with spam. In order to obtain a quick indication of spam statistics I chose one at random and viewed their stats. They recorded 2,509,170 spam complaints in the 24 hours prior to writing this example. Now consider that there are many other companies dealing with spam and not all users will actively complain about spam. This means that the global statistic must be a great deal higher than this.

Reducing Spam

  1. Ask your ISP if they use a spam filter on their servers and consider swapping to another company if they don't. One of the most popular products is "SpamAssassin". It's free and easy to configure. Having a company filter spam on their servers will reduce your bandwidth costs (since you won't be downloading spam messages) and improve your security by blocking malicious emails before you download them. Of course this also means that you will save the time you used to spend manually downloading and deleting the messages.

  2. Avoid publishing your email address on your website or Internet forums. Spammers use web "spiders" to search the internet for email addresses to send spam to. If you must publish your email address on your website, make an effort to obfuscate it with Javascript or some other method. One approach is to spell the address out: for example, info@drugalarm.co.za becomes info @ drugalarm [dot] co [dot] za. Obviously it isn't hard for a spammer to program a search spider to scan for these common patterns, so this method is far from foolproof.

  3. Use the services of websites that offer free temporary email addresses when you're registering on a forum or service that you either don't trust or don't intend to use more than once. Mailinator.com is a good example of this sort of service. Remember that because you're using a temporary email address you won't be able to use it to retrieve passwords, track parcels, etc.

  4. If your spam problem is severe you could consider setting up a challenge response system. This runs on your mail server and sends a reply to anybody who sends you an email. If the person replies to the challenge response, their email address gets added to the list of people who may email you (and they won't have to keep confirming they're not a spammer). Most spammers will not be able to receive or reply to these challenges. One negative side effect of using this system is "backscatter": spammers routinely fake the "from" field of their emails, so your challenge response will target whoever the spammer is pretending to be.

  5. Augment your server side protection by using programs that run on your computer. There are several effective and free solutions that will reduce your spam load. SpamPal is one such program. It is fairly simple to setup and is quite effective in reducing spam. A drawback of using client-side spam filtering is that you will still be paying for the bandwidth required to download your spam messages.
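The obfuscation idea in point 2 can also be automated. Here's a hedged PHP sketch that entity-encodes every character of an address: browsers render it normally, but it defeats spiders that simply grep pages for plain name@domain patterns (a determined spammer can still decode it):

```php
<?php
// Encode every character of an email address as an HTML numeric entity.
// Browsers render the result as the normal address; naive harvesting
// spiders that regex for foo@bar patterns will miss it.
function obfuscate_email($email)
{
    $out = '';
    for ($i = 0; $i < strlen($email); $i++) {
        $out .= '&#' . ord($email[$i]) . ';';
    }
    return $out;
}

echo obfuscate_email('info@example.com');
// &#105;&#110;&#102;&#111;&#64;... (renders as info@example.com)
```

Drop the output into a mailto: link or page body wherever you would otherwise print the raw address.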

Ultimately the world's spam problem will only be solved when people stop buying their products or falling for their scams. Educate your friends and family about the risks of purchasing products from spam messages. Make sure that you understand what a 419 scam is so that you can avoid making a spammer rich.