Login for Facebook test users

While writing Facebook applications, it is useful to create test users in the “Roles” tab for your app on developers.facebook.com, and log in as those users.

But Facebook test users are a little tricky to log in as, since Facebook doesn’t show you the e-mail or password of these users. You can get a unique “already logged in” URL by asking Facebook for it, and here is a short script that does just that.
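
A minimal sketch of what such a script might look like in Python. It assumes the Graph API endpoint that lists an app's test users returns a `login_url` field for each user; the function names are made up for illustration, and the app id and secret are placeholders you would replace with your own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GRAPH_URL = "https://graph.facebook.com"


def build_test_users_url(app_id, app_secret):
    """Build the Graph API URL that lists the test users of an app."""
    # An app access token can be written as "<app id>|<app secret>".
    query = urlencode({"access_token": "%s|%s" % (app_id, app_secret)})
    return "%s/%s/accounts/test-users?%s" % (GRAPH_URL, app_id, query)


def login_urls(payload):
    """Map each test user id to its login_url in a decoded response."""
    return {
        user["id"]: user["login_url"]
        for user in payload.get("data", [])
        if "login_url" in user
    }


def fetch_login_urls(app_id, app_secret):
    """Ask Facebook for the test users and return their login URLs."""
    # This is the only function that touches the network.
    with urlopen(build_test_users_url(app_id, app_secret)) as response:
        return login_urls(json.loads(response.read().decode("utf-8")))
```

Calling `fetch_login_urls("YOUR_APP_ID", "YOUR_APP_SECRET")` should give one URL per test user, which can then be pasted into a fresh browser window.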


I use it in Chrome Incognito windows to test various features with these throw-away users, and it works great. Enjoy.

Broken tmux in OS X is easy to fix

I started using tmux instead of screen a couple of months ago. I ported my screenrc file into a similar tmux one and everything works great, with one small exception: tmux on Mac OS X makes strange problems and errors appear. I had no idea why things were so broken, so I googled for a long time and finally found the reason.

The first problem I noticed is that launchctl stopped working correctly. It would show an error saying: launchctl error: launch_msg(): Socket is not connected.

The second problem was when I tried changing my vimrc to use the OS X clipboard for copy and paste inside of vim, so anything yanked would appear in the clipboard and vice versa. To change it, all you have to do is add set clipboard=unnamed to your vimrc, and in the latest versions of vim it just works. Unless you do that in tmux; then it just doesn’t work.

Long story short, if you are using tmux and experience strange problems with OS X facilities, try https://github.com/ChrisJohnsen/tmux-MacOSX-pasteboard. There is an explanation there about why this happens and how to fix it. I used the wrapper method, which required adding a line to my tmux.conf, and now everything works great!

Hope this helps.

Google AdWords is where you waste your money

Overpaying for Google AdWords? You are not alone!

AdWords is a wonderful thing: give it a bucket of money and get as much traffic to your website as your money allows. Some people might have discovered this and are happily spending their hard-earned coin, oblivious that AdWords actually requires you to research how it works, or else.

Before I dive into explaining how to use AdWords effectively, there are several problems that most people find painful to deal with when using the system. Without understanding how AdWords works, your money most likely just goes to waste instead of being invested in getting results.

The goal of any successful ad campaign is having more money in the end than you had before starting it. AdWords is not usually used to build a brand and get recognition for your product (like Coca-Cola ads in the movies), but rather to acquire paying customers who come and pay on your website, and to spend less money getting them to come than they will pay you back. It is the simple ROI (return on investment) rule that everyone probably already knows.

Here comes a surprise: some people using AdWords are paying a hundred times more and getting a hundred times less than their competitors. Usually after a while these people conclude that AdWords is a waste of money, and they stop using it. If only they used it correctly, this would not be their conclusion. I’ll explain the things to know so you will not make the same mistake, and start earning money instead of wasting it.

If you have ever tried using AdWords, the most common gotcha that bites you is having the ads you created denied, blocked, or even banned in some extreme cases. This is annoying and makes you think “WTF Google? I’m paying you! Don’t you want my money?”, and the answer is: no, they don’t.

Google’s motive for annoying you comes from their belief that the people who use Google services will get a slightly better experience, and they probably have a point. History shows that search engines which showed spam results instead of relevant results went extinct, and the same happened to ad networks that showed annoying spam advertisements. Google AdWords makes it hard to create bad or spammy ad copy because they want the network to remain relevant to the people who see the ads. It seems to have been working fine for them for a very long time now.

Another issue with AdWords is actually using its interface to create your advertisements; they really made it extremely hard for anyone to use. It has a million options, and making sense of how it works requires years of training and study. And then they change it every month into “yet another new version” or some such. Fortunately for us, most people who use AdWords don’t care, and thus they make mistakes by not bothering to research, which gives you, their competitor, an advantage.

If you read this far, you probably want to know how you can still use AdWords and not fall into the trap of wasting your money, so let me share with you several things I learned over the years. This post is long enough, so I will write them up in a sequel.

Rack Middleware

Middleware is a very powerful tool that is usually used to filter incoming requests to, or outgoing responses from, your web application, and it solves some problems in a very elegant and DRY way. It is usually a pluggable component acting as a filter in the request/response flow, and has the benefit of being easy to reuse in any application that needs it.

WSGI is an interface that defines how web applications and web servers talk to each other. WSGI is the name most used in connection with the Python programming language, but it has equivalents in most other programming languages used to create web applications. For example, Rack is the equivalent of WSGI for the Ruby programming language. Although it might not be apparent, WSGI in Python and Rack in Ruby are almost identical with regard to how you create middleware for them.

Popular web frameworks usually have their own way of adding middleware to WSGI/Rack that is sometimes simpler. One such example is the Django web framework that has a really excellent and simple way of adding middleware to your application, described in the Django middleware documentation.

In Rack (and in Python’s WSGI) middleware usually looks like this:
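
As a minimal sketch on the Python WSGI side (the class name, header name, and demo app below are made up for illustration), a middleware wraps the application, gets a look at the request on the way in, and can adjust the response on the way out:

```python
class HeaderMiddleware:
    """Wrap a WSGI app and add one header to every response."""

    def __init__(self, app, name="X-Example", value="hello"):
        self.app = app
        self.header = (name, value)

    def __call__(self, environ, start_response):
        # Incoming side: inspect or modify `environ` here before the app sees it.

        def patched_start_response(status, headers, exc_info=None):
            # Outgoing side: adjust the status or headers before they are sent.
            return start_response(status, list(headers) + [self.header], exc_info)

        return self.app(environ, patched_start_response)


def hello_app(environ, start_response):
    """A tiny WSGI application used only to demonstrate the wrapper."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]


# The wrapped object is itself a WSGI application.
application = HeaderMiddleware(hello_app)
```

A Rack middleware has the same shape: an object that receives the inner app when constructed and exposes a single call method.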

Django makes it simpler by adding methods for each step in the filter:
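
A rough sketch of that per-step shape, assuming the older Django hook names `process_request` and `process_response` (newer Django versions favour a single callable instead). The class and header names are hypothetical, and the class deliberately imports nothing from Django so it can be read on its own:

```python
import time


class TimingMiddleware:
    """Old-style Django middleware shape: one method per step of the filter."""

    def process_request(self, request):
        # Called before the view runs; returning None lets processing continue.
        request._started_at = time.monotonic()
        return None

    def process_response(self, request, response):
        # Called after the view runs; it must return a response object.
        started = getattr(request, "_started_at", None)
        if started is not None:
            elapsed = time.monotonic() - started
            response["X-Elapsed-Seconds"] = "%.3f" % elapsed
        return response
```

Django response objects accept headers through item assignment, which is why a plain dictionary is enough to try this class out in isolation.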

The best part about Rack middleware is the excellent way to test your custom middleware using Rack::Test. It took a bit of wrapping my head around it, but when I finally had the eureka moment it all snapped into place. Let me demonstrate with a simple example that shows how to test a Rack middleware using RSpec and Rack::Test.


The secret sauce is including Rack::Test::Methods at the start of the description; it does all the magic of using “app” and lets you call get, post, and so on, as documented in the Rack::Test::Methods documentation.

Hope this helps someone, leave your comments and/or questions below.

Keywords in Search Engine Optimization (SEO)

I wrote a small web tool that can be used to measure the density of keywords and phrases in online articles. This blog post explains why keyword density is important and what role keywords play in promoting your website to the headlines of search engines.

It might be surprising, but most people are not aware of these very simple concepts. Just by reading this short article, you are already more advanced and knowledgeable than most of the people who build stuff on the internet. Please use this knowledge for good and not for evil.

So, why would you even want to be in the headline (aka. first place) of the search engine result page (aka. SERP) in the first place? Well, if you want your shiny website to get traffic (aka. unique visitors), then organic search results are the best way to get it, and it’s cheap, with no direct fees, unlike ad networks like AdWords where you pay for user clicks. Another advantage of organic search results is that they convert extremely well; in other words, users who are looking for your product and find you in the first place of the SERP are much more likely to buy your product.

Keywords are the heart of any search engine optimization. The whole purpose of SEO can be distilled into this very simple principle:

Place a specific web page to appear as high as possible in the list of search results for a certain keyword.

Note that it talks about single web pages and specific keywords. Your website might have more than just one page, and it might appear in first place for more than just one keyword. It depends on the competition for the keywords you target, and if many people search for a keyword, it’s a good bet there is a lot of competition.

Armed with this new understanding, you might say “I’ll just repeat the same word a million times on this page, and it will rank my page as the first result in the SERP.” Well, if this were 1995, that would be exactly so. But modern search engines do not really like keyword stuffing and will most likely penalize a page that does it rather than reward it.

But let’s step back for a moment, and look at the importance of keywords in SEO as a general principle. Usually, the very first thing that SEO experts do is called “Keyword Research”, and that needs a little bit of explaining. Keyword research is mostly two things:

First, the hard part, is coming up with a list of keywords. The most obvious way of doing this is just asking the business owner which keywords they believe are relevant to the business. A slightly more sophisticated way is visiting the pages of competing companies and looking for keywords they use in their articles, be it via keyword density, or by finding links from other websites to these competitors and looking at the link text.

The second thing in keyword research is finding, among all the relevant keywords for a business, the keyword or phrase which has the most searches in search engines, and then creating content and links targeting that keyword. Google has a tool for this research, the Google AdWords Keyword Tool, which was built to show how many searches a certain keyword receives in Google. Bing has a very similar tool called the Bing Keyword Research Tool.

Modern search engines like Google and Bing prefer to rank pages with factors that are out of the direct control of web masters and content authors. These factors include things like links from other web pages to your page, and many other secret algorithms which are usually not revealed in full. One part of the algorithms is keyword density, and although it is not as important as it once was, it still should not be neglected completely.

The keyword density practice of SEO, described at length on Wikipedia at http://en.wikipedia.org/wiki/Keyword_density, explains how to calculate this important metric. Instead of having a keyword density of 99%, many SEO experts consider the optimum keyword density to be 1% to 3%, and using a keyword more than that might be considered search spam.

This article started with a promise: to show you a tool that makes it easier to find the keyword density of articles you write. This calculator was created to help you optimize articles and have the best word density for search engine optimization, between 1% and 3%.

Using it is very easy: just paste some text into a text area, and it will calculate, in real time, how many times each word is repeated in the article and the density of that word as a percentage. Among other features, the density calculator also removes stop-words that do not really matter when indexing articles, words like “a”, “an”, “for”, “of”, “the” and the like.
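
The calculation itself can be sketched in a few lines of Python. This is not the code of the tool, only an illustration under two assumptions: the stop-word list below is a tiny made-up sample, and density is measured against the words that remain after stop-words are removed.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real tool would use a longer one.
STOP_WORDS = frozenset(["a", "an", "and", "for", "in", "is", "of", "on", "the", "to"])


def keyword_density(text, stop_words=STOP_WORDS):
    """Return {word: (count, percent)} for the non-stop-words in `text`."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    words = [word for word in tokens if word not in stop_words]
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    # Density is the share of all counted words taken up by this word.
    return {word: (count, 100.0 * count / total) for word, count in counts.items()}
```

With a text of five counted words in which “cat” appears three times, the density of “cat” comes out at 60%, which is far above the 1% to 3% range mentioned above.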

Note that it might need some more testing. For example, with a very small amount of words the calculation might not be correct and show distorted results, and other bugs might be hidden in it as well, so it is far from perfect – use at your own risk.

Lastly a link to the tool that started this article:


Broken Django behind a Load Balancer

Having problems with your seven Django servers behind a single load balancer or proxy, and you have no idea which of the servers is giving you that “one in eight requests” error?

The solution is simple: add some information about the instance to the HTTP response headers!

It doesn’t really matter which load balancing or proxy solution you use: Amazon ELB, HAProxy, Varnish, Pound, Nginx. This will just work with most of them without any modification.

Here is a simple example of just such a middleware:
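
A minimal sketch of the idea, written in the callable middleware style that current Django versions use. The class name and the `X-Served-By` header name are assumptions chosen for illustration; any header name your proxy passes through would do.

```python
import socket


class ServedByMiddleware:
    """Stamp every response with the hostname of the server that produced it."""

    def __init__(self, get_response):
        self.get_response = get_response
        # Look the hostname up once, when the middleware is created.
        self.hostname = socket.gethostname()

    def __call__(self, request):
        response = self.get_response(request)
        # Django responses accept headers through item assignment.
        response["X-Served-By"] = self.hostname
        return response
```

After adding the class to the middleware setting of the project, the failing responses can be matched to a server by looking at that header in the browser or with curl.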

Simple Amazon AWS API queries

After using the excellent Python boto library for Amazon Web Services, I felt a tingle of unease: boto was taking away the transparency and clarity of the AWS APIs. Amazon has excellent documentation that contains detailed API references, and the pace at which they release new features is staggering. It is a pity that boto is written in such a way that it needs to re-implement every new feature Amazon releases in its own wrappers and in a foreign language (its own method names).

So as the first step in making it more transparent to use the original Amazon AWS API reference documentation in a copy-and-paste fashion, I had to write the same reinvent-the-wheel piece of boilerplate that everyone wrote a hundred times before (google for it, it’s there). But mine is shinier and prettier, or so I would like to think.

I present to you, in all its glory — The Python Amazon AWS API queries class!
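
The heart of such boilerplate is signing the query. Here is a rough sketch of that one step, assuming the older Signature Version 2 scheme of the AWS Query API (many services have since moved to Signature Version 4, so treat this as an illustration rather than something that works against every endpoint). The function name and its arguments are made up for this example.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_query(access_key, secret_key, host, params, timestamp, path="/", method="GET"):
    """Return a Signature Version 2 signed URL for an AWS Query API call."""
    query = dict(params)
    query.update({
        "AWSAccessKeyId": access_key,
        "SignatureMethod": "HmacSHA256",
        "SignatureVersion": "2",
        "Timestamp": timestamp,
    })
    # Canonical query string: keys sorted by byte order, RFC 3986 encoding.
    canonical = "&".join(
        "%s=%s" % (quote(key, safe="-_.~"), quote(str(value), safe="-_.~"))
        for key, value in sorted(query.items())
    )
    # The string to sign is verb, host, path and query, joined by newlines.
    string_to_sign = "\n".join([method, host.lower(), path, canonical])
    digest = hmac.new(
        secret_key.encode("utf-8"), string_to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="-_.~")
    return "https://%s%s?%s&Signature=%s" % (host, path, canonical, signature)
```

The parameter names in `params` are exactly the ones from the API reference, for example `{"Action": "DescribeRegions"}` plus whatever `Version` the service documentation asks for, which keeps the copy-and-paste workflow intact.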

Google AppEngine URLFetch in Unit Tests

I started using Google AppEngine for a personal project of mine some time ago, and noticed that, like everywhere else in Python, the state of testing (TDD) is really poor.

There are several “solutions” that provide stubs that can be used in unit testing Google AppEngine applications, including something called a “testbed” which is part of the API itself. But the problem with these is that they provide functional bits of API implemented on your local environment just like it would work on a deployed AppEngine application.

It sounds quite good to have a local personal instance of something similar to the datastore you get in deployed applications, but unfortunately for the urlfetch service it is not exactly what I was looking for in tests.

The thing I need is an object that will not urlfetch anything and will not access the network at all. The requirement in this case is an object whose state I can tinker with before and after my own methods have used the urlfetch facility. After a lot of digging in the current implementation of the stub, I ended up writing a very simple mock for this myself. It is far from perfect, but it’s a start.
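
A sketch of what such a mock can look like. The class names are hypothetical, and the `fetch` signature only loosely mirrors the urlfetch one; the point is that the test sets up canned responses beforehand and inspects the recorded calls afterwards, with no network access anywhere.

```python
class FakeResponse:
    """Carries the few attributes code under test usually reads."""

    def __init__(self, content=b"", status_code=200, headers=None):
        self.content = content
        self.status_code = status_code
        self.headers = headers or {}


class FakeURLFetch:
    """Stand-in for a urlfetch-like service that never touches the network."""

    def __init__(self):
        self.responses = {}  # url -> FakeResponse, prepared by the test
        self.calls = []      # (url, method, payload), recorded for assertions

    def set_response(self, url, response):
        self.responses[url] = response

    def fetch(self, url, payload=None, method="GET", headers=None, **kwargs):
        # Record the call first so that failed lookups are visible too.
        self.calls.append((url, method, payload))
        if url not in self.responses:
            raise AssertionError("unexpected fetch: %s" % url)
        return self.responses[url]
```

In a test, the code under test is handed a `FakeURLFetch` instance (or has its urlfetch module patched with one), and afterwards `calls` shows exactly which URLs it asked for.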

I think I really dislike Python


> irb
irb(main):001:0> a = [1,2,3]
=> [1, 2, 3]
irb(main):002:0> a[10] = 20
=> 20
irb(main):003:0> a
=> [1, 2, 3, nil, nil, nil, nil, nil, nil, nil, 20]


> perl
@a = (1,2,3);
$a[10] = 20;
use Data::Dumper;
print Dumper(@a);
$VAR1 = 1;
$VAR2 = 2;
$VAR3 = 3;
$VAR4 = undef;
$VAR5 = undef;
$VAR6 = undef;
$VAR7 = undef;
$VAR8 = undef;
$VAR9 = undef;
$VAR10 = undef;
$VAR11 = 20;


> python
>>> a = [1,2,3]
>>> a[10] = 20
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list assignment index out of range
>>> a
[1, 2, 3]

A bright idea in the middle of the day

I was reading a blog yesterday about “The sad state of open source monitoring tools” and was thinking about it for some time. Coincidentally, today I had a chance to look at my CruiseControl configuration files, which I wrote quite a long time ago.

I really love the DSL that CruiseControl is using for its configuration; it’s extremely powerful at describing how to build projects. Especially powerful are the variables, which unlike in Ant are not immutable, and the way plugins can be pre-configured with your own defaults, as well as renamed to other names. It’s really easy to configure it in such a way that adding a new version for a project is just 1-3 lines of XML, for example:

<xxx-project name="XXX v6.66">
  <property name="version" value="6.66"/>
</xxx-project>

Just in those 3 lines, the pre-configuration already includes all the information about the project. Where it is at, who to send e-mail to, where is the version control, EVERYTHING! If the only variable that changes over time is the version number, then that is all you need to leave as a variable … everything else is just a template that can be re-used. And these templates are extremely easy to combine from smaller templates, it’s a template-in-the-template kind of configuration.

IMHO this would very much apply to the configuration of monitoring software, like Nagios for example. And given the way the (CC) plugins are written in Java, adding new plugins that check all kinds of esoteric things is really easy to do.

If it also had the XML/XSLT configuration of how the web interface looks (the way CruiseControl does) and the super-easy installation (again like CruiseControl), it would be a really, really, really great product: extremely powerful, easy to configure, and potentially great looking.

If only ThoughtWorks would write such a thing … I would be thrilled!

Actually Nagios is already extremely similar to what I described, but for some strange reason I find the rigid configuration of Nagios a large PITA. Maybe someday, when time stops and I have unlimited time to code, I will do it myself.