Archive for the ‘seo’ Category

October ’06 Status update

Saturday, October 28th, 2006

It’s way past time I updated the blog with some more recent info. I hope you’ll understand: time is the second-scarcest resource of my single-person startup, right after cash, and a close second at that.

For the past month, I’ve been able to do little more than answer support e-mails, respond to customers’ queries, and take note of the bugs/requests I’ve received. My day job has required a lot of time and has been pretty stressful, so I just forgot about trying to actually achieve anything.

I had been pretty busy working on ViEmu for the previous couple of months. I took a quiet August, I started surfing – which I love as a great summer activity – and I worked a lot on ViEmu/VS version 2.0. The worst part of it was a symbol-less assembly debugging session of more than 10 hours through the innards of Visual Studio, in order to find a bug in one of the APIs and implement a feature I badly wanted to offer in 2.0 (automatic keybinding removal/management).

After this, I was able to release ViEmu/VS 2.0 back in mid-September. The keybindings handling feature in particular has caused some trouble, so I will completely change the approach for the next major version (whenever that happens), but, all in all, 2.0 is a heck of an improvement over the previous 1.4 version. Sales have gone up, the feedback has been great, and I’m very satisfied with the result. It also makes use of the completely new ngvi emulation engine, which is also integrated in kodumi (my upcoming text editor), which will hopefully be released in early 2007. Having the engine confirmed to work right by hundreds of users gives me great confidence in it.

I also released ViEmu for SQL Server Management Studio 2005. It has made a modest debut, with not too many sales, but it should be useful to some folks into heavy DB development, and it turns ViEmu into a more rounded offering.

I’ve updated the web site to offer ViEmu/SQL too, but I invested only the minimum amount of time in this. The reason is that I still plan to release a third ViEmu product before taking on kodumi development more seriously: ViEmu for Word and Outlook. Quite a few people have asked for it over time, I think integrating the ngvi engine in the Word framework won’t be too much trouble, and the main point is that I expect to get the maximum ROI from the effort invested so far. Vi/vim emulation will never be a huge market, and implementing it for many other environments wouldn’t be a sensible business decision, but having the triad of ViEmu/VS, ViEmu/SQL and ViEmu/Word+Outlook seems like the best trade-off of effort and potential. ViEmu sales are already at a level where I could live off them, and adding a third product could make for a comfortable situation in which to face the release of kodumi 1.0 and develop the technology I intend to.

I will have to do a pretty complete redesign of viemu.com presenting the 3 products. And presenting multiple products is always much more difficult than presenting a single one. Given that this effort is in the near future, I decided to do the minimum redesign possible for the release of ViEmu/SQL.

Some interesting facts:

  • July and August sales were slow (especially predictable for July, given June had been the last month of the previous pricepoint and I cannibalized a lot of natural July sales), but September managed to catch up with dollar-sales in June (the best selling month ever so far), and October has again broken that record, almost catching up with the maximum ever unit sales in June.
  • Finally, viemu.com has made it to the first page of both the “visual studio vi” and “visual studio vim” Google searches. As soon as I have an afternoon to sort it out, I will finally be redirecting the old “ngedit.com/viemu.html” page to “viemu.com”. It’s taken 6 months for Google to acknowledge the new location (I didn’t want to redirect straight away and risk losing the ranking, as it had taken many, many months to have that page on the first page for these very interesting searches).
  • I have a chart and an article almost ready, called “The Ultimate WM_KEYDOWN/WM_CHAR Table from Hell”. I’ve had to delve even more deeply into the broken-ness of the Win32 input model, as ViEmu 2.0 has full keyboard mapping support, and it’s simply amazing how broken it is. The previous article on the subject is, funnily, the 2nd Google result for “WM_CHAR”, right after the MSDN reference page, and the 4th or 5th for WM_KEYDOWN (and brings quite some traffic to my site). I believe the new chart will be very useful and that it will be pretty popular on del.icio.us, etc… more exposure is always good.
  • As always, I still plan to blog profusely… in the future :). I certainly enjoy writing and sharing my experience, and it’s definitely useful for the business, but I still have to prepare ViEmu/Word+Outlook and get kodumi 1.0 ready before I can dedicate more time to blogging. Actually, there is some very interesting technology I am preparing for kodumi (and for other projects afterwards), and I’d love to blog about it. But I don’t have time for everything… As soon as I have a released product which appeals to a higher percentage of developers, it will make more sense to invest in blogging as a means to gain awareness.
  • Andrey Butov took the plunge, left his day job, and went fulltime into his business. The effects have already been noticeable: a new web site specialized in Wall Street Programmers, a new design for his main site, etc… He was even so kind as to feature ViEmu/VS on the front page of the new site! When he released his book So You Want To Be A Wall Street Programmer a few weeks ago, I decided to buy it and read it. The reason is not that I intend to ever work on Wall Street; I am as close to 100% sure as possible that I won’t. But I enjoy his writing style, and I was curious about the development industry over there. I found the book as interesting and entertaining as expected, and I also got a good idea of how internal development in investment firms works. Since my products are and will keep being oriented towards developers, I found that the new knowledge would be useful for better targeting of my upcoming products. I’m familiar with how development works in 2 or 3 different industries, and I’m confident that I can target my products efficiently to those, but I’ve now added another one to the strategy-decisions mixing pot, an industry which can spend a lot of money, so I think I’ll be glad I spent the time to read the book. Recommended.

And a closing note with regards to blogging subjects: I’m doing some core technology development for kodumi. It’s quite probable that the blog will turn towards that subject area: basic computer science, parsers, languages, types, the nature of code and data, etc… I’ll still post about business and other issues, but I plan to blog a lot about the technology – I think it’s pretty groundbreaking and that it will be useful in many areas. So don’t be too surprised if you find a post here talking about really basic stuff (such as “what is a number”, “what is a type”, or “code and data are one and the same thing”).

But if you really, really want to read purely about setting up a small-software-company, you have to head over to Patrick McKenzie’s “MicroISV on a Shoestring” blog. Patrick is a smart guy (“smart” as in “really smart”), and he also writes very well, so his blog is the best account of going from zero to having a working business I’ve found. Recommended, too.

My first PR6!

Monday, July 31st, 2006

A couple of weeks ago, Google updated its public page rank – that is, the page rank that the Google toolbar shows you on a given page. Back on March 29, I released the graphical cheat sheet and tutorial, which was and has been pretty popular, with hundreds of links around the web and many tens of thousands of visitors (and still the most important traffic driver to my site). The graphical cheat sheet page and tutorial had been showing PR0 since then, meaning not-yet-assigned. The main www.viemu.com page itself had only been showing a meager PR2, as I had linked to it some time before. I was pretty eager to see Google show its love, not only by showing the page among the top 5 results for “vim tutorial” and several other searches, but also with its PR.

Finally, a couple weeks ago, I was glad to find out that Google had assigned it a page rank of 6, which is a pretty respectable number. It seems the rank is kind-of logarithmic, so a single page rank point may reflect a 10x variation of popularity.
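
To put a number on that guess, here is a tiny sketch of what a roughly logarithmic scale would imply. The base of 10 is only the assumption made above, not a figure published by Google, and `popularity_ratio` is a name made up for this illustration.

```python
def popularity_ratio(pr_a: int, pr_b: int, base: float = 10.0) -> float:
    """How many times more 'popular' a PR pr_a page would be than a PR pr_b page,
    assuming each toolbar point is worth a factor of `base` (a guess, not a Google fact)."""
    return base ** (pr_a - pr_b)


# The cheat sheet page (PR6) compared with the PR2 the home page had been showing
print(popularity_ratio(6, 2))
```

If the guess holds, a four-point gap would mean something like four orders of magnitude, which is why a single extra point feels like such a big deal.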

My previous highlights were PR5 for this blog and for some ngedit.com pages, both of which have dozens of links around the web. It seems you need hundreds of links, possibly with at least several of them from reputable pages, to get into PR6. I don’t know whether del.icio.us user/link pages are spidered and accounted for by Google; if they are, that would put the link count in the thousands.

Anyway, the update in the page rank as reported by the toolbar hasn’t had any effect on the traffic Google directs to my page, or on the results ranking. This is expected, as it seems the page rank reported by the toolbar is just a more-or-less quarterly snapshot of the internal pagerank that Google actually uses.

As a side effect, there is another SEO trick I think I’ve found. Google has assigned a very modest PR of 3 to the main www.viemu.com page. There are some links around the web to this page, but nowhere near the amount and significance of the links to the cheat sheet itself. But I’ve been pretty surprised to find that the vi tips page in the site has gotten a wonderful PR5! What is my interpretation? The vi tips page is the only one that links to the cheat sheet & tutorial page directly. Nobody has linked to the tips page directly, so I’m pretty sure all of those points are assigned by Google thanks to the fact it is the only parent of the PR6-popular page. Good to know?

I will try to set some time aside to restructure the site so that the home page itself, which is the best landing page for potential customers, links directly to the graphical cheat sheet & tutorial page. You might be interested in applying this knowledge to future design decisions about your site, as well.

On other issues, sales in July have been slower than usual, but still good – given I’ve just ended the special introductory price, and that I had cannibalized most natural-July-sales by announcing the price increase prominently during all of June. I’m expecting ViEmu will continue to sell well at the new price after the slow period of the year, and I’m looking towards some increase with the release of 2.0 later during summer. Hopefully, even higher afterwards, thanks to some extra tricks I’m preparing. I wanted to finish 2.0 by early August, but this, of course, has turned out to be a very optimistic timeframe – late August or early September is much more likely. I’m also feeling I need some time away from hands-on development to “recharge” my motivational batteries, so I will be taking a quiet, calm August, and advance slowly towards the next steps. All is fine and I’m looking forward to a very exciting second half of 2006.

Enjoy the summer everyone!

ViEmu Marketing Week

Monday, May 29th, 2006

As I mentioned in my last post, apart from ViEmu and the kodumi editor, I’m working on another product. I tend to concentrate on development most of the time, rather than marketing ViEmu. Don’t get me wrong: not only do I think that marketing is second only to product quality as the most important part of this business, but I enjoy marketing. The reason is that I think that ViEmu cannot become a large business, because of the inherently small audience of a vi/vim emulator for Visual Studio. Thus, I think that the best way to grow the business is to release a product for a larger audience, rather than trying to squeeze every extra N% sales by implementing effective sales techniques.

Anyway, there are two phenomena that push in the other direction. For one, pure coding of a product before it’s released lacks the thrill of direct feedback, so it’s very tiring. At least that’s how I experience it. And second, any improvements in ViEmu sales make it directly to the monthly bottom line, which is a pretty good motivator.

Thus, after a solid coding Saturday, I decided to dedicate some time to marketing ViEmu better. There were two main things that were irking me:

  • I get pretty good feedback by e-mail and through the forums, but compared to the number of downloads it still looks paltry. Without a super-easy way for visitors to send feedback (esp. criticisms!), I remain ignorant of why those who don’t buy ViEmu don’t buy it.
  • The main page of ViEmu (also doubling as landing page from Google adwords) was a bit dull. Too much text. Informative for those interested, but I don’t think it really “grabbed” visitors.

So, I dedicated all of Sunday to redesigning it, adding functionality so that visitors can send feedback from a simple form there, and making it more “catchy”. This ended up as an animated demo of Visual Studio running ViEmu. You can see the result here:

New main page

And, just for reference, the old one is still here:

Old main page

I’ll let you know how it works out. I am planning to implement some other marketing “tricks” during this week, as well as releasing ViEmu 1.4.5, and then I’ll go back to more coding and support, coding and support, coding and support…

Fact sheet May’06

Thursday, May 25th, 2006

Fact #1: I haven’t posted on the blog for well over a month. With all the pending things I have (ViEmu 2.0, the text editor, my day job obligations, support, etc…), I can hardly find time to do so. Promises not kept: the “Friggin’ Darn Tough/Functional Dynamic Template-based C++” series, an article on ViEmu I promised to Keith Casey from CodeSnipers, an article with cool graphical charts on the digg effect as seen from viemu.com (more on this below), etc… Hopefully everything will come along. Until I build the business to the point where it will sustain me, I really just can’t afford to put my available energy into anything other than improving & supporting ViEmu, and preparing the next product.

Fact #2: The final name for the NGEDIT text editor will be kodumi. I wanted a name that sounded good, and which wouldn’t be limiting for the future evolution of the product. It means “hacking” in Esperanto, although Esperanto is not, like, so widespread that the meaning is the important part. I like how it sounds and I can identify with it. It still works when the product becomes more than a text editor. If the product is really good, which I’m hoping it will be, this should ensure the name sticks. I’m open to feedback and criticism. I’m pretty stubborn and it’s unlikely I’ll change it, though.

Fact #3: The next product I release probably won’t be the kodumi text editor. There is quite some work yet to be done with kodumi before 1.0, and I’ll probably release another product based on another functional part of the editing core, as a VS add-in. Hopefully with a much larger appeal than a vi emulator. It will actually be based on one of the innovative features I’m planning for kodumi 1.0. It’s nice to have a product that has several offspring before being born.

On the other hand, given that I will probably be releasing this product, it may make sense to have a single site for all VS add-ins instead of a separate one for each product (such as viemu.com). Oh well… this right after moving to viemu.com… so much for my strategy forecast skills.

Fact #4: The amount of traffic you get from a reddit / del.icio.us / digg front page is amazing. I’ve also got thousands of visitors from StumbleUpon.

Here are some graphics that show it, as the graphical vi/vim cheat sheet I released made it (twice!) to those front pages. I apologize for not being able to write a full article on this, it would be worth an entire study.

In order to understand these properly, take into account that originally ViEmu was hosted at ngedit.com, and I moved it to its own domain viemu.com together with the release of the cheat sheet. The traffic graphs include both domains, as they’re served from the same account, but the Alexa graphs below show both domains with separate lines.

I released the cheat sheet on March 28. Here is the traffic for that day:

You can clearly see the moment it picks up to 100kbps sustained. The climb was caused by it getting to reddit’s homepage, which happened about half an hour after I submitted it (people liked it, so they voted for it, making it reach the front page – it’s not against their guidelines to submit your own stuff).

The traffic before the climb used to be typically low – very nichey product, a few blog readers, etc… enough to result in some sales, but nothing big.

I went to bed as soon as I saw it at the bottom of reddit’s front page. The next day would be crazier.

As a side effect, people started bookmarking it to their del.icio.us account for later reference. This is understandable given the “reference” nature of the cheat sheet. As soon as a fair number of people did this, it also appeared in del.icio.us’ popular page, thus getting more traffic from there.

This is the traffic on the 29th:

I apologize for not presenting a higher-resolution sampling, I forgot to save it from my hosting provider, and I can’t generate it again.

Anyway, please take into account that the lowest bar in the graph is as high as the 100kbps high in the previous one. It was pretty amazing. At first I watched it sit on reddit’s and del.icio.us’ homepages for hours on end, with a lot of traffic coming in. But then I submitted it to digg, and watched it play the voting game in digg’s “sub leagues” (the system is very different from reddit). And then the big spike came: it made it to digg’s front page. All hell broke loose, bandwidth requirements grew to 2Mbps sustained, and the number of visitors was amazing. It made reddit and del.icio.us look like a joke.

My hosting provider handled it without a hiccup. On the other hand, that very afternoon after submitting to digg, (1) there was a power outage at my building, (2) when it came back, my DSL service was down and unfixable according to my ISP, (3) I got a flat tire when driving to a friend’s in order to watch the digg effect, and mainly to be on the watch in case bandwidth went beyond the monthly limit, which happened, so (4) I had to upgrade my web hosting account. You can say I had all the hiccups web servers usually have in these cases.

Here you can see the traffic for the next two days:



You can see the long tail of the digg effect. Also, the cheat sheet got linked from many places around the web, and StumbleUpon started to pick it up as well.

Here you can see a graph of all of March’s traffic, a nice picture of the reddit, del.icio.us & digg effects:

And here is a glorious graph of all of 2006’s traffic:

I promise that I had traffic before March 29, even if here it’s squashed into oblivion!

Finally, I’ll bring you some captures of what Alexa thinks of my domains (it doesn’t know they are related somehow).

First, here is Alexa’s “Daily Reach” measure, for the last 12 months, 6 months and 3 months (just for your static zooming enjoyment):





I can almost tell you where each spike comes from: the first one, in May last year, comes from Eric Sink’s kind mention of my blog & NGEDIT. The second one comes after the release of ViEmu. The largish one before the digg effect comes from a mention in Bungie’s web newsletter (which, expectedly, led to thousands of hardcore gamers, only one of whom was courageous enough to actually download ViEmu), etc…

I chose to show the daily reach above just because it is the Alexa measurement which best shows the evolution of my web presence. Their best known stat is the “rank”, which ranks the site globally among all websites. They only plot it for the top 100,000 sites, but they give you the number in any case. Here are the graphs of the rank, for the last 12 and 3 months:



Actually, the second large spike you can see earlier this month was due to the cheat sheet making it once again to digg and del.icio.us’ front pages, this time as a direct link to the cheat sheet’s GIF file.

Amidst all of this traffic madness, there is another important source of visitors which is often overlooked. I know I overlooked it. The name is StumbleUpon. This is not a social links site, but a plugin that you install in your browser, and with which you both (a) vote sites up or down, and (b) discover sites other stumblers liked. The effect is much slower, but the number of visitors it can bring over a few weeks competes with the likes of reddit and digg.

In order to show this better, I will show some visitor numbers by referrer (only for viemu.com). I’ve decided not to total them by domain, as the distribution of source pages also provides some interesting info. I haven’t included many other sources, generated by bloggers, news sites and site owners discovering the cheat sheet and linking to it.

March
Total unique visitors: 22,901

http://www.digg.com 3910
http://digg.com 3210
http://digg.com/programming/vi_vim_Graphical_Cheat_Sheet_Tutoria… 2665
http://www.digg.com/index/page2 543
http://www.digg.com/index/page3 631
http://www.digg.com/index/page4 238
http://www.digg.com/index/page5 95
http://digg.com/index/page2 398
http://digg.com/index/page3 500
http://digg.com/index/page4 184
http://digg.com/programming 141
http://www.digg.com/programming 141
http://reddit.com 1814
http://del.icio.us/popular/ 1116
http://del.icio.us 112
http://www.stumbleupon.com/refer.html 120
http://popurls.com 392
http://diggdot.us 154

April
Total unique visitors: 20,429

http://www.stumbleupon.com/refer.html 7858
http://digg.com/programming 127
http://www.digg.com/programming 121
http://digg.com/programming/page2 69
http://digg.com/programming/vi_vim_Graphical_Cheat_Sheet_Tutoria… 376
http://www.digg.com/search 68
http://del.icio.us 104
http://del.icio.us/search/ 70
http://hedera.linuxnews.pl/_news/2006/04/03/_long/3795.html 1883
http://www.linuxnews.pl 556
http://linuxnews.pl 536
http://www.wykop.pl 216

May
Total unique visitors: 6,208 (this doesn’t count those coming through the GIF link as that is not considered a “page” by awstats)

http://www.stumbleupon.com/refer.html 805
http://digg.com/programming/vi_vim_Graphical_Cheat_Sheet_Tutoria… 133
http://www.digg.com/programming/vi_vim_Graphical_Cheat_Sheet_Tut… 53
http://digg.com/search 36
http://digg.com/search/page2/ 19
http://del.icio.us/search/ 45

Just for fun, I have included the links from several sites in Poland during April. For some reason it was very popular there during that month. Maybe vi/vim is better suited to heavily accented languages like Polish?
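
For anyone who does want the per-domain totals, a sketch along these lines would produce them. It uses the May rows exactly as listed above (truncated URLs left truncated); the function name `totals_by_domain` and the choice to fold `www.` hosts into the bare domain are my own assumptions for this illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# May referrer rows, copied from the table above (truncated URLs kept as-is)
MAY_REFERRERS = [
    ("http://www.stumbleupon.com/refer.html", 805),
    ("http://digg.com/programming/vi_vim_Graphical_Cheat_Sheet_Tutoria…", 133),
    ("http://www.digg.com/programming/vi_vim_Graphical_Cheat_Sheet_Tut…", 53),
    ("http://digg.com/search", 36),
    ("http://digg.com/search/page2/", 19),
    ("http://del.icio.us/search/", 45),
]


def totals_by_domain(rows):
    """Sum visitor counts per referring host, treating 'www.x' and 'x' as one host."""
    totals = Counter()
    for url, visitors in rows:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        totals[host] += visitors
    return totals


# Print hosts from the largest total to the smallest
for host, visitors in totals_by_domain(MAY_REFERRERS).most_common():
    print(f"{host:20} {visitors}")
```

Only the leading `www.` is stripped, so a subdomain such as `hedera.linuxnews.pl` in the April table would still be counted separately from `linuxnews.pl`.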

Fact #5: I’d need to sell about 1.5x to 2x as much as I’m selling now to live off of the income from ViEmu. Not a big success 10 months after release. It’s ok, as I’ve learned a lot from the experience, and I needed to do most of it for kodumi anyway, which is the main goal. At least for the kodumi I want to develop and release.

Fact #6: vi/vim emulation for VS is not for the masses. I have gotten over 50k visitors to the site in the past two months. This is more than 20x as much as I was getting beforehand. I guess a product with a more general appeal would have seen an enormous spike in sales. I’ve only seen a smallish upwards trend. Even VS users are a minority among vi/vim fans! I’ve sworn not to switch over to a Dvorak keyboard layout until the business really takes off; I could end up targeting an even smaller market!

Fact #7: I don’t understand Google results. I’m on page number one for “vim tutorial”, but nowhere to be seen for “vi tutorial”. I was extra careful to write “vi/vim graphical cheat sheet and tutorial” everywhere, so that I would be found by any of the likely keywords, and the result is so bad it’s sick. Searching for “vi emulation visual studio” gets the old page, even if there are links to www.viemu.com all over the place. If there’s a sandbox, I don’t understand why it affects some keywords and not others. Is “vi” too short? Then how did my SEO work before with the ngedit.com address? I’m starting to experiment with creative redirections to the new site, but I’m going to do it the slow way in order to cut the losses in case Google doesn’t like my playing around.

Fact #8: it was cool to have the vi/vim cheat sheet translated into simplified Chinese by Donglu Feng, a nice guy who sent it over to me. It makes regular vi/vim seem a piece of cake:

Google *loves* the H1 tag

Friday, March 3rd, 2006

(The short version, in case you don’t want to read more: have H1 tags in your pages, containing the keywords you’re interested in. It pays off)

Isn’t it frustrating when your page doesn’t even appear in Google for your target keywords? It can be even worse:

  • You may not be targeting competitive keywords at all, and there may be no competing products at all.
  • There can be several other pages about your product, linking to your page, that do appear in the results!

This is what was happening to me with the main page of my product, ViEmu. ViEmu provides vi/vim emulation within Visual Studio, so pretty obviously the target keyphrases are “vi visual studio” and/or “vim visual studio”. The product and the page have been there, accessible through http://www.ngedit.com/viemu.html since late last July. There have been quite a few mentions of it, which link to that page, from blogs, review sites, etc… The page has been indexed all along, and has been appearing on the top results page of Google for things like “visual studio vi emulation” or, quite obviously, “viemu”. But it didn’t even register on the much more interesting “vi visual studio” and “vim visual studio” search phrases. By this, I mean it was nowhere to be seen on the first 40 pages of results or so. What’s even funnier is that many of the mentions of my page did appear there, even on the first few pages.

I have an adwords campaign (read my report on adwords for details on the effectiveness, click fraud, etc), which helps out, but I’d really prefer to be on the main results. What’s more, I couldn’t easily understand why I wasn’t.

Trying to understand how Google sorts its results is a tricky task, as well as a moving target. But I had an advantage: a certain review from Tobias Gurock over at Gurock Software was scoring incredibly on search results – it was on the first or second page of the results for the interesting searches! So I decided to have a look at it and try to find out why Google liked it so much.

The first thing to check, obviously, was whether that page had a significantly higher PageRank, or many more incoming links. Actually, it seemed to be about the same as mine (PR5), so that probably wasn’t it. Even other reviews, with much lower PR, did at least appear after page 3 or 4 of the results.

So that left the actual content itself. After some review, discarding the title, presence of the keywords, etc… I got it down to two differences:

  • The name of the html file, in their case, contains the keywords (it’s “http://software.gurock.com/postings/vi-emulation-for-visual-studio/25/”)
  • …and their H1 tag is “Vi emulation for Visual Studio” (today it also includes a self-link, but I think it used to contain just the text before they moved to WordPress)

Changing the filename was out of the question – with all the links out there, I wouldn’t want to lose that. A 301 http redirect may be a possibility, but I was wary of the consequences with Google (I could make it even worse, possibly losing the accumulated PR).

But the H1 tag… see, my original web site design did not include an H1 tag at all. Alongside the product’s logo, I had a graphical rendering of the name – not leaving room for a text H1 tag.

So, I decided to “upgrade” those pages to having an H1 tag. The contents of the tag: pretty obvious, “ViEmu: vi/vim emulation for Visual Studio”. The text rendering of this title looked fine, so on February the 11th, I uploaded the new ViEmu pages. And started waiting.
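
Checking whether a page has an H1 that covers your target phrases is easy to script. The sketch below uses Python’s standard `html.parser` to pull out the H1 text and test whether every word of a phrase appears in it. The names `H1Collector` and `h1_covers` and the sample markup are made up for this illustration, and it says nothing about how Google actually weighs the tag.

```python
import re
from html.parser import HTMLParser


class H1Collector(HTMLParser):
    """Collects the text content of every <h1> element in a page."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        # Text inside nested tags (e.g. a self-link) is still collected
        if self._in_h1:
            self.headings[-1] += data


def h1_covers(html, phrase):
    """True if some <h1> on the page contains every word of `phrase`."""
    collector = H1Collector()
    collector.feed(html)
    wanted = phrase.lower().split()
    for heading in collector.headings:
        words = set(re.findall(r"[a-z0-9]+", heading.lower()))
        if all(word in words for word in wanted):
            return True
    return False


# Sample page using the H1 text quoted above
page = "<html><body><h1>ViEmu: vi/vim emulation for Visual Studio</h1></body></html>"
print(h1_covers(page, "vi visual studio"))
print(h1_covers(page, "vim visual studio"))
```

A page whose title exists only as an image, like my original design, has no H1 text to collect, so the check fails for any phrase.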

Google comes often to my site (daily?), so that was fast. About a week afterwards, Google’s cache started showing the new content. So now I knew it was already there. The search results, however, stayed the same.

But around the 21st (last Tuesday), I found out to my pleasant surprise that my page had started appearing on the first results page, around #5. During all of last week, it was a bit unstable – some searches would work fine, but searching a few hours later would show the old results with my page nowhere. This week, finally, about 90% of the searches already show the new results!

I attribute the instability to the results requiring propagation around Google’s servers, which is expectedly a slow process. I’m guessing (and hoping) it will disappear altogether in a few more days.

Lessons learned? Always include a header in your design (I guess an image with alt text may work as well, but I’m not trying). Name the html files with the relevant keywords, not just with your product name.

And be prepared to learn a lot!

ViEmu, adwords and clickfraud

Friday, December 9th, 2005

While I was doing some http logs analysis on the number of downloads of ViEmu, my commercial vi/vim emulator for Visual Studio, some interesting information turned up. Given that I think it may be useful to other entrepreneurs using adwords to promote their business, and that I have also received several requests for my experience with adwords, I’ll be sharing that information in this post. Hopefully it will save a few bucks for other fellow developers.

Some warnings are due before I delve into the details. First, I don’t really have any evidence of clickfraud – there simply are some things in my logs which look, hm, weird, and they may be a signal of something else. But it could all be due to my more-than-limited understanding of adwords. Given that I’m not spending much, I haven’t spent too much time investigating it. It would be wasting that time which is better spent in other areas.

As well, I haven’t taken the time to read all the information available on the net on these issues, so please feel free to point out the possible flaws in my reasoning.

As to the applicability of my case to other people, I guess my case is not the most common one, as I think I’m the only advertiser working on many of my keywords. Given this, there is hardly any bidding at all, and click prices are very cheap (5 euro cents/click). It must work very differently if you are advertising on keywords with tough competition (I guess I will be able to comment on that once I release the NGEDIT text editor).

Anyway. I set up my google adwords campaign at the end of July, as I released ViEmu 1.0. It took a few hours or days for advertisements to appear for relevant searches, but it’s been working almost unattended since. I changed a minor detail in the ad text, and I added other keyword combinations as google searches reached my site and taught me what terms people actually use when they’re interested in vi/vim integration with Visual Studio.

Given I had hardly researched at all, I learnt stuff as things happened. When I set up the campaign, I saw I had to pay about 4 euro cents per click. But afterwards, I had to raise the bid to 5 cents/click, as google warned me and turned off advertising for some key phrases because of a price that was too low. This is pretty simple to see at your adwords.google.com account.

I also started getting hits from those clicks. I found them as hits referred from “pagead2.googlesyndication.com/…” where the “…” is a really long and complex reference. It actually took me a while to realize those clicks were from google adwords.

There have also been other weird hits, which had a referer address of “searchportal.information.com” followed by some kind of encoded ID (such as “UVsPWVALXVUMVV8LWQgQRggaCFIXE1Y_CFEIDA0BAQ”). These addresses took me to a search page which has the nasty habit of becoming a “frame parasite” to your web surfing, and used to encode URLs to those ID strings and route everything through their site. I had severe doubts that someone educated enough to use vi/vim would surf with such a bugger.

Anyway, back to my http log review, I started doing an analysis on my November data. I usually keep track of how many downloads of my product there are each month, and try to study the correlation with monthly sales (given the 30 day trial period, tracking is a bit difficult, but I think general trends are still there). I decided to classify all hits to www.ngedit.com in November to be able to tell how many of those came from IP addresses that ended up downloading the trial version of the product.

I used vim on the log files to do this process. vi/vim is pretty good for this kind of text processing and I had the desired list in a short while, although it did involve some of that vi black magic.

Anyway, it turned out that, out of the ~20,000 hits of the month, over 6,000 belonged to IP addresses that downloaded ViEmu. As an aside, that was higher than I expected. But now that I could focus better on less data, I could start seeing some new patterns. I removed all lines not containing “pagead2” in this reduced hit log (ad-vi-tisement: “:v/pagead2/d”), and got myself down to just 11 lines – and to my amazement, there were only 3 IP addresses! One IP appeared only once, but the other two appeared 5 times each. In the full log, there were 24 hits from pagead2, and the repetition of IPs was kind of “hidden” (I hadn’t done a :!sort on them to see the unique addresses).
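For anyone who prefers a script to vi black magic, the same filtering can be sketched in Python. This is only an illustration, not the vim session described above. It assumes the log is in Apache "combined" format (an assumption; adjust the regex for your server), and the download path is a made-up placeholder, not the real ViEmu URL.

```python
import re
from collections import Counter

# Assumed Apache "combined" format:
# IP identity user [time] "request" status size "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" \d{3} \S+ "(?P<referer>[^"]*)"'
)

DOWNLOAD_PATH = "/download/trial.exe"  # hypothetical path of the trial installer


def parse(lines):
    """Yield (ip, request, referer) for every line matching the assumed format."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            yield m.group("ip"), m.group("request"), m.group("referer")


def ad_clicks_by_downloaders(lines, marker="pagead2"):
    """Count ad-referred hits per IP, keeping only IPs that also downloaded the trial."""
    hits = list(parse(lines))
    # First pass: which IPs ever requested the download?
    downloaders = {ip for ip, request, _ in hits if DOWNLOAD_PATH in request}
    # Second pass: hits from those IPs whose referer contains the ad marker.
    return Counter(
        ip for ip, _, referer in hits
        if ip in downloaders and marker in referer
    )
```

Calling `ad_clicks_by_downloaders(open("access.log"))` (the file name is again just an example) returns a counter keyed by IP, so a handful of addresses with several ad clicks each stands out immediately instead of being hidden among unsorted lines.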

I nslookup’ed both addresses, which actually only differed in the last byte of the IP address, and only ‘localhost’ was returned from the reverse DNS lookup. I went back to the full hit log, removed everything but IPs belonging to the same subnetwork (n.n.n.*), and I also found out that some of the “searchportal.information.com” links belonged to them. Things started to make some sense.
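Spotting addresses that differ only in the last byte is easy to automate. Here is a minimal sketch using Python's standard `ipaddress` module; the function name and the /24 prefix are choices made for this illustration, and the reverse DNS step (which could be done with `socket.gethostbyaddr`) is left out because it needs network access.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(ips, prefix_len=24):
    """Map each /prefix_len network to the sorted distinct addresses seen in it."""
    groups = defaultdict(set)
    for ip in ips:
        # strict=False allows building the network from a host address
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[str(network)].add(ip)
    # Sort numerically so that .5 comes before .17
    return {
        net: sorted(addrs, key=ipaddress.ip_address)
        for net, addrs in groups.items()
    }
```

Feeding it the list of IPs from the ad-referred hits gives one entry per n.n.n.* subnetwork, so several "different" visitors sharing one subnet show up as a single group.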

Let me show you one of the googlesyndication referers at this point (broken up into lines for nice display):

http://pagead2.googlesyndication.com/pagead/ads?
client=ca-pub-0919305250342516
&dt=1131084326621
&lmt=1131084326
&format=336x280_as
&output=html
&url=http%3A%2F%2F72.14.203.104%2Fsearch%3Fq%3Dcache
  %3Aq1fqcI2sut4J
  %3Awww.lyrics007.com
  %2FBeverly%252520Craven%252520Lyrics
  %2FPromise%252520Me%252520Lyrics.html
  %2Bpromise%2Bme%26hl%3Dvi
&color_bg=FFFFFF
&color_text=000000
&color_link=0000FF
&color_url=008000
&color_border=FFFFFF
&ref=http%3A%2F%2Fwww.google.com.vn%2Fsearch
  %3Fhl%3Dvi
  %26q%3Dpromise%2Bme
  %26btnG%3DT%25C3%25ACm
  %2Bki%25E1%25BA%25BFm
  %2Bv%25E1%25BB%259Bi
  %2BGoogle
  %26meta%3D
&cc=161
&u_h=600
&u_w=800
&u_ah=570
&u_aw=800
&u_cd=32
&u_tz=420
&u_his=12
&u_java=true

That’s a URL!

I started trying to decipher these URLs. Looking at other pages that implement adsense, and how they appear on google’s cache, I deduced that the referer for this click (for which I was being charged) came from a google cache page (“url=http://72.14.203.104/search…”). It was a cached page from www.lyrics007.com, which is a repository of song lyrics. The google cache had been accessed from a search at google Vietnam (a trip to www.google.com.vn showed that). The search seemed to include part of the lyrics and “vi” with some weird unicode characters in between (I’m as yet unsure whether those %25BB are geometric signs or diacritic marks).
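Part of the deciphering can be done mechanically. The “ref” parameter is percent-encoded twice (that is where sequences like %25C3 come from: %25 is an encoded “%”), so it has to be decoded in two layers. A small sketch with Python's standard `urllib.parse`; the function name is made up for this example, and the input is just the “ref” value from the referer above, rejoined into one string.

```python
from urllib.parse import parse_qs, unquote, urlsplit

# The "ref" parameter from the referer above, rejoined into one string.
REF_ENCODED = (
    "http%3A%2F%2Fwww.google.com.vn%2Fsearch"
    "%3Fhl%3Dvi"
    "%26q%3Dpromise%2Bme"
    "%26btnG%3DT%25C3%25ACm"
    "%2Bki%25E1%25BA%25BFm"
    "%2Bv%25E1%25BB%259Bi"
    "%2BGoogle"
    "%26meta%3D"
)


def decode_search_referrer(encoded_ref):
    """Undo the outer percent-encoding, then parse the inner URL's query string."""
    inner_url = unquote(encoded_ref)       # first layer: %3A -> ':', %25 -> '%'
    query = urlsplit(inner_url).query      # everything after the '?'
    return parse_qs(query)                 # second layer; '+' becomes a space


params = decode_search_referrer(REF_ENCODED)
for name, values in params.items():
    print(name, "=", values[0])
```

Decoded this way, the query comes out as “promise me”, “hl” is “vi”, and “btnG” is “Tìm kiếm với Google”, which is the Vietnamese label of the search button. That suggests the “vi” is simply the interface language code for Vietnamese and the odd characters are Vietnamese diacritics, rather than anything typed into the search box.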

Who gets paid for that click? I think the owner of www.lyrics007.com does. A whois lookup showed that the hosting provider is located in Houston, Texas, and that the domain is registered by someone in Hong Kong.

If I visit any of the lyrics pages, sure, the Google ads are relevant to the content of the page. But it seemed that the ad for ViEmu appeared when looking at a google-cached copy of the page. It’s weird but it may happen. I think the “vi” with weird characters in between may have tricked the adsense engine into showing my ad.

The visits coming from ‘lyrics007’ showed different types of activity. Sometimes just a hit to the ‘html’ file, other times regular page viewing involving hits for the graphics on the page, and even downloading the product! I even found other hits from the same IP addresses coming from ‘searchportal.information.com’.

So what may have happened? I have two possible explanations.

One is that a developer in Vietnam was looking for the lyrics to some song, using www.google.com.vn. Developers also listen to music and check lyrics once in a while. He clicked on the google cache, in order to access the page, and Google picked my ad (as, obviously, adsense technology is imperfect and the relevance of the ad is just a heuristic). While humming to the tune of the song, the guy in question saw the ad for my product, and was excited to finally see vi emulation in Visual Studio. He clicked on the ad and came to my site. He even downloaded it.

This would mean that I paid google and lyrics007.com for reaching a potential customer of mine – someone who I wouldn’t have reached easily in another way. Fair enough.

The weird thing is that this same guy has some other friends using the computer (or sharing the IP address) who went through exactly the same process several other times during the month. With different song lyrics, of course. And some of the times, their browser crashed before even hitting the ‘css’ file or the page graphics (or maybe they surf with images deactivated?).

They even downloaded ViEmu several times during the month – they must have a messy download directory.

This also happened from other domains, not only lyrics007. I haven’t researched them much, but they seem to come from nearby areas. If all of the cases have similar explanations, then the domain holders / adsense publishers are not to blame at all.

And then I have a second possible explanation.

Some guy in Hong Kong has set up several domains with song lyrics and other easily accessible content downloaded from other sites. As those guys are damn smart, they have figured out a way to force a google cache access to their page into showing any adsense ad. I’ve been trying to do it myself, and haven’t been able to, but the cache does show weird adsense results. Then, they have some kind of bot which accesses those pages and simulates clicks on the ads. They probably click on many “cheap” advertisers & keywords like mine, but every once in a while they might click on a 50 cent or even a $1 ad. I guess they can make quite some cash that way, apart from the legitimate traffic that their site drives. They even use another method based on ‘searchportal.information.com’ URL hijacking, which hides even more information from advertisers. And they have even improved the bot to fake normal access to web sites.

I can’t know which one is the right explanation. But, I talked to Andy Brice of PerfectTablePlan, and followed his suggestion of turning off advertising on the “content” network (adsense). I’m only advertising on google’s own search results, for which only google gets paid, and which removes the clickfraud incentive for 3rd party publishers.

I’ve also limited ads to specific countries. Mainly, I’ve limited the countries to those in which I already have customers:

  • USA
  • Canada
  • Russia
  • UK
  • Australia
  • Netherlands
  • Germany
  • Finland
  • Norway

I’ve also added other countries which I think are as likely as those to get me customers: Sweden, France, New Zealand, etc… but that’s about it.

Given that I have been spending less than €10 a month, the scam hasn’t been problematic for me. That’s the main reason it took me several months to investigate the issue and optimize the campaign – it doesn’t make sense to optimize expenses when they are among the lowest ones.

I expect my adwords costs to go down to one tenth of what they have been – which I think amounts to pretty much the legitimate/interesting traffic that I was getting anyway.

I’m pretty happy with the result of google ads – one customer did tell me that they had found out about ViEmu through an ad in google search. That single sale makes up for the rest, which I like to understand as the cost of my training with google adwords.

I hope this post doesn’t upset google – I believe it helps other people make better use of adwords, and thus also helps google have more happy customers!