Archive for August, 2005

Next steps

Monday, August 29th, 2005

Sorry for the delay – I ended up spending several days with the flu, which isn’t the best condition for getting anything useful done.

When I released ViEmu this past July 26th, I talked to a friend and told him “step one.” It is quite true: that was only the first step. One month later, I have released a new version incorporating suggestions from customers, I have had a few reviews (mostly positive, although some point out limitations, and others simply wonder about the whole idea), and I have had a few sales. Like myself, some people have found that using the vi/vim input model within VS is a big win, even though vi command line support is not present yet. My site is nowhere to be seen in Google results for obvious searches like “visual studio vi emulation” (it once ranked around #12, but has since disappeared into oblivion), but I have an AdWords campaign set up to handle that, and the site is indeed #1 for similar searches on MSN and Yahoo.

I’m very happy to have a product actually released and customers using it. I have already learnt a lot about setting up sales & marketing, deployment issues, receiving and acting upon customer feedback, and, overall, about the whole development-release-sales cycle. I’ve even learnt how to prepare a Microsoft Installer (.msi) file that performs automatic upgrades (the closest analogy I can think of is drilling a hole in your skull.)

Now I have to think about what the next steps are, as I have two “babies” to take care of: ViEmu and NGEDIT.

The next step for ViEmu is clear: adding “ex” commands support (the vi command line which allows gems such as :v/./.,/./-1join to compress empty lines, and similar useful tasks.)

As an aside, it would be good, on the one hand, to “back-port” the vi/vim emulation to NGEDIT. The emulation in ViEmu is implemented in C++ (originally ported from the NGEDIT Scripting Language), and is now much more complete than the original.

And as a second aside, some refactoring of NGEDIT is in order, now that I can see the code with more critical eyes (thanks to the perspective gained in two months away from the codebase.) I’ve actually already started with this, which is also a good way to get familiar again with the NGEDIT code (it’s already a small beast of some 50k lines of code.)

But these last two are minor tasks which shouldn’t take too long. The vi/ex command line support is more work, but I will surely work it out in the coming weeks or months: I need to write a regular expression engine (which I plan to share between both products), and a parser for the ex commands, which shouldn’t be too bad. (Note: following my tradition of technical postings, I’ll probably explain how I implement regular expression support – no, I won’t use any off-the-shelf libraries, partly because of the functionality I want to provide, but also probably due to my severe case of Not-Invented-Here syndrome.)
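To give an idea of the kind of machinery involved, here is a minimal backtracking matcher in the style of Rob Pike’s classic example from The Practice of Programming – an illustration only, not the actual NGEDIT/ViEmu engine, and the names (RegexSearch, matchhere, matchstar) are mine:

```cpp
#include <cassert>

// Classic backtracking matcher in the style of Rob Pike's example from
// "The Practice of Programming" - an illustration only, not the actual
// NGEDIT/ViEmu engine. Supports literals, '.', '*', '^' and '$'.
static bool matchhere(const char *re, const char *text);

// matchstar: search for c*re at the beginning of text
static bool matchstar(char c, const char *re, const char *text)
{
    do {    // a '*' matches zero or more instances
        if (matchhere(re, text))
            return true;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return false;
}

// matchhere: search for re at the beginning of text
static bool matchhere(const char *re, const char *text)
{
    if (re[0] == '\0')
        return true;
    if (re[1] == '*')
        return matchstar(re[0], re + 2, text);
    if (re[0] == '$' && re[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (re[0] == '.' || re[0] == *text))
        return matchhere(re + 1, text + 1);
    return false;
}

// RegexSearch: true if 'text' contains a match of 're' anywhere
bool RegexSearch(const char *re, const char *text)
{
    if (re[0] == '^')
        return matchhere(re + 1, text);
    do {    // must check even the empty string
        if (matchhere(re, text))
            return true;
    } while (*text++ != '\0');
    return false;
}
```

A real engine for an editor needs much more than this (character classes, alternation, submatch capture, decent behavior on pathological patterns), which is where the interesting implementation work lies.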

The main area that needed a full reevaluation was how to tackle the development of NGEDIT itself. I’ve actually worked out a new development and release strategy/roadmap, but let me first lay out the preliminaries.

I have some ideas for what I think will be “killer” features for a text editor. I may be wrong, but please bear with me and assume they are. These features are the reason I decided to develop a text editor in the first place, as the text editor market is already quite saturated.

Believing as blindly as I do in the value of the new features, I initially thought that other editor developers would rip them off as soon as NGEDIT was out. Maybe the precaution is unwarranted, but I decided to first develop a fairly complete text editor, mostly up to par with current editors (probably not with Visual SlickEdit, which is likely the most feature-loaded editor there is, but definitely with other popular text editors such as UltraEdit.) This way, I wouldn’t lose my edge with the new stuff.

It turned out that the whole feature set of a modern text editor is a heckuva lot of work to develop. I initially thought this part would be less work, but all of the development effort on NGEDIT so far has only started to bring it up to what other editors offer. Actually, I haven’t even started designing or implementing the actual innovative features of NGEDIT (and they do require quite a lot of research!)

Now comes the experience of ViEmu. The “echo” ViEmu has received has been a bit less than I expected. Releasing in August is probably not the best timing, I should probably wait more than a month before evaluating the results, and it will likely take some time and a few more releases before it becomes more widely known. But I have found out that, even if the internet is a great resonance chamber that amplifies remarkable products, it behaves as a dense and difficult-to-travel information mesh for products that are not remarkable, or which target too small a group.

I can’t help but think that ViEmu targets the “vi lovers,” probably much less than 5% of developers, who are themselves probably less than 5% of the general software-buying public. 5% times 5% is about 0.25% of the potential software-buying audience (and, yes, this is a bit of a faux argument, as you can never target 100% of the audience, but my point that ViEmu is a very niche product still stands.)

NGEDIT is a general purpose text editor, which already targets a much wider audience than ViEmu. This makes it a better starting point in order to generate a profitable business. But then, one thing is easy to see: even if I develop an editor with the 14,273 or so features other text editors have, that won’t make it remarkable. I could spend one year implementing everything down to spell checking, FTP file access, and emulation of some obscure and forgotten editor’s command set, and even then I would still have a slow start.

The point about usability is important, and will help it become a successful product, but that alone is not something that creates its own phenomenon.

Fortunately, I have the “killer” features to try to create a remarkable product. But then, does it make sense to spend a bunch of time to have just a “standard” product before I even start with them?

On the other hand, I’ve realized one thing: not only is vi emulation something that (at most) 5% of text editor users miss, even regular expressions are something that many programmers don’t use daily (or at all!)

And as a final element, my motivation starts to decline if my work consists of coding known-feature-after-known-feature. And given that I have created a quite powerful underlying technology framework, the codebase looks a bit like a Gruyère cheese full of holes that are designed-in but not filled in yet: double-byte character set support, filling in the template code for a lot of template-based generalized elements, completing the scripting framework, plug-in support, etc… All of these promise a lot of not-too-creative code-churning hours.

Meanwhile, I’m really eager to start researching, designing and developing the new features.

So, I’ve figured out that there is a much more sensible strategy for NGEDIT. To put it shortly, I will focus on the innovative features, and leave “completeness” in comparison to other editors as a secondary concern. I have two or three basic tasks to perform first, partly due to a few early wrong decisions which require a bit of code refactoring, but I think I will be able to start on the new features this week.

I’ve changed my immediate focus from the-complete-text-editor to a-nice-little-editor-with-nifty-stuff. I am currently using it myself as I make changes, focusing on making a better tool for me than vim or Visual Studio with ViEmu. If I can achieve that, I’m confident it will work likewise for other people (even for people who don’t care about vi-like editing – the new features have nothing to do with that, of course.)

I also know, from other projects and even other disciplines, that when you focus on immediate use, many other issues become apparent and pieces start to fall in place, and even completeness comes along.

Regarding the point of staying ahead of other text editors, or at least not too far behind, I think I have probably overestimated the risk. Even if it is successful, NGEDIT will take some time to catch on – usually measured in months – and I will be able to use that time to add the checklist features that NGEDIT is missing (although probably not C++ Intellisense yet, but that’s not so common either.)

I’m using NGEDIT as I go along. Developing a text editor is good in that you can use it and test it in its own development. On the other hand, it is a bit of a mess, even if you get somewhat organized, prepare deployment releases, and avoid using the bleeding-edge latest build (as I had to do for ViEmu, in an even more complex interaction involving the IDE). So, what I am doing is developing the regular expression library with NGEDIT, which I launch from Visual Studio itself, and the whole process is a bit less messy. ViEmu also benefits from this, even if it does require a bit of mental task-switching.

So: I’m focusing on shipping NGEDIT 1.0 as a very innovative editor, even if not as feature-complete as existing editors, and I’m really pleased to have taken this decision, which I think makes sense both development- and business-wise.

And only time will tell if the strategy was right, but then this is a bet!

Blog’s got the new look!

Tuesday, August 23rd, 2005

I’ve just finally finished and uploaded the new blog theme – it’s based on the main NGEDIT page, and it looks so much nicer than the old stock design!

I’ve also verified that everything works fine in both Firefox and IE. The problems with IE were caused by content that was too wide and overflowed the main post area. I’ve had to edit several old posts to fix this, so I apologize if your news reader brings up a lot of seemingly “new” posts which actually aren’t. Basically, I’ve had to retouch the “screenshot” post and many of the older code listings.

I’ll now look into some WordPress plugins to round up the “blog day.”

I know, I know, I promised I would be posting a summary of “business progress” (and some loud thinking on what the next steps are), but I don’t feel very well right now (I think I’ve got a cold.) Duh.

Updating WordPress & Preparing Theme

Tuesday, August 23rd, 2005

I have just updated to WordPress 1.5.2. Very quick & painless. Please tell me if something’s wrong.

I am also in the process of “skinning” the blog, so it may look temporarily funny in the next few hours (it was already looking very wrong through IE!)

I will post as soon as it’s finished.

ViEmu 1.1 released

Tuesday, August 23rd, 2005

Finally! I have just released ViEmu 1.1. It improves version 1.0 in many ways, including a new way of hooking into the Visual Studio environment: much better integrated, and compatible with third-party tools like Visual Assist or ReSharper (which was a recurring request with the previous version.) There are also a few minor vi/vim emulation improvements.

I plan on summarizing the general project status in another blog post, which I think will be interesting from a business rather than a technical perspective.

Next steps & screenshot

Wednesday, August 10th, 2005

ViEmu 1.0 has been out for two weeks now and I’ve had some interesting feedback. I’m getting ViEmu 1.1 ready, which solves some limitations in the integration with VS – ViEmu will take over all editing windows within VS, instead of being a separate editor type. This helps with the VB form and HTML editors within VS, and also avoids getting the standard editor instead of ViEmu through some UI elements that are not completely correct within VS (such as the “View Code” button, which bypasses the whole internal VS editor mechanism).

I think I will be able to release ViEmu 1.1 next week. This will address all the major outstanding issues in 1.0 and, barring unforeseen bugs or problems, it will be the last release for some time.

Right after that, I will be getting back to NGEDIT. I’ve had the chance to think quite a lot about NGEDIT during the development of ViEmu, and I will probably start by doing some refactoring of the code. Apart from this, there are two major pieces that are missing before I can feel comfortable with how complete the tech base of NGEDIT is: syntax highlighting and regular expression search. I also want to improve the core memory management code, so that it will better lend itself to hex editing of files and storing other types of information.

There are a myriad other things to do after that, in order to cover what is expected of a modern text editor. They will take quite some time, as each little detail needs its own love and care. But, after the refactoring and the two major features, I will already be able to start focusing on the most important part: the UI. I have a ton of ideas that I’m looking forward to implementing and trying out.

I thought it might be time to post an early screenshot of NGEDIT. It is from April, and quite incomplete, but it gives a good idea of the general look. The toolbar icons are shown large so the drawings can be appreciated 🙂 (click to see a full-size version)

Screenshot

Since I’ve checked out the beta of VS2005 to port ViEmu to it (the port is already working), I’ve seen that Microsoft has gone with a similar look for the gradient-based UI background. With a bit less taste, if I may say so myself 🙂

Miscellaneous issues

Thursday, August 4th, 2005

1. I’ve donated 5 licenses of ViEmu to Seth Dillingham’s PMC fund, which are auctioned together with other software in order to raise money for the Jimmy Fund for cancer research.

2. Some searches such as “vi keymap Visual Studio” are already landing on the ViEmu page, as the logs show. Not many, but they are. The Google ads I set up over a week ago are not showing. And the main ViEmu page is already #12 in Google when you look for “vi emulation visual studio”.

3. After some input from people checking out ViEmu, I found out that ViEmu does not show up when the “View Code” button is pressed in the project window in VS. Given that I mostly use C++, and this button is only there for VB/C#, I didn’t even know about it. Some testers have used ViEmu with C# or VB, but it seems they didn’t use that button often. I spent countless hours finding out what was happening – mind you, the only way to track down such a thing is to step through the assembly code of VS (heavily COM-based code), which is quite a pain. I finally found out that the C# project-managing code directly calls up the standard C# editor in this case, bypassing the general VS editor mechanism. The only way to fix it will be to intercept all code windows in the environment, which I was already planning to do. This will also allow ViEmu to be used with multi-view editors such as HTML, but it will take some work (esp. for the OLE command stuff, which is less controllable if ViEmu is not a standalone editor but an interceptor of a standard one).

4. I’ve already checked ViEmu with Visual Studio .NET 2005. Works like a charm. A person from the VS team wanted to check ViEmu out and that prompted me to try porting it. For some weird reason, the VS2005-specific release I prepared does not install on his machine. As it works on mine, I have a hard time testing it. Sigh. I guess VS2005 being still in beta helps explain it.

5. I have some days off from my day job, so I’m out of the city, and I am using a pretty slow Internet connection. And we used to call this “Internet”? I think “lame excuse” would serve it better.

Unicode, text management, and C++ techniques (and III)

Tuesday, August 2nd, 2005

If you are a C++ programmer, I recommend you read this article – the techniques discussed are quite interesting and could be useful for your own non-text-related projects.

In the last article in the series, we saw what UTF-8 is about. And I promised to cover some interesting techniques in order to handle this maze of encodings. Let’s get to it.

In order to abstract out the differences between text encodings, I decided to implement the concept of a codec: a text coding and decoding module that can convert between a particular encoding and some general interface that the main editor uses.

As we saw in the last article, using a base class with virtual functions has two important drawbacks: the first one is that all access is through costly virtual function calls (at least, quite costly compared to raw byte-size character access), and the second one is that it most probably forces us to use full unicode for the arguments and return values.

So, I decided to implement the codec as a class which is used as a template argument. There is one such class for each supported encoding (TCodecNativeSimple, TCodecForeignSimple, TCodecUTF8, TCodecUTF16BE, TCodecUTF16LE, TCodecDBCS, and TCodecGeneral). These classes are not meant to be instantiated, and each doubles as an access mechanism to codec-defined specifics – that is, it only has public members, and it doesn’t have any member variables with the exception of const static members (the C++ idiom for constant class-wide values).

For example, each of these classes contains a nested TLineDecoder class. So, we can instantiate a TCodec::TLineDecoder object in order to walk a line of encoded text char by char and do whatever specifics we may need.

But the greatest strength of this technique comes from defining types within each codec. Each codec additionally defines a TChar type, which represents the preferred type for manipulating such text.

For example, the native-simple codec is used for the platform-native text encoding, but only when such encoding is a single-byte-per-char encoding (eg, US and European native codepages qualify, whereas Japanese and Korean native text is not handled by this codec). This codec doesn’t require converting the characters input by the user, and can be output via native Windows API calls. And the TChar type for this codec is a simple byte.

As another example, the foreign-simple codec is used for one-byte-per-char text encodings which are foreign to the current platform (for example, US-Windows-codepage in a machine using another codepage as native, Mac text on a PC, or any of the ISO encodings such as Latin1, etc…). Given that this text cannot be reliably represented in a single byte, the TChar type in this codec maps to TUCS4Char (a full 4-byte Unicode codepoint).

We use this mechanism in order to map concepts in as many levels as we want. This allows us to map both high- and low-level concepts, so that we can have the required level of access in every part of the main application without performance getting hit. I really hate it when a concept that makes development much more comfortable makes a significant compromise in runtime performance.

Apart from operation classes (such as TLineDecoder) and basic types (such as TChar), the codec class also features some static const members that represent features of the encoding. For example, all codecs have a c_bFixedCharWidth boolean member which indicates exactly that: whether encoded chars are all of the same byte length.
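To make this concrete, here is a minimal sketch of what such a codec class could look like – a hypothetical single-byte TCodecAscii of my own making, not the actual NGEDIT code, but it shows the nested TChar type, the TLineDecoder operation class, a static predicate, and the c_bFixedCharWidth constant:

```cpp
#include <cassert>

typedef unsigned char byte;

// Hypothetical single-byte codec - my own illustration, not NGEDIT's code.
// It shows the shape of the interface: a nested TChar type, a TLineDecoder
// operation class, a static predicate, and a compile-time constant.
struct TCodecAscii
{
    typedef byte TChar;                       // preferred manipulation type
    static const bool c_bFixedCharWidth = true;

    static bool IsWhiteSpace(TChar ch) { return ch == ' ' || ch == '\t'; }

    class TLineDecoder
    {
    public:
        TLineDecoder(const byte *psz, unsigned uLen, unsigned uOff)
            : m_psz(psz), m_uLen(uLen), m_uOff(uOff) {}

        bool IsAtEnd() const { return m_uOff >= m_uLen; }

        // Single-byte text can never be malformed, so this constant result
        // lets the compiler remove the validity check from client loops
        bool IsInvalid() const { return false; }

        TChar GetChar() const { return m_psz[m_uOff]; }
        unsigned GetCurPos() const { return m_uOff; }
        void Advance() { ++m_uOff; }          // just an offset bump here

    private:
        const byte *m_psz;
        unsigned m_uLen, m_uOff;
    };
};
```

A template function parameterized on the codec can then use these members directly, and the compiler specializes and optimizes the generated code for each encoding.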

As an example of how this works, the function to find whitespace which we have used as an example may be written like this:

template<class TCODEC>
unsigned FindWhiteSpaceRight(
  const byte *psz, unsigned uLen, unsigned uOffStart
)
{
  typename TCODEC::TLineDecoder ld(psz, uLen, uOffStart);

  while (!ld.IsAtEnd())
  {
    if (ld.IsInvalid())
    {
      // Handle in some way - here, just skip the invalid sequence
      ld.Advance();
      continue;
    }

    typename TCODEC::TChar ch = ld.GetChar();

    if (TCODEC::IsWhiteSpace(ch))
      return ld.GetCurPos();

    ld.Advance();
  }

  return uOffStart;
}

Let’s look at some aspects of this code. For one, you can see that we are indeed checking for invalid characters. For encodings that may present invalid encoded characters, this function will check validity. But for encodings that can never encounter invalid encoded characters, the call to IsInvalid() is hardwired to return ‘false’, so the compiler will optimize that part of the loop away! The same optimization happens for a function such as Advance(), which amounts to just a pointer increment for the most common one-byte-per-char encodings, while the very same code is properly compiled into all the complex mechanics involved in decoding UTF-8.

Code that checks for TCODEC::c_bFixedCharWidth with a seemingly runtime ‘if’ will also be evaluated and optimized out in the compile stage of Release builds, as the compiler is smart enough to see it is actually a compile-time constant.

And, as a final remark, we talked about TAB character decoding at the end of the last article. It turns out that having TAB characters in a file involves quite a lot of complexity, as the offset within a line loses any correlation with the graphic column. But this is not the case for files which sport no TAB characters, and we would be losing performance on those because of it. One way to handle this seamlessly: abstract TAB handling behind a scheme such as the one above (I call a TABBER the concept equivalent to a CODEC for TAB decoding), and choose between two TABBERs depending on whether the file contains TABs when loading. You can always switch to a non-null TABBER if the user inserts a TAB character. For people like me, who prefer not to use TAB characters at all, this is a win in most if not all editing sessions.
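As an illustration of the TABBER idea (the names and interface here are my own guesses, not NGEDIT’s actual code), a null tabber and a TAB-expanding tabber might look like this, with a template function that compiles down to a trivial copy of the offset for TAB-free files:

```cpp
#include <cassert>

// Illustrative sketch of the "tabber" concept - my own names, not NGEDIT's.
// The null tabber makes column == offset; the general one expands TABs.

struct TNullTabber          // for lines known to contain no TABs
{
    static unsigned OffsetToColumn(const char *, unsigned uOff, unsigned)
    {
        return uOff;        // trivial: offset and column coincide
    }
};

struct TTabber              // general case: expand TABs to the next tab stop
{
    static unsigned OffsetToColumn(const char *psz, unsigned uOff,
                                   unsigned uTabWidth)
    {
        unsigned uCol = 0;
        for (unsigned u = 0; u < uOff; ++u)
            uCol = (psz[u] == '\t') ? (uCol / uTabWidth + 1) * uTabWidth
                                    : uCol + 1;
        return uCol;
    }
};

// A file picks its tabber at load time; the template then lets the
// compiler reduce the TAB-free case to essentially nothing.
template <class TABBER>
unsigned ColumnOf(const char *psz, unsigned uOff, unsigned uTabWidth)
{
    return TABBER::OffsetToColumn(psz, uOff, uTabWidth);
}
```

Selecting TNullTabber at load time for TAB-free files means the common case pays nothing for the generality, in the same spirit as the codec mechanism above.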