Saturday 26 December 2009

Snake game: when programming isn't compatible with modern theories

I saw a question about how to create a snake game "like on the Nokia mobile phones". The answers differed a lot from each other - but few of them talked about objects (OOP). The general consensus was that the snake should be represented as a doubly linked list, linked using pointers.

The first time that I saw the game, was on my brother's computer. He had built the computer himself, based on a 4MHz Zilog Z80A CPU with 4 kbyte RAM. Everything was home made - the graphics electronics design, etching the boards, a hex keyboard for his machine code boot software, and a high level language and all the software on top of it. However, only the boot software and the high level language were stored in an EPROM (which requires UV light to be erased), and there was no other persistent storage, so every program that you wanted to use had to be typed in again before you could run it.

It was back in 1981, and I was allowed to use his computer for gaming, and I knew how to use the hex keyboard to instruct the boot software to start up the high level language, so that was great. I just had to program the snake game every time I wanted to play it, so I got quite good at it. And each time I programmed it, I made it a bit different, of course.

Enough about the background, here is the data structure of a snake game:

type
  // The direction values (spUp..spRight) mark snake body cells and point
  // towards the next element of the snake, i.e. towards the head.
  TItem=(spEmpty, spFood, spPoison, spWall, spSnakeHead, spUp, spDown, spLeft, spRight);
var 
  arr:array[0..79,0..24] of TItem;  // the 80x25 screen doubles as the snake storage
  TailX,TailY,HeadX,HeadY:integer;
  SnakeLength,LengthToAdd:integer;  // LengthToAdd: how much the snake still has to grow

This is how you add food or poison, avoiding snake positions, wall etc.:

procedure AddItem (item:TItem);
var
  x,y:integer;
begin
  // It can be argued that this loop does not
  // always end within finite time, but who cares?
  repeat
    x:=random(80);
    y:=random(25);
  until arr[x,y]=spEmpty;
  arr[x,y]:=item;
end;

Then you need this one:

procedure MoveCoor (var x,y:integer;dir:TItem);
begin
  case dir of
    spUp:dec (y);
    spDown:inc (y);
    spLeft:dec (x);
    spRight:inc (x);
  end;
end;

So when you want to move the snake's head, you do this:

arr[HeadX,HeadY]:=MoveDirection; // the old head cell now points towards the new head position
MoveCoor (HeadX,HeadY,MoveDirection);
found:=arr[HeadX,HeadY];
arr[HeadX,HeadY]:=spSnakeHead;
if found<>spEmpty then begin
  if found=spFood then LengthToAdd:=SnakeLength div 2
  // Alternatively just set LengthToAdd:=1
  else (* End game, we ran into something that is not healthy *);
end;

Increasing the length is simple: just keep a counter of how much longer the snake needs to become, and every time you are about to move the tail, either decrement that counter, or when it is zero, actually move the tail. This is how to move the tail:

if LengthToAdd=0 then begin
  dir:=arr[TailX,TailY];
  arr[TailX,TailY]:=spEmpty;
  MoveCoor (TailX,TailY,dir);
end else begin
  dec (LengthToAdd);
  inc (SnakeLength);
end;

As you can see, a snake game is seriously simple, uses almost no CPU, is very short in source code and uses very little RAM. There is not a single object in all this, because OOP hadn't reached home computers back then.

Could this be done more easily today using modern programming? I haven't found a better method, yet. If we were to describe the old method with modern terminology, what would that description look like? The snake must be some kind of object, but its data is stored in an array that represents the screen. However, the screen array doesn't just show the presence of the snake, but also the direction to go in order to find the next element of the snake. So, basically, according to modern criteria, this code is unstructured, messy, non-scalable, non-maintainable etc. However, in my opinion, as long as the code is easy to read, it is maintainable, and if it scales within the required boundaries, it's scalable. It would be easy to encapsulate the "Snake game engine" into an API, hiding the implementation details, so I definitely still consider this to be great code.
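
For illustration, here is a minimal sketch (my own addition, not code from 1981) of the interface such an encapsulation could expose, keeping the screen array and the direction trick as hidden implementation details:

type
  TSnakeGame =
    class
    private
      arr:array[0..79,0..24] of TItem;    // the screen doubles as the snake storage
      TailX,TailY,HeadX,HeadY:integer;
      SnakeLength,LengthToAdd:integer;
    public
      procedure Start;
      procedure SetDirection (dir:TItem);  // spUp, spDown, spLeft or spRight
      procedure Tick;                      // move head and tail one step
      function GameOver:boolean;
    end;

The implementation of Tick would simply be the head and tail code shown above, unchanged.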

Wednesday 23 December 2009

Unicode technical paper from Cary Jensen

Cary Jensen has produced a White Paper about migrating to Delphi 2009, with real stories from real migrations. View it here or download it here. The paper is the result of his call for migration stories, earlier, and a lot of people contributed with examples, including me. Besides being a good paper to read before migrating, it also explains many things that most may not be aware of.

Tuesday 22 December 2009

Kylix comeback or something better?

Heise.de reports, that the Qt toolkit is currently being ported to Google's Native Client (NaCl). Qt was known as the GUI framework for Kylix on Linux, and once Qt has been ported, it seems like a fairly easy thing to make Kylix able to compile GUI apps for Google NaCl, providing a development kit that creates GUI apps in native code that runs sandboxed inside a browser.

Does this make sense? Some of it does. Delphi/native developers create cool GUI apps, but most of them are networked. If we can create applications, that are delivered as easily as web pages, using the same source code as our native Win32 apps, that would be great. However, Qt or CLX would not be the best framework for it... so if Embarcadero delivers a CLX-based tool for Qt & NaCl, most users would probably initially not dare to use it for anything other than products with a short expected support lifetime. However, the internet has a lot of these.

In order to invest a large amount of R&D money into apps developed using Delphi for Qt&NaCl, right from the first version, we need at least TCanvas support, but preferably support for the visual components of the VCL.

I seriously hope that there is a business case for some of this. Embarcadero has a unique chance to create something great, based on existing technology. One of the cool things about Google Chrome and Google NaCl is, that they do not require administrative rights to be installed on a PC - unlike the .net runtime. And with Google Chrome Frame, even MSIE will be able to run this.

Friday 18 December 2009

Call for learning about bits and bytes

I still often encounter people with a Master's degree in Computer Science, Information Technology or whatever, who are not used to thinking in bits and bytes. The world of computing is huge, and I fully understand that the universities need to filter and carefully select what topics they should teach their students. However, there is a huge number of topics where knowledge about bit manipulation is important: IT security, resource leaks, format compatibility, communication protocols, low level APIs, data conversion, and even such things as understanding the UTF-8 used in XML and many other places. There is hardly a problem that cannot be understood by looking at it as bits and bytes.

Therefore, my advice to Computer Science students is this: Learn and understand bits and bytes, addressing, data formats (including IEEE 754!), binary operators, pointers and offsets. Don't consider yourself a good programmer before you know these by heart.
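
As a small example of the kind of thinking I mean (my own sketch, not from any particular textbook), this is how binary operators reveal the structure of a UTF-8 lead byte:

function Utf8SequenceLength (LeadByte:Byte):Integer;
begin
  if (LeadByte and $80)=$00 then Result:=1       // 0xxxxxxx: plain ASCII
  else if (LeadByte and $E0)=$C0 then Result:=2  // 110xxxxx: 2-byte sequence
  else if (LeadByte and $F0)=$E0 then Result:=3  // 1110xxxx: 3-byte sequence
  else if (LeadByte and $F8)=$F0 then Result:=4  // 11110xxx: 4-byte sequence
  else Result:=0;                                // continuation byte or invalid lead byte
end;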

Thursday 10 December 2009

When to use record instead of classes

The original use of the record construct in ObjectPascal was to group different values in a way that can be used in arrays, parameters etc. Classes have taken over this job, but the record construct is still sometimes important.

For instance:

type
  TTest=
    class
      a,b,c:integer;
    end;

var 
  arrT:array[1..1000000] of TTest;
  i,c:integer;

begin
  for i:=low(arrT) to high(arrT) do begin
    arrT[i]:=TTest.Create;
    arrT[i].a:=2;
    arrT[i].b:=3;
    arrT[i].c:=4;
  end;
end;

On my computer, this takes 161ms first time, and 91ms second time. Reading it like this, takes 9ms:

begin
  c:=0;
  for i:=low(arrT) to high(arrT) do begin
    c:=c+arrT[i].a+arrT[i].b-arrT[i].c;
  end;
end;

If you rewrite this code to use array of record, it looks like this:

type
  RTest=
    record
      a,b,c:integer;
    end;

var 
  arrR:array[1..1000000] of RTest;
  i,c:integer;

begin
  for i:=low(arrR) to high(arrR) do begin
    arrR[i].a:=2;
    arrR[i].b:=3;
    arrR[i].c:=5;
  end;
end;

This takes 17ms first time and 9ms second time. Reading it takes 6ms:

begin
  c:=0;
  for i:=low(arrR) to high(arrR) do begin
    c:=c+arrR[i].a+arrR[i].b-arrR[i].c;
  end;
end;

In other words, in this specific case, a record is about 10 times faster than a class, using Delphi 2009. Adding a string that does not receive a value does not change the timings much. Instead of 17ms, it now takes 23ms, but the record size has increased, so that makes sense:

type
  RTest=
    record
      a,b,c:integer;
      s:string;
    end;

However, if you assign a value to the string, performance drops:

begin
  for i:=low(arrR) to high(arrR) do begin
    arrR[i].a:=2;
    arrR[i].b:=3;
    arrR[i].c:=5;
    arrR[i].s:='234';
  end;
end;

In this case, the record solution takes 124ms, and the class-based solution takes 214ms. In other words, record only beats the class solution by a factor 2. The reason? Each string value assignment is as complicated as creating an object.

Conclusion: record beats classes in speed, but the benefit is only significant if you use less than one string value per record, and don't use classes inside the record. The biggest improvement is in creating the data structures, not in reading them.
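
For reference, the timings above can be taken with a minimal harness like this (my assumption about the measurement method, not necessarily how these numbers were produced):

uses
  Windows;
var
  t0:Cardinal;
begin
  t0:=GetTickCount;   // GetTickCount has a resolution of roughly 10-16 ms
  // ... run the creation or reading loop from above here ...
  Writeln ('Elapsed: ', GetTickCount-t0, ' ms');
end;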

Monday 7 December 2009

Why there is no app store on Windows

How long does it take to install Microsoft Office? Well, first you have to order it, and then you have to wait for it to arrive by mail, on a physical medium. Ever wondered why there is no app store on Windows?

Steve Ballmer explains it in an article on CNET: "nobody has any trouble getting apps"

I'm obviously nobody, so there must be another reason. Here is my best guess: An app store on Windows would be a target of regulation by authorities, and this would make it hard for Microsoft to design the app store in a way, that makes Microsoft applications the obvious choice for Windows users. Microsoft seems to take it for granted, that the alternative to getting an app on Windows, is to get the app on another PC that does not run Windows. For many Windows software vendors, it has always been the rule of thumb, that the user picks the application first, and the OS next - but...

as Ballmer puts it: "The whole Internet is designed basically for the Windows PC."

So, why is everyone excited about "app stores" like in Ubuntu, iPhone, Playstation, Android etc.? Ballmer explains it this way:

Ballmer: "...you need so many apps in a mobile app store is to remap Web sites that were written for the PC to look good on a mobile phone..."

Seen from a marketing point of view, there is no doubt that brands that made it big on the web, like Facebook, are also good for marketing app stores. However, most of the apps that I downloaded from an app store have either used the built-in GPS, used the built-in camera, improved the phone's user interface, saved phone bill costs or interacted with other apps on the phone, like the contacts database. None of this can be done from a web app.

Since Microsoft does not seem to be motivated to build an app store, what about Google, Oracle/Sun and IBM? They do everything they can to make Windows substitutable, and do not have an interest in creating an app store for Windows. Sun was discussing a Java app store, but that doesn't really solve the problem. Creating an Open Source app store has 2 problems: lack of accountability and lack of wide deployment. The app store needs to be very widely deployed in order to be attractive - you cannot ask people to download and install an app store, it needs to be there automatically, maybe as part of something else (like bundling with a browser).

There seems to be no reason to expect a Windows app store any time soon.

Sunday 29 November 2009

MSIE market share below 13%, the problem of not upgrading

MSIE is still by far the biggest browser on a world scale, but segmentation paints a different picture:


* In some market segments, MSIE 6.0 is still used on more than 80% of all PCs.
* Some websites have MSIE below 50% (like W3Schools or some Delphi developer sites).


Users seem to upgrade their browsers very differently. MSIE is still used in very old versions on many PCs, Firefox users are much better at upgrading, and Chrome self-upgrades automatically. As an example, this blog has this distribution for the last 30 days:


* Firefox 52%
* MSIE 18%
* Chrome 15%
* Opera 8.7%
* Safari 4.0%


However, if you divide it by major browser versions, in order to see what standards the site needs to support, it looks like this:


* Firefox 3.5 41%
* Chrome 14%
* MSIE 7 12.7%
* Opera 9 8.7%
* Firefox 3.0 6.8%
* Safari (version > 520) 4.0%
* MSIE 8 3.2%
* MSIE 6 2.5%


As you can see, Chrome climbs to second place, and MSIE 7 is in 3rd place, but with a downwards trend. As others have noted, MSIE 7 market share on a global scale started its downwards trend before the release of MSIE 8, and MSIE 8 will probably not gain enough upwards momentum to regain the lost territory any time soon.

There are good reasons for the Bring Down IE 6 campaign, even though it doesn't make sense in some industries that depend heavily on IE6. But actually, one of the most important supporters should be Microsoft... as the numbers show above, the slow adoption of new versions of MSIE is a market share killer, and for Microsoft it would make sense to ask users to upgrade to MSIE 8 asap.


Many websites do not specifically support Opera, Chrome, Safari and other browsers. Maybe they should categorize their numbers differently and reconsider which browsers they should support?

Friday 27 November 2009

Google Wave is a software development platform

Google Wave has been reviewed in multiple places, but mostly by looking at the usefulness of the GUI tool that Google has made available. Instead, this post will focus on its ability to compete with alternatives.

Google originally launched it as "e-mail as it would have been if we designed it today". However, it actually does not compete well with e-mail, for several reasons:

* There is currently no gateway for e-mail, indicating that it may be a problem to integrate it well with other messaging systems (SMS, MMS, SMTP, etc.)
* It can be very confusing to find out, who wrote what to whom and when. The replay function does not give a quick overview.
* It can be hard to find out, where in a wave there are changes.

There are many more reasons why Google Wave doesn't compete well for mails. It also does not compete well with most IM systems:

* In Google Wave, Person A can add person B to a wave that includes person C.
* The chronology is not 100% clear.

Again, there are more reasons. Google could probably build many of the features of e-mail and IM systems into the Google Wave protocol, like "do not allow participants of this wave to include other participants" etc., but a perfect IM system, built on top of Google Wave, would probably not be much different from other IM systems.

Collaboratively editing a document works much better in Google Spreadsheet than in Google Wave, simply because Google Spreadsheet delivers more structure to the document, with columns, tabs etc. For almost every generic purpose in Google Wave, there is a specialized application that does the job better, and these specialized apps usually work well together in a session.

Gadgets change the game: Gadgets make it possible to do things like collaboratively edit a mind map. This is great stuff, but it could have been done in Google Apps, as a new mindmap tool, too. As long as Google delivers it all, and you need to sign into Google to use it, there is not a huge difference. You can also insert gadgets in collaboratively edited Google Spreadsheet documents, so inserting Gadgets in Google Wave is not a benefit per se.

If an application developer wants to create a gadget for collaboration, this can be done in Google Spreadsheet or Google Wave. In both cases, the gadget needs to be available on a central server. However, there is one big difference: With Google Spreadsheet, data is stored in a single online service, whereas Google Wave makes it possible to have the data available on multiple servers at multiple companies at the same time.

Therefore, we can define Google Wave as: A platform for online collaboration applications, that features decentralized data storage and decentralized user authentication.

Google Wave becomes interesting when one of these events happens:

* the main user interface gives easy access to great collaboration gadgets
* companies start adopting Google Wave internally in their organization

Saturday 21 November 2009

The power of app stores and usability

I use Vopium to reduce my phone bill when making international calls and calling back home from other countries. Very nice product, huge savings, no subscription fee, works seamlessly when making calls, and easy to install from their homepage. However, the obvious thing is to install such tools from the app store, right? So when I had to reinstall it this week and went for the Nokia app store, it was empty. There was just one tool in there: a new version of the Nokia app store (named Ovi Store). Using an expensive data connection in a foreign country, that's just extra cost.

Nokia's usability department seems to have been on vacation for the last couple of years, and this new version isn't better, even though it should be a high priority for Nokia to keep their smartphone market share. The online version of Ovi Store isn't much better, because when I go into the online store using my phone, it first tells me that there is a better way than the HTML version: I can use the Nokia Ovi Store app. It asks me if I want to use the app or continue to use the website. If I choose to use the website, it shows the first page of the mobile version of the website, and then it automatically starts the Nokia Ovi Store app, moving away from the HTML version. If I'm not allowed to use the HTML version, why did it ask?

If Nokia's market share for smartphones continues to drop, usability must be the reason. Fortunately, Vopium's homepage works perfectly in the Nokia mobile browser and solved my problem, so I didn't have to use the Ovi Store.

If you are worrying about usability in your project, I can recommend the usability works of Søren Lauesen. In contrast to Jakob Nielsen, Søren's works contain more generic and deterministic methods.

Monday 16 November 2009

Std. cookie use outlawed in EU

The EU has investigated internet technology, and discovered that http-cookies are an invasion of privacy. Therefore, a new directive has been made that requires consent before using cookies. To many programmers, this seems idiotic - cookies have worked well for 15 years, and continue to do so, and many businesses require them to be able to track users around. Even more, cookie permissions could easily be handled in the browser, but most people disable such prompts because many websites are annoying to use if the popups keep appearing. So this directive, which will become law in the EU, is a game changer, and it seems to have caught much of the industry by surprise.

However, who says that we should always push the limits of what technology can do, disregarding common sense for how to build a sane society? This new directive just means that cookies are not delivered unless consent is given. How are we going to implement this? You can ask for consent for sending all the cookies you want, so that your site can continue to work as before. Or you can switch to methods other than cookies for handling sessions. Using URL-based session identification makes the URLs annoyingly longer, so changing all links to POST requests actually makes sense, even though it's surely not nice.

Besides the consent, there is actually something new: The "informed" part. What happens when non-technical users start to learn about what cookies can do? Will they just ignore it and move forward, or will it actually reduce the amount of cookies? Will there be technical changes to how cookies work? Which other technology will be the next to be regulated for privacy?

One thing is sure: technical workarounds are not meant to be legal. If the user can be tracked, no matter if it is by cookie or something else, there must be an informed consent.

Sunday 1 November 2009

The case for Domain Specific Languages

Instead of wondering why Domain Specific Languages (DSLs) make sense, let's try to look at the number of people doing programming. According to various sources, there could be about:

* 9 million programmers in the world
* 0.5 million professional programmers in USA+Canada
* 25,000 self-employed programmers in USA+Canada

I live in Denmark, and can relate to these numbers - the situation probably looks similar here. However, a recent official statistics report about internet usage in Denmark asked the population whether they have ever written a computer program. 18% of Danish men answer yes, 7% of Danish women answer yes, giving a total of 13% of the Danish population. This also seems very realistic, since primary schools, high schools and universities all teach programming. Compare this to the fact that only 48% of Danish men have tried to compress a file, and 25% of Danish women have tried the same, and connecting peripherals has been done by 72% of men and 51% of women. For background, 86% of Danish families have a computer at home, 83% have internet, 76% have high-speed internet. For families with children, 98% have a computer and 97% have internet. This is age-dependent, of course, and only 65% of the above-60 y.o. have internet at home. Education also influences the percentages, and 94% of all university graduates have internet at home - and remember that 15% of the population is more than 65 y.o.

So, if the number of people who are capable of writing a program is far larger than the number of professional programmers, it makes sense to use the knowledge of these people to automate processes that a professional programmer would struggle to understand. That's one place where DSLs make sense.

Sunday 25 October 2009

How to do daylight saving time properly

EU switched to standard time this morning, and as always, this causes trouble. Clocks have to be adjusted, computers can get confused in their scheduling, and IT systems failed during the night. My harddisk TV recorder did not have any TV program after 02:59 but had a proper error message for each TV channel. Here is how you can create software that does not have these kinds of problems:

The first thing to realize, is that all timestamps in your software must contain the offset to UTC. If you have a simple TDateTime, it does not contain that, so TDateTime is simply not good enough. Because the Windows API is focused on timestamps that are not UTC compatible, and because the Windows API was never meant to be used with UTC-offset timestamps in the future or in the past, we can look at alternatives. Linux does a very good job at handling all this, so this blog post will explain how to do it the standard Linux way.

First, transport and store all timestamps as "the number of seconds since xyz", where January 1st, 00:00 UTC is a good choice. Also, dismiss the use of leap seconds, so that all hours are 3600 seconds; that makes things much easier. If you need better resolution than seconds, use milliseconds or floating point numbers, but integers are really handy in many contexts, so keep it in integers if you can.

Next, realize that day/month/year/hour/minute/second is only necessary when interacting with humans. So do not convert to date and time unless you need to show it to the user, and convert it from date and time as soon as possible after input. As you may note, the conversion requires a ruleset. This is usually specified as a location, like "Europe/Berlin". This way, the conversion knows about historic and planned daylight saving, and other peculiarities about time. For instance, the October revolution in Russia happened in October in Russia, but it was November in Berlin, because Germany had switched calendar system; represented as integer timestamps, that is not a problem. Until modern times, different countries used different calendars, and even in the USA, some states operate with several different rulesets, depending on where you live inside the state.

If you want to show a timestamp to the user, you may have to consider the special case where there are two date/time combinations that are impossible to tell apart without extra information. For instance, Europe/Berlin 2009-10-25 02:30. We had two of these today.

Let's take an example: You want to create a chart, where the first axis shows one day, from yesterday until today. You can specify this in multiple ways:

* Start timestamp, End timestamp
* Start timestamp, duration
* End timestamp, duration

You can choose to say "from 2009-10-24 12:00 to 2009-10-25 12:00", or you can say "from 2009-10-24 12:00 and 24 hours onwards". The first choice actually gives 25 hours (!), so you need to make a choice here. If your chart always shows 24 hours, make sure that duration is specified.

Let us assume that we want to create a 24-hour chart. Then, you can simply find the values for the X-axis from the start timestamp, by adding the desired number of seconds. If the start timestamp is the integer number 1256410800, then just add 3600 (1 hour) 24 times, and you have all the timestamps you need. In order to show the chart labels, you need to convert the integers to human readable text. 1256461200 converts to "2009-10-25 02:00" or just "02:00", but the next hour timestamp 1256464800 is also "2009-10-25 02:00". Your entire axis becomes: 12 13 14 15 16 17 18 19 20 21 22 23 00 01 02 02 03 04 05 06 07 08 09 10 11 (2 times "02" and no "12").
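
Here is a minimal sketch of the conversion step (my own illustration; UnixToDateTime from the DateUtils unit returns UTC, so rendering Europe/Berlin local time correctly would still need a timezone ruleset on top of this):

uses
  SysUtils, DateUtils;

procedure PrintAxisLabels (StartTimestamp:Int64);
var
  i:Integer;
begin
  for i:=0 to 24 do
    // add whole hours to the integer timestamp; never add 1 to the hour field
    Writeln (FormatDateTime ('hh:nn', UnixToDateTime (StartTimestamp+Int64(i)*3600)));
end;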

The next problem is how to save this in a database. Integers are obviously easily stored in integer fields. So, is everything solved now? Definitely not. It is absolutely not easy to debug/read integer timestamps in a database using database tools that cannot convert them to something human readable, and many chart tools do not support time axes with complex conversion to labels like described above. Linux was built for all this, but Windows wasn't. When you need to support daylight saving time properly, things get complex on Windows.

Does Windows provide an easy, standardized alternative? Unfortunately not. That's why we're still struggling with a lot of IT systems that do not support daylight saving time well. However, even if you support it well, you still have the problem of users that may not understand how to handle non-existent and duplicate hours.

Monday 19 October 2009

Google NaCl roadmap

Just in case you haven't studied Google's "native code safer than JavaScript" project, based on this video from Google, here is a very short summary of Google Native Client (NaCl).

Current status is:

* Still under review, but basically works
* x86 32-bit machine code is supported
* Non-accelerated graphics
* Sandboxed
* Delivers full native code performance in your website code. A video decoder delivered as part of a webpage is almost the same speed as a native video decoder for the target operating system.
* Cross-platform runtime library that makes the same native code run on several operating systems

The future brings:

* At least as safe as JavaScript (i.e. run native code off untrusted websites)
* Built into browsers (Chrome and others)
* 64-bit x86 support, ARM CPU support
* Fast 3D graphics using O3D from native code, suitable for 3D games and CAD applications
* Real-time applications

As far as I can see, Kylix and FreePascal can compile for this already, and it seems that one of the next Delphi versions will be able to compile for this, too.

Thursday 8 October 2009

Parallel programming requires multiple techniques, not just one

There seems to be a huge search out there for the holy grail of parallelism. Some want functional programming to be the solution, others think about other solutions. However, the scale of the problem is often ignored: parallelism is introduced on a huge number of levels, each with different solutions.

On the bit level, we can handle multiple bits at the same time. A CPU can handle 8 bits, 16 bits, 32 bits, 64 bits in one step. The more bits that are handled, the more parallelism we have. However, you cannot just use 1024-bit arithmetic and get more speed; there's a limit.

On the instruction processing level, pipelines make it possible to execute multiple instructions with less time between them than the time it takes to execute one full instruction. The CPU simply divides instructions into multiple parts, and executes instruction parts in parallel, so that fewer parts of the CPU are sitting idle. However, this obviously has a limit - and I guess most readers know the tradeoff between pipeline length and speed in games.

On the CPU level, we can have multiple cores, each with their own caches etc. This makes it possible to execute multiple threads at the same time, although they usually access the same main RAM, and this puts a limit on parallelism... don't expect much additional performance after 8-16 cores on the same main RAM.

On the machine level, we can use NUMA architectures. Multiple CPUs do not share RAM, but can access each other's RAM with reduced performance. If we want massive parallelism, we have to give up letting all CPUs access the same RAM at the same speed. There is a performance hit when the CPUs need to exchange lots of data, so this cannot improve the speed of everything.

On the network level, we can connect CPUs that cannot look into each other's RAM. This can be a worldwide network, but it introduces even more performance hits when exchanging data.

The main focus right now seems to be on the "2-16 core on one main RAM" level. This is not fully solved using functional programming or similar techniques. The NUMA level is completely out of focus because we don't run common operating systems that allow a multithreaded application to be distributed across several CPUs without a common main RAM.

So, when searching for a good solution to the multi-core level, always remember the other levels. It's the combination of all levels that decide the final performance.

Wednesday 7 October 2009

Gaming industry has the next generation GUIs

First, for reference, this blog post is about non-game applications.

Usability and GUI design have been hot topics since the dawn of computers. The trends have always been towards increased usability and productivity. Since the world of IT is much more complex than what can be described using graphics from outside IT, GUI design has introduced a lot of mechanisms and visualizations that you need to learn before you can use a computer. A simple "Close" button on a form is actually not very intuitive, but it is one of the first things we learn, and it's productive, so we could not imagine IT without it. I still remember once I had to teach an old guy how to use a computer. He had no clue about using a mouse, and I had to explain to him how I would use the name "button" for something on a monitor. Button was not as easy to explain as I originally thought, but try to explain to such a user how to tab through a radio button group...!! I now rarely run into people who do not use computers (!!!), but I still often run into people that are blocked from doing what they need to do on a computer, because of poor usability. Usability on computers will never be perfect, but we can do our best to improve it.

If you add 3D graphics, you can suddenly do a lot of new things:

* Moving parts of the GUI can visualize to the user, where a dialog comes from or disappears to. Think about how Windows or other systems minimize applications. You can also use it to browse in a book of multiple pages.

* Zooming into a picture, a chart or similar can be implemented by actually zooming in on the big GUI. This may not always make sense from a productivity point of view, but it certainly makes sense from a usability point of view.

* Semi-transparent layers on top of a normal GUI can be used to clearly identify a group of controls that belong together but are not located at the same place. When that semi-transparent layer moves just slightly, the human vision immediately recognizes all of the controls on that layer as belonging together. For example, when zooming in on a chart, the axis controls can be positioned on a different semi-transparent layer, so that the user knows that we're only zooming in on the chart; the axes are not zoomed, they just change their scale.

The current problem is, that most of us need to create software that works on computers without 3D acceleration, or via remote desktop solutions like VNC that do not perform well with 3D graphics. Also, 3D graphics need to be client-side in order to perform well, because server-side applications currently have bandwidth and latency problems.

However, once 3D graphics can be assumed to be available everywhere, the next problem is, how do you design GUI components for this? If you want to design a chart component that allows you to zoom into the chart by "moving towards your GUI", then your chart component needs to affect things outside the box that it was placed in. A good user experience in 3D requires a new infrastructure for writing GUI controls.

A good place to get inspired, is to look at 3D computer games. Imagine that you're writing a database application, with charts, grids, events, popups, comboboxes etc., and then start to look for useful GUI components in 3D games. Computer games experience a much higher pressure for usability, and at the same time there is a pressure for inventing new mechanisms and new paradigms. First person shooters are probably less useful to look at, but many other games are full of interesting solutions to simple usability problems.

Thursday 24 September 2009

Chrome platform: Now also on MSIE

Google is continuing its quest to make Google Chrome the next platform for application development.

The new invention is, that if you want to target the Google Chrome platform with your application, but your users don't want to use Google Chrome instead of Microsoft Internet Explorer (MSIE), you can ask the users to install Google Chrome Frame. It will not affect how MSIE works with other sites, it will only make MSIE activate Google Chrome as a plugin for your online application. This way, it basically works like a flash plugin, Microsoft Silverlight or similar.

Tuesday 22 September 2009

Cloud computing deficiencies

If you are considering the use of cloud systems, you may want to read this article about Google App Engine. The author may or may not be right, but the main problem with cloud computing is the lack of predictability and determinism.

Wednesday 16 September 2009

Delphi in a long term perspective

I've just come back from the Copenhagen conference with Jim McKeeth. The main topic was Delphi Prism, but also a bit of Delphi 2010. It was a very nice event: Jim presented the topic well, David I from Embarcadero and Marc Hoffman from RemObjects joined online, and the Danish distributor was also present with Ole from Nohau. On top of that, we celebrated the 20 year anniversary of the Danish user group, and several of the attendees have known the Delphi product range since before Borland got involved.

Looking back in history, the Delphi product line has always had ups and downs. I still remember the switch from Poly Pascal to Turbo Pascal 1.0 - it really seemed like anything but an improvement, but fortunately Borland quickly added new features that made a huge difference. The entrance into Windows was a catastrophe: Turbo Pascal for Windows was really awful (it was basically a C/C++ like solution, very unproductive). Then Delphi emerged, and everything was supercool. Similarly, and maybe predictably, the first attempts at doing Linux and Microsoft .net were awful.

Now, Embarcadero's Delphi Prism provides added value compared to Microsoft's tools, and they are able to keep up with the latest Microsoft technologies. They also realized that native Delphi is anything but dead, and have provided a roadmap that positions Delphi as the best choice for many business models.

It seems that the history has always been that it took a little while to adapt to new surroundings, but it has always been worth waiting for, and if you wrote your code nicely, it could be moved easily. I expect that we can soon take source code, written in 1982 on a CP/M computer using Zilog Z80 CPU, and run it on Google Chrome OS.

Innovation in programming languages messes up the syntax

The number of new features that go into programming languages these days is extraordinary. There is no doubt that the demand for multi-core programming requires innovation, but the widespread use of garbage collection also introduces new possibilities, like LINQ. Few programming tools introduce new methods at the same pace as Delphi Prism.

We may see a kind of survival of the fittest amongst all the methods, making some features survive and others not. I'm not sure that the parallel keyword in Delphi Prism has a great future - simply because parallelism shouldn't be done on a low level but on a high level, which is already nicely supported using anonymous methods. It's a kind of race, where errors are made, and one of the biggest errors is probably that many of the features introduce complexity in the language syntax, raising the learning barrier for new programmers. Personally, I very much dislike the use of "+=" for adding a handler in .net, simply because "+=" does not contain any explanatory information, like letters would. It gets much worse when you want to remove a handler, where you use "-= new". It is not intuitive to remove a handler by creating an object. If operators can be used for anything, why limit yourself to operators that already exist? Why not introduce a new "+==+" operator for something? It reminds me of the International Obfuscated C Code Contest. Aspect Oriented Programming, LINQ, .net lambda expressions etc. all introduce new syntax elements that don't look like things we have seen before. When some of these new features eventually become less used, we still have to support them, just like the "object" keyword is still supported by Delphi, even though OOP was changed with Delphi 1.

Anonymous methods and many of the new features are really cool, but in a few years it will likely be possible to design new, simple programming languages from scratch, which implement the most used features in a much nicer way. It may even be possible to create a low-complexity language like PHP or Python, that performs well and has most of these new features, is cross platform etc., becoming the preferred choice for new programmers. It will be interesting to see how programming language support for NUMA will evolve, when one piece of data cannot be accessed equally well by all threads in your app.

Monday 31 August 2009

Last callout for conference in Copenhagen with Jim McKeeth

The conference introduces experienced Delphi developers to Delphi Prism and a number of .NET technologies. Conference agenda here and registration here (or write to jsj@sejer.dk). The price is 4700 kr. (about €631 or $900), including a night's stay at the very nice hotel.

Wednesday 26 August 2009

Why to use Firebird+IBX

Jeff Overcash originally made it clear that IBX doesn't officially support Firebird, and that he has no intention of implementing such support. Many point at IBObjects and now DBExpress for Firebird support, so why use IBX? Here's a reason:

IBX supports Interbase 6.0, which is basically equivalent to Firebird 1.0. Firebird 2.1 supports the Firebird 1.0 API by providing a compatible gds32.dll, so Firebird 2.1 also works with IBX. IBX is part of Delphi, and has been part of Delphi for a long time, meaning that when there is a new Delphi version, IBX will most likely be included, too. This gives IBX a big advantage over IBObjects and others. I assume that IBX will still be around in 5-10 years. In case something should break with Firebird, the source code of IBX can be modified to fit.

Does this make IBX the best choice for doing Firebird with Delphi? No. But as long as it works well, there will be no need to switch, and I guess that a lot of people start using IBX simply because it is included with Delphi.

Thursday 13 August 2009

No dynamic memory in programming

The biggest benefit of Java and .net is garbage collection, because it lowers the cost of training programmers. That's what I heard originally. However, garbage collection also allows some constructs that are not easy to achieve otherwise.

The alternative? malloc() and similar, of course. You ask the operating system for a piece of memory, and you need to free it afterwards. This is usually handled by standardizing the way that memory is administered, so that it is very easy to verify that all memory is deallocated correctly. Many tools, like Delphi, preallocate memory so that multiple memory allocation requests don't trigger multiple allocation requests to the operating system.

The third model: Stack-based memory allocation. That's what is done with local variables. Every time a function is started, memory is allocated, and every time it ends, the memory is deallocated. Simple and efficient, but the stack often has a limited size.

However, there is yet another model that many programmers don't have much experience with: No dynamic memory. It's very simple: You have a program, an amount of RAM, and that's it. You cannot allocate more RAM, or less RAM. This is often preferred when there is very little RAM available, when high reliability is required, or when you need to be able to predict the behavior in all conditions. Having very little RAM is becoming a rarity, but reliability and predictability are still good reasons.

Last time I did something like this was using C on a Fujitsu CPU, and we had 32 kbyte of RAM. It was an industrial product and my own PC had many MBs of RAM, but in order to keep costs low, they didn't want to add extra RAM. The system had a normal screen, printer and keyboard. I was only writing one of the modules, so in order to ensure a good API, I decided to make it object oriented and message-based... I implemented a number of functions, where the first was a constructor, the last was a kind of destructor, and the rest were event functions, like key presses etc. However, there was no need for a pointer to the RAM, because it was preallocated and at a fixed location.

The constructor basically initialized the memory area with correct start values, ensuring a valid state, and informed the user about the activation of this module. The destructor emptied buffers and handled other cleanup of non-RAM resources. In other words, very Windows 3.0 like.
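
To make the idea concrete, here is a minimal sketch of such a module (my own illustration in Pascal, not the original C code): all state lives in one statically allocated record, and the "constructor" and "destructor" are just procedures operating on that fixed area:

type
  TModuleState = record
    Initialized:Boolean;
    KeyBuffer:array[0..15] of Char;
    KeyCount:Integer;
  end;

var
  ModuleState:TModuleState;  // allocated at compile/link time, no malloc anywhere

procedure ModuleInit;  // the "constructor": bring the fixed area into a valid state
begin
  ModuleState.KeyCount:=0;
  ModuleState.Initialized:=True;
end;

procedure ModuleKeyPress (c:Char);  // an event function
begin
  if ModuleState.KeyCount<=High(ModuleState.KeyBuffer) then begin
    ModuleState.KeyBuffer[ModuleState.KeyCount]:=c;
    inc (ModuleState.KeyCount);
  end;
  // a full buffer is handled here, by the module - "out of memory" is a message we create ourselves
end;

procedure ModuleDone;  // the "destructor": flush buffers, clean up non-RAM resources
begin
  ModuleState.KeyCount:=0;
  ModuleState.Initialized:=False;
end;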

Compared to modern applications, the limit on the available memory required the user interface to handle this well. This is actually less of a problem than many people believe, especially in embedded devices. The benefits are: Increased performance and increased reliability due to lack of memory management. "Out of memory" suddenly becomes a message that you create yourself. Compile-time range checks in a programming language can increase reliability even more, without sacrificing speed.

So, if you have plenty of RAM, why would you employ such techniques? Well, reliability and predictability may be external requirements that simply disallow dynamic behavior.

There are many programmers out there that live in an environment that does not use dynamic memory allocation.

Monday 27 July 2009

3D full screen game using O3D

I just ran across this cute little 3D game on the internet:

http://blog.largeanimal.com/demo/

Remember to click the full-screen button in the bottom left, for a full experience. You will need the O3D plugin which will be included in Google Chrome, soon.

The game seems to be entirely made using JavaScript. If you want to see how it looks using Google Chrome app mode, in Windows, create a Windows shortcut like this:

"C:\Documents and Settings\Username\Local Settings\Application Data\Google\Chrome\Application\chrome.exe" --app=http://blog.largeanimal.com/demo/

The game does not adapt to the window size and still shows some HTML stuff when not in full-screen mode, and it's also a bit slow to load compared to what we can expect in the future, but even in its current state, it makes it awfully old-fashioned to run setup programs or to use MSI files.

Buying CO2 credits doesn't make IT CO2 neutral

More and more IT companies try to become CO2 neutral by buying the CO2 credits that match the amount of power they use for the IT equipment. The argument is, that if they use 1 ton CO2 to run their servers, then they buy 1 ton CO2 credit from the market, removing 1 ton of CO2 emission elsewhere.

Nice thought, but that's not how it works. Buying CO2 credits like that just means that you need 2 tons of CO2 credits to emit 1 ton of CO2, basically increasing the price of emitting CO2. If everybody did this, CO2 emissions would be cut by 50%, but not 100%. The good thing is, that the increased price of emitting CO2 generates additional incentives for developing new energy technologies. However, the IT company still emits CO2.

What makes matters even worse is that the market economy ensures that if you can spend your money either on energy for operating equipment or on energy for manufacturing equipment, you will spend your money where you get the most value. And if the equipment is produced in one of the countries outside the CO2 market (like the USA or China), you will basically just push the problem out of the market, but not away from planet Earth.

It is good that companies try to use the CO2 emissions topic to raise their profile, but nobody gets CO2 neutral by burning coal.

Friday 24 July 2009

Delphi apps as Chrome apps?

Google is doing a lot of great stuff with Chrome these days. The first version of the browser included its own task manager, process administration, sandboxing etc., but Google has now also announced accelerated 3D support and sandboxed native code. I would not be surprised to see Google Earth become one of the first applications that stop being Windows applications and start being Google Chrome applications: install Chrome and visit a specific URL, and it works. No more "download this app" or "install this app", except for Chrome itself.

This is all great, of course, but what development tools do we use for that? C/C++ is notoriously unproductive, and so are web apps. Web 2.0 apps are even worse. We need some tool that can create cross-platform apps, delivered using Google Chrome or similar, that perform well and are easy to write. Delphi has previously shown, that it can compile to several platforms, and Delphi Prism officially targets mono, so why not take on this one?

Jim McKeeth on Delphi Prism in Copenhagen

For those who are not fortunate enough to be able to read Danish: The Danish Delphi users group has a workshop on September 15th and 16th, introducing Delphi Prism to experienced Delphi developers. I assume that members have first priority, but I also assume that Jim hasn't taught himself Danish and keeps everything in English. I will be there, too.

Sunday 19 July 2009

The difference between craftsmanship and engineering in software development

(Inspired by Jeff Atwood's latest post about Software Engineering)

Definitions:
* Craftsmanship
* engineering
* Software Engineering

Instead of commenting on Jeff's article, I would like to give a real engineer's view on software engineering. I usually say, that a craftsman can be good at creating something that is similar to something that has been created before, in good quality. An Engineer can create something that has not been done before, and predict how long it takes and predict how it will work.

If we look at software development, it is often possible to assign each part to one of two types:

* Doing something that is easy to specify and/or predict (designing a GUI etc.)
* Doing something that is not easy to specify/predict (research, unknown territory)

If things are relatively easy to specify, you can do calculations like "100 forms of n hours each = 100*n hours". This will be a calculation with some uncertainty, of course, but it gives you a good idea of the size of the project. If the tools and methods are well known, you can illustrate the predicted result to others by showing them similar applications. Good craftsmanship is needed for tasks like these, and these processes can be measured easily using metrics; quality is easy to measure and control.

If things get hard to predict and/or specify, engineering starts. Then you need models, knowledge about how things work, knowledge about many options, ideas, and all the other things that we know from traditional engineering. This is also where architecture comes in - and I prefer the definition of architecture as "The decisions that we make early".

When doing Engineering, the skills and knowledge of the central decisionmakers can make a huge difference. This is where the productivity difference between two "programmers" can become more than a factor 1000 - and where wages differ a lot. QA, QC, metrics etc. are usually difficult, and the lack of predictability can be bad for a big organization's ability to cooperate. If marketing needs to prepare a big launch, they need to know when the product is finished.

A project can choose the craftsmanship path by picking an existing development platform, so that all components of a software product can be produced by craftsmen. This is usually preferred by large organizations because it is predictable and well documented. It may also be more expensive, but in the overall budget, that is less important.

The Engineering approach makes sense, if the product's specifications are not met by any existing platform, or if the software project's costs are significant to the organization. This does not mean that engineering can make your costs lower - it means that sometimes your costs can be lower using engineering.

So, what is good engineering? Basically, making the unpredictable predictable, making things really cheap, and delivering what is wanted. What do you need to know in order to do that? My list is this:

* Knowledge about computer hardware, network, programming methods, abstraction methods
* Organizational theory, management knowledge, psychology
* Mathematics, statistics, economy
* Decision theory, risk management, training in using the scientific method
* The knowledge of the users of the product (!)

You don't need all this knowledge in one person, but the combined knowledge is needed in order to achieve good engineering.

Saturday 4 July 2009

Jeff Atwood is wrong about performance

Jeff Atwood likes referring to his blog post "Hardware is cheap, programmers are expensive", where he writes: "Given the rapid advance of Moore's Law, when does it make sense to throw hardware at a programming problem? As a general rule, I'd say almost always."

I totally disagree, of course, but here is why:

* The parts of the hardware that comply with Moore's law are usually so fast that they are not the bottleneck. I cannot remember when 100Mbps ethernet was introduced, and when they started to deploy 10Gbps networks, but many new virtual servers are limited to 10Mbps these days, and this does not smell like Moore's law. Does anybody except me miss the good old days, when new harddisks had a lower seek time than the old ones?

* If you have upgraded all your servers without consolidating, year after year, you will notice that your electricity bill is going through the roof. It's very simple: Today, you are running that 1995 app on an extremely fast server, even though it was built for a slower server. You are simply using more energy to solve the same problem, and that's why everybody tries to reduce the amount of hardware these days. Many data centers are looking into energy efficiency, and they won't just put up a new physical server because you want to save programmer wages.

* Many speed improvements are not what they seem. 100Mbps ethernet is not always 10 times faster than 10Mbps ethernet, it's more complicated than that. The 20-stage pipeline of the Pentium 4 was also not an improvement for everybody.

* Many performance problems are not related to the number of bits per second, but to the latency. Sometimes latency goes up when speed goes up - I have seen several examples of things getting slower as a result of upgrading performance. The most well known example is probably that the first iPhone used GPRS instead of 3G, but that GPRS would actually make some things faster than if Apple had implemented 3G right away.

* If programming generally disregards performance, the performance problem is not solved by improving hardware speed 10 times or 100 times. A large application that is written totally without regard for performance, can easily be more than 1000 times too slow. I have been a troubleshooter on several projects, where the application performed more than 1000 times faster when I left.

But here is the most important reason why programmers should be solving performance problems:

* It takes one programmer little time to design things well, but it takes technicians, users, other programmers, testers, etc. a lot of time to wait when the software is slow. Bad performance costs a huge amount of money.

Here are some good tips:

* Performance improvements of less than a factor 2 are often not worth spending time on. Do the math and find out how big the performance improvement factor really is, before spending time or money on solving it. If your best finding improves less than 2 times, then you need to search harder.

* Do the math on the benefits. If your turnover improves 2 times because your website becomes more responsive, the entire organization should support your efforts.

* The later you start caring about performance, the more expensive it gets.

Anyway, it seems that Jeff Atwood has realized, that he can run into problems, too.

Tuesday 30 June 2009

Delphi for iPhone

Will the next version of Delphi Prism officially support the iPhone?

* Miguel de Icaza announces MonoTouch

* MonoTouch

As far as I can see, Delphi Prism can be used to create iPhone apps, today, compiled to native ARM CPU machine code.

Wednesday 24 June 2009

Delphi is fast, very fast

Jesper Hald and others recently did a benchmark of a certain algorithm, to figure out which implementation was fastest. It evolved into a kind of competition to make the fastest algorithm to solve this problem:

Fill a data structure with 1,000,000 random, unsorted values from 0-100
Run through this data structure 200 times, counting
a) number of values = 42
b) average for all 1,000,000*200 values

The benchmark was run on a new T5800 (2GHz, 800MHz) Toshiba laptop running 32-bit Windows Vista. Nothing had been done to the Vista to make it faster or behave in a special way.

The results were interesting, and our conclusions were:

* The fastest algorithm was made in Delphi (2009 version), was reasonably easy to read, and managed to count the 42-values amongst 200,000,000 values in 55ms. That's pretty fast: 0.275 nanoseconds per value, or about 0.5 clock cycles per value.

* Some of the first attempts in C# were 30-50 times slower than the winner.

* C# and Java were generally about 1.5 times slower than Delphi. Using normal generic lists in C# would make it 13 times slower than a simple Delphi implementation with static arrays. Is this comparison fair? Well, that's how the first results were made by the various programmers.

* Using unsafe code in C# seemed obvious, but actually wasn't necessary. It was possible to make it approximately as fast in C# without going unsafe.

* Delphi was approximately the same speed inside and outside the IDE, whereas C# was almost 4-5 times slower when running the code inside the IDE.

* PHP was 1000-2500 times slower than Delphi.

* We gave up on BASH scripting because it took too long to fill the array with values (!)

Please do not generalize from this example, because there are many other things in this world than counting and averaging integer values. I'm not saying that Delphi is faster than C# or Java, and always remember, that a performance ratio of less than 2 in a specific algorithm is too little to make a difference in a large application.
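
For illustration, here is a minimal sketch (my reconstruction of the problem statement above, not the winning entry) of the kind of simple static-array implementation the comparison refers to:

const
  N=1000000;
var
  Values:array[0..N-1] of Integer;
  i,pass,Count42:Integer;
  Sum:Int64;
begin
  Randomize;
  for i:=0 to N-1 do
    Values[i]:=Random(101);     // random values 0..100
  Count42:=0;
  Sum:=0;
  for pass:=1 to 200 do
    for i:=0 to N-1 do begin
      if Values[i]=42 then inc (Count42);
      inc (Sum, Values[i]);     // Int64, since the total exceeds 32 bits
    end;
  Writeln ('Number of 42-values: ', Count42);
  Writeln ('Average: ', Sum/(200.0*N):0:4);
end;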

Friday 19 June 2009

Floating point values are evil

If you want to do comparisons like > < =, always consider converting your floating point numbers to integers and storing them in integer variables. The basic problem is that simple values like 0.1 do not exist exactly in the double type.

There are several solutions:

* Instead of showing currencies as the number of dollars, where 0.20 is 20 cents, consider registering the number of cents instead, so that $0.20 is stored as the integer value 20.

* Instead of comparing timestamps like "if a<b+1/86400", consider using a time representation in whole seconds, like on Linux, or even in milliseconds.

* Instead of storing fractions like factor:=height/width, keep storing the height and width in two integer values. If you want to calculate value:=factor*xposition, then you can convert this into an integer calculation like value:=height*xposition div width.

* Instead of adding fractions up to get an integer value, like fraction:=height/width and then "for x:=1 to width do y:=y+fraction", keep it as fractions in two integer variables: "for x:=1 to width do begin ynominator:=ynominator+height; y:=ynominator div width; end;".
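
As a minimal sketch of the last two points (the height and width values are just made up for the example), here is the integer-only version of the accumulating loop:

var
  x, y, ynumerator, height, width: Integer;
begin
  height := 300;
  width := 400;
  // Integer-only version of "y := y + height/width" per step:
  // keep the running numerator and divide only when y is needed.
  ynumerator := 0;
  for x := 1 to width do begin
    ynumerator := ynumerator + height;
    y := ynumerator div width;
    // use (x, y) here - y is exact at every step, with no rounding drift
  end;
end;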

The benefit is that your code becomes much more deterministic, debuggable and explainable. However, it does not always make sense, and sometimes the code gets harder to read - that's why I kept using the word "consider".

Sunday 14 June 2009

Do not set TStringList.sorted:=True with default comparison

Now, this is a nasty bug, especially for Delphi 2009, reported by an anonymous user in my other blog post here:

The Windows locale-aware string comparison (the CompareString API, which Delphi's AnsiCompareStr uses) in Windows XP SP3, using the Danish locale (and probably most others), thinks that 59A < 59-A < -59-A < 5-9A < 59-A - which is not even a consistent ordering.

TStringList.Find uses binary search on a sorted list of strings, which means that for 1024 items in a TStringList, it does not need to make more than 10 string comparisons in order to find the index of a specific string. This is fast, but it requires the list to be sorted in a deterministic way, and it requires the comparison function to be able to tell which direction to go to find the string. On a Danish Windows XP SP3, this code will trigger error #1 and error #3:

var
  list1: TStringList;
  idx1: Integer;
begin
  list1 := TStringList.Create;
  try
    list1.Sorted := True;
    list1.Add('59A');
    list1.Add('-59-A');
    list1.Add('5-9A');
    list1.Add('59-A');
    if list1.IndexOf('5-9A') = -1 then
      ShowMessage ('Error #1: IndexOf does not work.');
    if list1.Find('5-9A', idx1) then begin
      if list1[idx1] <> '5-9A' then
        ShowMessage ('Error #2: Find failed and found the wrong string.');
    end else
      ShowMessage ('Error #3: Find failed because it did not find the string which is present.');
  finally
    FreeAndNil (list1);
  end;
end;


Error #1 is triggered because .IndexOf uses .Find on sorted lists in Delphi 2009 - an optimization that makes the bug hit harder there than in earlier versions. CodeGear cannot fix this problem easily, because it is actually a bug in Windows.
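
If you still need a sorted TStringList with working binary search, one possible workaround (a sketch, not an official fix) is to override the comparison with a locale-independent, ordinal one:

type
  // Sorts and searches with ordinal (byte-by-byte) comparison instead of the
  // locale-aware Windows comparison, so Sorted/Find/IndexOf stay consistent.
  TOrdinalSortStringList = class(TStringList)
  protected
    function CompareStrings(const S1, S2: string): Integer; override;
  end;

function TOrdinalSortStringList.CompareStrings(const S1, S2: string): Integer;
begin
  Result := CompareStr(S1, S2); // SysUtils.CompareStr is ordinal, not locale-aware
end;

The price is that the sort order is no longer the natural one for the user's locale, so this only makes sense for lists used for lookups, not for lists shown to the user.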

You can make Windows Explorer demonstrate the same problem. Create 4 files like this:

Date              Size  Name
14-06-2009 09:00  54    -59-A
14-06-2009 08:58  0     5-9A
14-06-2009 09:01  4     59-A
14-06-2009 09:02  168   59A


Then, show the folder in Windows Explorer in Details view, and do this:

* Click the column Size, to sort by size
* Click the column Name, to sort by name.
* You can now see that the order is 5-9A, 59A, 59-A, -59-A
* Click the change date column, to sort by change date
* Click the column Name, to sort by name
* The order is now 59A, 59-A, -59-A, 5-9A

I haven't tried this on 32-bit or 64-bit Vista yet.

Wednesday 10 June 2009

The performance of the "as" operator

Some people wonder whether the "as" operator is slow or not. A programmer well known in the Danish Delphi community, Thomas Vedel, has investigated this using Delphi 2009, and I have his permission to publish the results.

Some of the instructions below (shown in blue in the original post) are only executed if the type does not match, and they are therefore not relevant for normal execution. The conclusion seems to be that the "as" operator does not slow down your application by any amount that is worth spending time on. So keep using the "as" operator!

Example 1

Main1.pas.28: TButton(Sender).Caption := 'Klik her!';
00464808 8BC2 mov eax,edx
0046480A BA24484600 mov edx,$00464824
0046480F E8E0CFFDFF call TControl.SetText


----------

Example 2

Main1.pas.28: (Sender as TButton).Caption := 'Klik her!';
0046480B 8BC3 mov eax,ebx
0046480D 8B15CC014300 mov edx,[$004301cc]
00464813 E8A0F4F9FF call @AsClass
00464818 BA30484600 mov edx,$00464830
0046481D E8D2CFFDFF call TControl.SetText

@AsClass:
00403CB8 85C0 test eax,eax
00403CBA 7416 jz $00403cd2
00403CBC 89C1 mov ecx,eax
00403CBE 8B09 mov ecx,[ecx]
00403CC0 39D1 cmp ecx,edx
00403CC2 740E jz $00403cd2
00403CC4 8B49D0 mov ecx,[ecx-$30]
00403CC7 85C9 test ecx,ecx
00403CC9 75F3 jnz $00403cbe
00403CCB B00A mov al,$0a
00403CCD E93EF4FFFF jmp Error

00403CD2 C3 ret


----------

Example 3

Main1.pas.28: if (Sender is TButton) then
0046480B 8BC3 mov eax,ebx
0046480D 8B15CC014300 mov edx,[$004301cc]
00464813 E87CF4F9FF call @IsClass
00464818 84C0 test al,al
0046481A 740C jz $00464828

Main1.pas.29: TButton(Sender).Caption := 'Klik her!';
0046481C BA38484600 mov edx,$00464838
00464821 8BC3 mov eax,ebx
00464823 E8CCCFFDFF call TControl.SetText

@IsClass:
00403C94 53 push ebx
00403C95 56 push esi
00403C96 8BF2 mov esi,edx
00403C98 8BD8 mov ebx,eax
00403C9A 85DB test ebx,ebx
00403C9C 740D jz $00403cab
00403C9E 8BD6 mov edx,esi
00403CA0 8B03 mov eax,[ebx]
00403CA2 E875000000 call TObject.InheritsFrom
00403CA7 84C0 test al,al
00403CA9 7505 jnz $00403cb0
00403CAB 33C0 xor eax,eax
00403CAD 5E pop esi
00403CAE 5B pop ebx
00403CAF C3 ret

00403CB0 B001 mov al,$01
00403CB2 5E pop esi
00403CB3 5B pop ebx
00403CB4 C3 ret

Friday 29 May 2009

Upgrading a major project to Delphi 2009

Having finished converting a major project, one that has involved a fairly large programming team for several years, to Delphi 2009, I'm now ready to blog about the experience.

If you want to estimate the amount of work involved in converting a project, note that the number of lines of code is not what matters most. What matters more is what kind of code it is, how well it is segmented, and how consistently each segment is written. Recent user interface stuff, business logic etc. is very easy to convert - I'd almost say compile & run. Some other parts are definitely not.

Our code was clearly segmented, in the sense that each unit belongs to one of these groups:

* Very old code. Some parts really had some ancient stuff in them, especially external components, but this was solved simply by renaming string=>ansistring, char=>ansichar, pchar=>pansichar, and by fixing Windows API calls to use the Ansi versions. After that, everything runs as before, but it is not unicode enabled.

* 3rd party components. We upgraded most of them, and we upgraded some old free source-code versions ourselves.

* User interface stuff. Besides upgrading some component packages, we did not have to change anything here.

* Special home-made components. These sometimes contained some optimizations, that made it necessary to adapt them to Delphi 2009, but generally, they just worked.

* Business logic. Some general things had to be changed, but there were few things and they were easy to spot and fix. It was almost like search & replace.

* Bit-manipulating units. These units need to treat each byte by itself; the usual remedy was to convert them as if they were really old code, and sometimes it was then fairly easy to unicode-enable them afterwards.

* I/O routines. They basically had to be rewritten. We have switched some of our output text files to utf-8, in order to unicode-enable them, but others we left as they were. The main problem was with code that tried to do unicode in Delphi 2006/2007 by storing utf-8 in ansistring. The solution is to remove all utf-8 conversion inside the algorithm and just apply it at the I/O point.
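
As a sketch of what "apply it at the I/O point" can look like in Delphi 2009 (the file name is made up): keep plain string inside the program and let the encoding be chosen only when reading or writing the file:

var
  lines: TStringList;
begin
  lines := TStringList.Create;
  try
    lines.Add('Øl and other non-ASCII text stays as plain string in memory');
    // The utf-8 conversion happens only here, at the I/O point
    lines.SaveToFile('output.txt', TEncoding.UTF8);
    // ...and here, when reading it back
    lines.LoadFromFile('output.txt', TEncoding.UTF8);
  finally
    lines.Free;
  end;
end;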

The hardest part was blobs. Sometimes they contain binary stuff, and sometimes they contain text. Unfortunately, .AsString was suitable for both in Delphi 2007, but now they need to be treated completely separately. The solution was to duplicate a lot of procedures, one for RawByteString and one for String, and then use the appropriate procedure on the appropriate fields.

It was not hard to do the conversion, and now we have unicode enabled a significant part of the system, with more to come, soon. However, it surely takes some refactoring experience to convert source code efficiently - if you are not good at refactoring, it will take significantly more time.

At last, I have a few tips for those, who are about to go through the same process:

* Gradually converting the code is easier than doing it all in one attempt. Make development projects that include more and more of your units, until you have converted everything.

* It is not difficult to make your code able to compile with both Delphi 2006, 2007 and Delphi 2009 at the same time. Do that, so that your team can still be productive until everything works.

* Even when you convert a full unit by making it use ansistring internally, consider using string in the interface section, so that the conversion is hidden inside the unit (see the sketch after this list). This keeps the number of special cases lower.

* Get rid of your string warnings by fixing them, not by ignoring them. Most of them are seriously easy to fix.

* Always review your changes before committing them to your source code repository, so that you're sure that you only changed what you meant to change :-)
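
Here is a minimal sketch of the tip about keeping string in the interface section while the implementation still works in ansistring (the unit and routine names are made up):

unit LegacyTextFilter;

interface

// Callers only ever see string (UnicodeString in Delphi 2009).
function FilterText(const s: string): string;

implementation

// The old byte-oriented code is unchanged and still uses AnsiString.
function OldAnsiFilter(const s: AnsiString): AnsiString;
begin
  Result := s; // imagine the pre-Delphi-2009 logic here
end;

function FilterText(const s: string): string;
begin
  // The conversions are confined to the unit boundary.
  Result := string(OldAnsiFilter(AnsiString(s)));
end;

end.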

Thursday 28 May 2009

Deterministic but automatic memory deallocation

Now that Delphi Prism and Delphi/Win32 have put less focus on source code sharing, it's time to see if memory allocation in Delphi/Win32 can be improved. This is what we often write in our code today:

var
  o: TMyObject;
begin
  o := TMyObject.Create;
  try
    o.DoSomething;
  finally
    FreeAndNil (o);
  end;
end;


How can we avoid typing so much? The obvious solutions are:

Garbage Collection: Used in Java and .net, it often makes these consume more RAM than necessary, and is generally not very predictable. Even worse, Delphi code usually does a lot in the destructors, which is not compatible with garbage collection.

C++ objects: This is basically about avoiding the use of pointers. Delphi actually supports these kinds of objects, but then you need to avoid using TObject, and that is not really a good way forward.

There's a third solution: Add a syntax, maybe a keyword, which tells the compiler that the pointer should be freeandnil'ed before the procedure exits. Here is an example where the keyword "local" has been used:

var
  o: TMyObject;
begin
  o := local TMyObject.Create;
  o.DoSomething;
end;


Here, the keyword local forces the compiler to deallocate the object before the procedure exits. Another example:

(local TMyForm.Create(nil)).ShowModal;


This would create the form, show it modally, and deallocate it again in a deterministic/plannable/non-random way.

Even adapting APIs can be done nicely:

a:=CreateChart (local CustomerStatisticsRetriever.Create (local CustomerRetriever.Create (databaseconnection)));


In this case, the CustomerRetriever provides an API for getting customer data out of the database connection. This is used by the CustomerStatisticsRetriever to provide statistics, which CreateChart() uses to create a chart. After doing this, the pointers are deallocated automatically because of the local keyword.

Possible variations on the topic include:

* Use a different syntax or a different keyword
* Deallocate automatically after the last reference to the pointer, instead of deallocating at the end of the procedure.
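
Until something like this exists in the language, a rough approximation is possible today using reference-counted interfaces. Here is a sketch - the Guard helper below is hypothetical, not an RTL feature:

type
  // When the interface reference goes out of scope, the guard is destroyed
  // and frees the guarded object - deterministic, but still automatic.
  TGuard = class(TInterfacedObject)
  private
    FObj: TObject;
  public
    constructor Create(AObj: TObject);
    destructor Destroy; override;
  end;

constructor TGuard.Create(AObj: TObject);
begin
  inherited Create;
  FObj := AObj;
end;

destructor TGuard.Destroy;
begin
  FreeAndNil (FObj);
  inherited;
end;

function Guard(AObj: TObject): IInterface;
begin
  Result := TGuard.Create(AObj);
end;

procedure Example;
var
  o: TMyObject;
  keep: IInterface;
begin
  o := TMyObject.Create;
  keep := Guard(o);   // no try..finally needed
  o.DoSomething;
end;                  // keep is released here, which frees o

It is more typing than the proposed "local" keyword, and the deallocation happens at the end of the procedure rather than after the last reference, but it removes the try..finally boilerplate.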

Thursday 21 May 2009

Anders Hejlsberg: We have huge amounts of memory and concurrent programming is an exception

See the video here:

http://channel9.msdn.com/shows/Going+Deep/Expert-to-Expert-Anders-Hejlsberg-The-Future-of-C/

It's quite amazing to hear Anders advocating bloat in a world where most of the world still has bad internet connections and hates large downloads, where mobile phones prefer 1MB applications over 10MB apps, where RAM can be a bottleneck, where battery life is always too short, and where many large organizations struggle with bad performance.

I recently had the chance to see the standard configuration for virtual servers in a large organization. In order to keep network traffic under control, the standard network adapter is limited to 10Mbit/sec.

Size matters.

Wednesday 1 April 2009

Deterministic memory management

With Java and .net, garbage collection (GC) has won a lot of followers. GC basically makes it possible to avoid some of the memory leak problems that often happen with unskilled programmers, and at the same time, it supports the current typical hardware configurations very well. When memory is needed, you don't need to search for a block, it's just grabbed.

However, since the current hardware configuration is doomed, the question is: what will the future of memory management look like? With 128 CPUs, do not expect them all to access the same RAM location equally fast.

This introduces a new concept: memory pointers that have CPU preferences. Like all other mechanisms that we have, this will be tweaked, too. Somebody will want to allocate a specific amount of memory with a specific CPU preference, and it will be implemented.

For instance, one CPU (A) has a lot of encrypted data that needs to be decrypted before it is written serially to I/O. If another CPU (B) is to write to the I/O, CPU B will most likely have the I/O buffer in its RAM. In order to reduce RAM usage, the decrypting CPUs (C) would optimally save their data directly into CPU B's RAM. This can be done in multiple ways: they can save parts of it in their own RAM and then copy that to CPU B, or they can pipe it directly to CPU B, which then saves it locally.

The piping mechanism is already implemented in hardware in several CPU architectures today - if CPU C accesses the RAM of CPU B, it writes to the RAM through CPU B, totally transparent to the programmer. In order to achieve this, the destination RAM must be allocated with preference for CPU B. If CPU C needs to allocate memory in CPU B's RAM, we have several problems:

1) Who makes sure that we don't allocate too much of CPU B's RAM? And if it happens, what should fail?
2) How does CPU C specify that the RAM should be allocated for CPU B? Using a thread ID? That may require the ability to lock a thread to a CPU.
3) How do we debug and profile this?
4) Will intra-CPU pipes be packetized, and what will the packet size be?
5) Will intra-CPU pipes be compromises between latency and bandwidth, or do we, as programmers, need to specify parameters for tweaking them?

I am quite sure that there is plenty of research going on in these topics, but from a commercial programmer's point of view, the mission is clear: We need debugging tools and programming language support. It must be very, very easy to specify how RAM is used, who owns it, CPU preferences, its lifetime etc. Since more and more RAM is used for caching, we also need support for making cache memory allocation, which can be profiled and deallocated by the OS. We need to be able to use all the available RAM for caching, cleverly split between all processes.

We need to put the programmer back in charge of memory management, and it needs to be easy.

Thursday 26 March 2009

Wikipedia's effect on software marketing

Recently I saw a presentation from a very skilled Microsoft guy about Cloud computing. He was really good at presenting the message, telling about the cloud, about Microsoft products etc., but his terminology definitions definitely differed from mine.

Immediately, I looked up cloud computing on wikipedia, which corresponded very well with my own perception of the concept. Several years ago, I was in charge of a hosting center with a cloud computing system. It was basically just a hosting system, where multiple customers shared a PC, but it was able to move customers from one computer to another when more power was necessary. It was nowhere near as smart as the systems we have today, but to the customer, it was a dynamically scalable resource provided as a service over the internet. Some may say that cloud computing requires the ability to scale above what one server can deliver, but seriously, that's just a service specification, and for many users it is irrelevant.

If you look into existing corporate data centers, you can also see other kinds of "Cloud systems" - for instance, big VMware servers running a large number of virtual servers. To the departments that buy these servers and access them via TCP/IP, it is a style of computing that is dynamically scalable, virtualized and provided as a service over the internet protocol. The users/customers don't need to have knowledge of, expertise in, or control over the technology infrastructure of the system that supports them. The VMware systems can move a virtual server from one physical server to another, moving the IP address etc., without interrupting online services and without closing TCP connections.

Does the term cloud computing describe the VMware system? Maybe. Or maybe not. But Google App Engine or Windows Azure is not a revolution - to some people it's not even new stuff. Ten years ago, Microsoft could have hijacked the term and bent it in its own favor. Today, they can bend it, but not as much as before. Online, updated definitions of terminology, like wikipedia, make this a lot more difficult. I am sure that the definition of cloud computing will change in the near future, to specifically include systems in your own data center, but the new thing is that the definition will be changed by nerds, using a common reference on wikipedia. It will not be changed by a single company's marketing department.

Sunday 22 March 2009

Use RAM disk for many small temporary files

Many systems use small temporary files to exchange information between two applications. There can be many reasons to do so: To support network file shares, to have a queue system that is not tied to any kind of special software, or simply to keep specifications simple. There are many well known examples of this, including mail systems like sendmail and postfix.

The problem is that most Windows computers use the NTFS file system, which does journaling. This means that every file that is created actually activates the physical harddisk. On a system under load, this can cause serious latency, which slows down the whole system. Unfortunately, it is not possible to turn off journaling for a single directory.

The solution? Install a RAM disk. It may take a part of your system's RAM, but it's surely extremely fast at creating and deleting files. You can get a RAM disk here. If you want to see performance numbers, see this page (use Google Translate for English version).
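
If you want to measure the effect on your own machine, here is a small sketch (the folder paths are hypothetical, and it assumes a console application) that times 1000 create/delete cycles in a given folder - run it against an NTFS folder and a RAM disk folder and compare:

uses Windows, SysUtils, Classes;

procedure TimeSmallFiles(const Dir: string);
var
  i: Integer;
  t0: Cardinal;
  sl: TStringList;
  fn: string;
begin
  sl := TStringList.Create;
  try
    sl.Text := 'small temporary payload';
    t0 := GetTickCount;
    for i := 1 to 1000 do begin
      fn := IncludeTrailingPathDelimiter(Dir) + 'tmp' + IntToStr(i) + '.txt';
      sl.SaveToFile(fn);   // create the file
      DeleteFile(fn);      // and delete it again
    end;
    Writeln(Dir, ': ', GetTickCount - t0, ' ms for 1000 create/delete cycles');
  finally
    sl.Free;
  end;
end;

// Example: TimeSmallFiles('C:\Temp'); TimeSmallFiles('R:\');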

Thursday 12 March 2009

Mark D. Hill on Amdahl's law in the Multicore Era

This is a really cool video if you're interested in multi-core CPU architectures, performance, parallel programming or just want to know what Google is doing these days.

Wednesday 11 March 2009

UTF-8 automatic detection

If you have ever worked with an environment that mixes utf-8 and the 8-bit default character set in Windows, you may have run into the desire to autodetect utf-8 text. This is actually very easy, because there are a lot of byte sequences that are illegal in utf-8 but common in other character sets.

For instance, a Danish word like "Øl" is encoded $D8 $6C using the Danish local character set. However, $D8 (binary 11011000) indicates the start of a 2-byte sequence, where the next byte must be in the range $80-$BF, which $6C is not. In other words, even this tiny 2-byte text can be clearly identified as not being utf-8.

Basically, the main method to autodetect utf-8 is to see whether the byte sequence conforms to the utf-8 way of indicating the number of bytes in a character:

* The range $80-$BF is used for bytes which are not the first in a character
* The range $C0-$DF is used for 2-byte characters
* The range $E0-$EF is used for 3-byte characters
* The range $F0-$F7 is used for 4-byte characters
* The range $F8-$FB is used for 5-byte characters
* The range $FC-$FD is used for 6-byte characters
etc.

However, there are more mechanisms that you can use:

* 5-byte and 6-byte characters are not used, even though they would be technically possible. If you encounter a structurally valid 5-byte or 6-byte combination, which is very unlikely, you can treat it as an invalid sequence.
* It is incorrect to use more bytes than necessary. For instance, if you want to encode the character 'A' (codepoint 65=$41), it is ok to encode it using 1 byte ($41) but not ok to use the overlong 2-byte form ($C1 $81).
* If your application knows that some unicode values cannot be generated by the creator of the text, you can make an application-specific exclusion of these values, too.
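
Here is a sketch of a structural check that follows the rules above - it rejects continuation bytes in lead position, the 5/6-byte forms and the overlong 2-byte lead bytes, but it does not try to catch every overlong form or check unicode range limits:

// Returns True if the bytes could be utf-8 according to the
// length-indication rules described above. Sketch only.
function LooksLikeUtf8(const s: RawByteString): Boolean;
var
  i, len, extra: Integer;
  b: Byte;
begin
  Result := False;
  len := Length(s);
  i := 1;
  while i <= len do begin
    b := Ord(s[i]);
    if b <= $7F then
      extra := 0                          // plain ASCII
    else if (b >= $C2) and (b <= $DF) then
      extra := 1                          // 2-byte character ($C0/$C1 are overlong)
    else if (b >= $E0) and (b <= $EF) then
      extra := 2                          // 3-byte character
    else if (b >= $F0) and (b <= $F7) then
      extra := 3                          // 4-byte character
    else
      Exit;                               // continuation byte in lead position,
                                          // overlong lead, or 5/6-byte form
    if i + extra > len then
      Exit;                               // truncated sequence
    while extra > 0 do begin
      Inc(i);
      if (Ord(s[i]) < $80) or (Ord(s[i]) > $BF) then
        Exit;                             // not a valid continuation byte
      Dec(extra);
    end;
    Inc(i);
  end;
  Result := True;
end;

For the "Øl" example above, the function exits at the second byte, because $6C is not in the $80-$BF continuation range, so the text is correctly classified as not being utf-8.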

One of the things that makes this autodetection so beautiful is that it works with almost 100% accuracy. In Denmark, we use Windows-1252, which is related to ANSI, ISO-8859-1 and ISO-8859-15:

* Byte values in the $00-$7F range can be detected as either ansi or utf-8, and it doesn't matter, because these two character encoding standards are identical for these values.
* The $80-$BF range contains a lot of strange symbols not used inside words, and the $C0-$F6 range contains almost only letters. In other words, in order to have an ansi text with a valid non-ascii utf-8 byte sequence, you would need to have a strange letter followed by the right number of symbols.
* $C0-$DF range: The most likely would be a 2-byte sequence, that starts at the end of an uppercase word with an Æ or an Ø, followed by a sign like "™", something like "TRÆ™". The last two bytes would be encoded $C6 $99 in ANSI, which is a valid utf-8 combination with the unicode value $0199. However, this is a "Latin small letter k with hook", which in most Danish applications is not a valid character. This way, this text can be rejected as being utf-8 with great certainty.
* $E0-$F7 range: Here it gets very unlikely to get the right utf-8 byte sequences, but even if it happens, the encoded value would probably end up being regarded as illegal in the specific application. Remember, many applications only accept text, that can be converted to the local 8-bit character set, because it is integrated with other systems or needs to be able to save all files in 8-bit character set text files or databases.

Tuesday 10 March 2009

Avoid deadlocks in Firebird for concurrent updates

This is a small tip that can be used when you want separate processes to write to the same records in the database without deadlocks. Example:

b:=StartTransaction;
UpdateDataInTable (b);
Commit (b);

If two processes try to do this at the same time, the second process will detect a possible deadlock on the update, and will wait until the first process completes its transaction. When the first transaction commits, the second raises a deadlock exception. Fortunately, there is a simple solution:

a:=StartTransaction;
UpdateDummyData (a);
b:=StartTransaction;
UpdateDataInTable (b);
Commit (b);
Rollback (a);

If two processes do this at the same time, the "a" transaction will make the second process wait until the first one completes. Because "a" is rolled back, this is purely a wait, and will not throw any exception. This way, the two "b" transactions will not be active at the same time.

Sunday 8 March 2009

Latency is increasing

Once internet connection latencies were above 100ms. ISDN brought it down (for me) to about 30ms, and ADSL got it below 10ms. Now, ADSL2 is becoming more widely deployed, introducing 20ms latency, and more and more homes are being equipped with 3G/wifi gateways, eliminating the copper wire, and bringing latency above 100ms again. I guess it is obvious, that this seriously impacts how to create applications.

Even worse, I have seen 10Gigabit/sec WAN connections with 10-20ms latency, which replace local <1ms ethernet networks when datacenters are consolidated. This seriously impacts the number of network requests your application can do.

This also affects user experience in a negative way - so why is latency increasing? My guess is that new technologies and the increased demand for skilled IT workers mean that low latency comes at a cost. The solution is to specify latency requirements, and to put numbers on the costs of high latency.

Saturday 7 March 2009

The coming bloat explosion

I see a future of bloat:

* Cloud systems and virtualization on very powerful servers can be exploited to shorten the time to market, at the cost of bloat.

* Windows 7 may finally make Windows XP users upgrade, many to 64-bit machines, enabling a significant growth in amount of RAM in an average PC. This will make it more acceptable that software uses more RAM.

During the last 28 years, my experience has been that user interfaces have had roughly the same response times all along. Sometimes a bit slower, sometimes a bit faster. I guess the amount of bloat always adapts itself to a compromise between user requirements and development costs. There are currently plenty of mechanisms that favor increased bloat:

* Many multicore/multitasking techniques involve more layers, more complexity, more redundant/copied/cached data, more bloat.

* The world of programming gets more complex, and young programmers increasingly use very complex building blocks without fully realizing how they work.

* One of the most common ways to exploit multi-core CPUs, is to do work in a background thread.

Basically, there will be more bloat, because it pays. However, it also means that the amount of optimization that can be done to code will increase. One of the few mechanisms to reduce bloat is battery power, because increased battery lifetime is a sales parameter that justifies increased development costs. The problem here is: how do you create a mobile platform where the 3rd party applications use less battery? Applications like JoikuSpot can quickly drain a good battery. It seems that Apple has done a good job with the iPhone, where you can use a large number of applications easily without losing all your battery power. However, Apple's methods also have drawbacks, as described in this post.

I wouldn't be surprised if we see per-application power consumption measurement built into mobile platforms in the future. Imagine a message like "This application has used 10% of your battery power during the last 15 minutes. If it continues to run, your phone will be out of power in about 60 minutes. Do you want to terminate this application?"

Thursday 5 March 2009

Environment-adapted software development

Normally, when you create software, there are 4 parameters that you can specify:

* Functionality
* Price
* Deadline
* Quality

In a competitive market, the customer can only specify three of these parameters. If the functionality is defined, the deadline is defined, the quality is defined, then the price can go up sharply, and I guess everybody knows examples of this.

As Microsoft themselves once described, the trick is to be flexible; the official advice was to cut functionality if the deadline approaches more quickly than you expected. However, you get much better planning if you realize that you need flexibility before signing a contract. Maybe you even have a fixed budget and just want to maximize the value of the product?

One of the custom projects that we did recently for a customer was to create a piece of software where the specs were a bit unclear, but the purpose was clear. Quality was well defined, the deadline was defined, the budget was defined, but the functionality was not. The result was an innovative process: we created new ideas, and the end result was better than expected.

Friday 20 February 2009

Good blog post about performance - also applies to Delphi

Delphi has great flexibility in how databases are accessed and handled, and therefore often performs really, really well. However, the urge to improve is never-ending, and here is a great article for those who really want to know about performance:

http://highscalability.com/numbers-everyone-should-know

The article focuses on web apps, but actually applies to almost any kind of application.

Saturday 7 February 2009

Windows Reversi - playing online against an algorithm?

Can somebody please explain to me why the opponents in the game of Reversi in Windows XP are almost always Greek beginners, German intermediates or English experts? To me, they seem to be computer algorithms. If I am right, I see two possible explanations: 1) There are not enough Windows users to support online games like this. 2) Humans don't behave according to Microsoft quality standards. I tried to Google this, but didn't find a good explanation. If somebody knows the Greek guy, please tell him that there are books out there that can help him improve.

Friday 6 February 2009

Mixing waterfall and agile

Here is a typical story of waterfall vs. Agile: http://dotnet.dzone.com/news/we-dont-have-requirements-yet. If you ever run into such a situation, I can recommend to read Agile Estimating and Planning. It will solve your problem.

Thursday 5 February 2009

Parallelizing can make things slower.

I wrote the text below on another blog about parallelism, 4 months ago. It is a hypothetical example, but based on a true story, with a very good point. I guess we all know the feeling of clicking the start button in Windows after booting, and nothing happens for several seconds...

"You want data to be sorted. You may start out with a standard sort algorithm. Suddenly you find, that it is slow. You find out, that you can increase the speed 5 times by replacing quicksort with mergesort. Then, you find out, that you can increase the speed 10 times more by using an implementation, that knows how to use the CPU cache well. Then, you realize, that Mergesort can use more CPUs, so you use more CPUs. Suddenly, your application is 100 times slower, because the writes to memory from one CPU is flushing the cache from the other CPU. This didn't happen on your development system, because the dual-CPU system was made in a different way... so you realize that you need to make the parallel programming differently.

Finally, you find that you need to separate the lists, that you merge in your mergesort algorithm, better. After a while, your algorithm works well on your development system and on the target system.

Finally satisfied, you put the parallel mergesort algorithm into all places where it makes sense. After a short while, the test department comes back and reports that the application runs slower when your multithreaded sort algorithm is used. The reason is that it now uses multiple CPU caches, leaving less cache memory for other parts of the application, and slowing those parts down by more than the speed gained by using multiple CPUs for sorting.

You are finally asked to remove the parallel processing from your mergesort in order to increase the speed of the system."

Garbage collection or garbage piling up?

If you haven't tried .net or java, yet, you may want to have a look at this example of how garbage collection doesn't free your mind from thinking about memory allocation and object destruction.

Thursday 15 January 2009

Multi-core now, NUMA next

Quad-core PCs are in the shops, but as Sandia reports, there is a limit to how much we can grow performance by adding more cores.

The next bottleneck is memory - and the simple way to solve that problem is to split the memory, giving each CPU its own memory. This adds a new parameter to allocated memory: Which thread does this memory belong to? Actually, the technology already exists. Windows and Linux both support Non-Uniform Memory Access, NUMA. It is typically used in virtualization hosts in data centers, for instance using VMware ESX Server. If you haven't heard about it, you may want to prepare yourself for the future.

Friday 9 January 2009

Virtualization gives configurable risk profiles

As everybody knows, the risk of failure can be reduced using redundancy, so redundancy must be good. Also, redundancy is expensive, but fewer hardware boxes with lots of redundancy built-in means lower costs and less risk at the same time, right? Now, here is the problem: if you virtualize 100 servers on one system, and the system fails, you have 100 virtual servers that fail.

Most organizations can survive the failure of one system. If a postal company cannot bill large customers for packages, but can bill everyone else, they still have cash flow until the problem is solved. But what happens if all systems fail at the same time?

Different organizations need different risk profiles. As software developers, we can use virtualization to help with this. If our software supports several departments, we can choose how things should fail. If a storage system fails, should all departments lose access to the system (single point of failure = cheap), or just some of them (more expensive)? If the customer wants everything to be as cheap as possible, just put all the software on one physical server using virtualization. If the customer wants the system to be very stable, use several servers in a way that keeps employees productive even when some servers fail.

If it is too expensive to use multiple servers, use virtual servers on different host hardware, maybe even in different hosting centers. You can use 10% of one physical host in 10 different centers, and in the end, the costs are the same as using one physical host in one center.

The Google and Microsoft cloud systems try to solve this problem, too, but until further notice, virtual servers are the most advanced technology available in most organizations for solving this problem. What we can do as software developers is to design our systems so that they work well, both distributed across several hosting centers and on one physical server. Also, our systems must work in loosely coupled ways that make it possible to keep the organization running for a while without a specific subsystem.