Saturday, 8 November 2008

Working with Delphi 2009

My main work is with Delphi 2006, but more and more of our source code now compiles with Delphi 2009, and I have also created some tools with Delphi 2009.

Converting existing source code is extremely simple, as long as the code is cleanly written and deals with databases, user interfaces and the like. Direct Windows API calls need to be checked, however, especially where you have an "array of char" and pass sizeof(array) as a parameter, or similar constructs: char is now widechar, so sizeof(array) is no longer the character count. You will often find this kind of code in 3rd party components or in small code snippets that you get from other people. Sometimes you can fix things by replacing char with ansichar; sometimes you want to use the Unicode API and therefore need to replace sizeof(array) with length(array).
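As a minimal sketch of the sizeof/length point, here is a buffer passed to GetWindowsDirectory, which expects its size in characters. The helper name WindowsDir is made up for this example; the API call and MAX_PATH come from the Windows unit.

```pascal
program BufferSizeDemo;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;

// Hypothetical helper, not from any library: returns the Windows directory.
function WindowsDir: string;
var
  Buf: array[0..MAX_PATH] of Char; // Char is WideChar in Delphi 2009
  Len: Cardinal;
begin
  // The API wants the buffer size in characters.
  // SizeOf(Buf) is the size in bytes, which is now twice the capacity.
  Len := GetWindowsDirectory(Buf, Length(Buf));
  // Copy exactly Len characters out of the buffer.
  SetString(Result, Buf, Len);
end;

begin
  Writeln(WindowsDir);
end.
```

With Length(Buf) the same source compiles correctly in both Delphi 2006 and Delphi 2009, because the element count does not depend on the size of Char.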

CodeGear has done a lot to make simple I/O simple. Many things just work, but some things don't. TStream.Write (str[1],length(str)) will fail, because the second parameter is the byte count, so only half of the string gets written. Rewrite it to TStream.Write (str[1], sizeof(char)*length(str)) or make str an ansistring. In other words, you will need to fix Windows API calls and advanced I/O.
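A small sketch of the byte-count fix, using a TMemoryStream so the result can be inspected. BytesWritten is a name invented for this example, and the empty-string guard is an addition of mine, since str[1] is not valid for an empty string.

```pascal
program StreamWriteDemo;
{$APPTYPE CONSOLE}
uses
  Classes, SysUtils;

// Hypothetical helper: writes S to a memory stream and
// returns how many bytes ended up in the stream.
function BytesWritten(const S: string): Int64;
var
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  try
    // Guard against S[1] on an empty string.
    if S <> '' then
      // The second parameter is a byte count, not a character count.
      MS.Write(S[1], Length(S) * SizeOf(Char));
    Result := MS.Size;
  finally
    MS.Free;
  end;
end;

begin
  // 5 characters, each SizeOf(Char) bytes wide.
  Writeln(BytesWritten('Hello'));
end.
```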

If you have previously created an application that handled Unicode, you may have stored UTF-8 encoded text in TStrings objects, and used various character sets in various parts of your program. You can still do that, but some functions now only work with string and not ansistring, and the easiest solution is often to make everything use Unicode internally, and to convert to and from Unicode at I/O and API boundaries. Fixing all this feels really good: it's like cleaning up your desk, and the source code gets simpler.
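One way to keep the conversion at the I/O boundary might look like the sketch below: the program works with string everywhere and only turns it into UTF-8 bytes when writing to a stream. SaveAsUtf8, LoadFromUtf8 and RoundTrip are hypothetical names for this example; UTF8Encode and UTF8ToString are the RTL functions in Delphi 2009.

```pascal
program Utf8BoundaryDemo;
{$APPTYPE CONSOLE}
uses
  Classes, SysUtils;

// Hypothetical helper: encode to UTF-8 only at the moment of writing.
procedure SaveAsUtf8(Stream: TStream; const S: string);
var
  Raw: UTF8String;
begin
  Raw := UTF8Encode(S);
  if Raw <> '' then
    Stream.WriteBuffer(Raw[1], Length(Raw)); // UTF8String: 1 byte per element
end;

// Hypothetical helper: read the rest of the stream and decode it.
function LoadFromUtf8(Stream: TStream): string;
var
  Raw: UTF8String;
begin
  SetLength(Raw, Integer(Stream.Size - Stream.Position));
  if Raw <> '' then
    Stream.ReadBuffer(Raw[1], Length(Raw));
  Result := UTF8ToString(Raw);
end;

// Writes S as UTF-8, reads it back, and returns the decoded text.
function RoundTrip(const S: string): string;
var
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  try
    SaveAsUtf8(MS, S);
    MS.Position := 0;
    Result := LoadFromUtf8(MS);
  finally
    MS.Free;
  end;
end;

begin
  // #$00E9 is e with acute accent, written as a character code
  // so the example does not depend on the source file encoding.
  Writeln(RoundTrip('caf'#$00E9) = 'caf'#$00E9);
end.
```

The rest of the program never sees a UTF8String, so no implicit conversions can sneak in.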

You may think: I can just replace all "string" with "ansistring". No, you cannot. Many functions like copy(), insert() etc. will no longer work with ansistring. Delphi will convert your ansistring to string first, assuming that your ansistring contains text in the local character set, and that assumption is not always correct.

Also, there is one special character that has stopped working in string types: #0

If you assign s:='Hello'#0'World', then s will contain 'Hello'. The reason is that string is now strictly for text purposes, and #0 is not text - it's a binary code. This was probably the trickiest problem that I have encountered, because I had to convert some code that did this. Fortunately, it only took about 5 minutes to fix. If you're searching for a replacement code, consider #12 (form feed). I don't think anybody uses that code today, and it gets converted nicely between character sets. Note, however, that TCharacter.IsWhitespace() will treat it as whitespace, and that #12 is not the end of a PChar string, in case you're doing complicated byte gymnastics.

It's not plug & play to use old source code in Delphi 2009, but conversion is fairly easy, and normally you will not need to be familiar with the source code in order to convert it. This is good, because it means that you can easily take another person's source code and convert it.

When writing new apps in Delphi 2009, it feels really, really good. The quick startup feels good, text and raw binary data are automatically separated into different datatypes, and mixing these up unintentionally gives nice warnings. It feels as if Delphi is helping you more now than before. If you need advanced ways to store data, the new Generics features can make the source code much more readable than in previous versions. Simply declare a specialization of TList from the Generics.Collections unit:

type
  TPairTitleId = class
    ComboBoxTitle: string;
    DatabaseId: integer;
  end;
  TPairTitleIdList = Generics.Collections.TList<TPairTitleId>;


Now you can refer to things like:

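// Declared as a procedure, because it does not return a value.
procedure dummy(list: TPairTitleIdList);
var
  i: integer;
begin
  // Items[3] is already typed as TPairTitleId, so no cast is needed.
  i := list.Items[3].DatabaseId;
end;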


It is a significant productivity enhancement if used wisely. There is no need to use the familiar "TStrings.Objects[i] as TMyClass" any more.
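For a slightly fuller picture, here is a sketch that fills such a list and looks up an id by title. It swaps in TObjectList from the same unit instead of TList, which is my choice for the example so that the list frees its items; LookupId and DemoLookup are hypothetical names.

```pascal
program PairListDemo;
{$APPTYPE CONSOLE}
uses
  Generics.Collections;

type
  TPairTitleId = class
    ComboBoxTitle: string;
    DatabaseId: Integer;
  end;

// Hypothetical helper: returns the id for Title, or -1 if it is not in the list.
function LookupId(List: TList<TPairTitleId>; const Title: string): Integer;
var
  Pair: TPairTitleId;
begin
  Result := -1;
  for Pair in List do // items are typed, no "as TPairTitleId" needed
    if Pair.ComboBoxTitle = Title then
    begin
      Result := Pair.DatabaseId;
      Exit;
    end;
end;

// Builds a two-item list and looks up the second entry.
function DemoLookup: Integer;
var
  List: TObjectList<TPairTitleId>;
  Pair: TPairTitleId;
begin
  List := TObjectList<TPairTitleId>.Create; // owns and frees its items by default
  try
    Pair := TPairTitleId.Create;
    Pair.ComboBoxTitle := 'First';
    Pair.DatabaseId := 7;
    List.Add(Pair);

    Pair := TPairTitleId.Create;
    Pair.ComboBoxTitle := 'Second';
    Pair.DatabaseId := 42;
    List.Add(Pair);

    Result := LookupId(List, 'Second');
  finally
    List.Free;
  end;
end;

begin
  Writeln(DemoLookup);
end.
```

Because TObjectList descends from TList, it can be passed anywhere the plain generic list is expected.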

Delphi 2009 is a significant step forward. It does everything your old Delphi does, but significantly better and easier.

5 comments:

Andreas Hausladen said...

> If you assign s:='Hello'#0'World',
> then s will contain 'Hello'

Can you show me an example, because I cannot reproduce this.

Lars D said...

Sorry, it seems that I made a mistake with #0. I only have one example of source code where #0 introduced erroneous behavior in Delphi 2009 (dxgettext), and I quickly fixed that by replacing #0 with a more suitable byte value.

PhiS said...

Dear Lars,

>Many functions like copy(), insert() etc. will no longer work with ansistring.

That does not seem to be correct, even though consulting the help would make it appear so.

I checked the following code using some standard string manipulations:

procedure TestStrings;
var a1,a2,a3,a4:ansistring; l:longword;
begin
a1:='ABC';
a2:='DEF';
a3:=a1+a2;
a4:=concat (a1,a2);
a3:=copy(a4,2,4);
insert (a1,a3,2);
insert ('ABC',a3,2);
delete(a3,2,1);
l:= length(a4);
setlength (a4,l);
Showmessage (a3);
end;

Not only does this code compile in Delphi 2009, but also if you compile this in Delphi 2006 and 2009 and compare the assembly code in CPU debug view, you will see that the code generated is almost identical (although I admit I haven't checked the called internal functions).

For example, insert will call @LStrInsert, and copy will call @LStrCopy in both D2006 and D2009, whereas for the same code compiled for unicodestrings @UStrInsert and @UStrCopy will be called.

The differences I found are the following (showing D2009 code):

(1) l:=length(a4);
mov eax, [ebp-$10]
test eax,eax //D2009 only
jz @1 //D2009 only
mov edx,eax //D2009 only
sub edx,$0a //D2009 only
cmp word ptr [edx],$01 //D2009 only
jz @1 //D2009 only
lea eax, [ebp-$10] //D2009 only
xor ecx,ecx //D2009 only
mov edx,[ebp-$10] //D2009 only
call @InternalLStrFromUStr //D2009 only
@1:
test eax,eax
jz @2
sub eax,$04
mov eax,[eax]
@2:

(2) setlength(a4, l);
lea edx,[ebp-$10]
xor ecx,ecx //D2009 only
call @LStrSetLength

(3) Showmessage (a3);
inserts a conversion call to @UStrFromLStr, and here the compiler actually tells you about the implicit string cast from AnsiString to String.

So, in summary, most of the "old" string operations on ansistring seem to work fine, and the major difference in the above code seems to be a small block inserted in the code for Length().

Lars D said...

You're right... it seems that it is Delphi itself that simply doesn't inform me of the abilities. When you look at the possible parameters for insert() etc., this is what you get:

http://download.daintel.com/quickdownload/?show=1&key=1226406168-HGLBDPXEWVCGDCXCSTDZNIQCGJXRH

As you can see, Delphi only mentions "string" as a possible parameter, not "ansistring".

If you look at MidStr() (from StrUtils unit), then Delphi 2009SP1 reports 2 possibilities: AnsiString and WideString. No UnicodeString/string!

I guess you found a hidden feature :-)

Lars D said...

I made a new article: http://compaspascal.blogspot.com/2008/11/corrections-to-working-with-delphi-2009.html