Thursday, 28 May 2009

Deterministic but automatic memory deallocation

Now that Delphi Prism and Delphi/Win32 have put less focus on source code sharing, it's time to see if memory allocation in Delphi/Win32 can be improved. This is what we often write in our code today:

var o:TMyObject;
begin
  o:=TMyObject.Create;
  try
    o.DoSomething;
  finally
    FreeAndNil (o);
  end;
end;


How can we avoid typing so much? The obvious solutions are:

Garbage Collection: Used in Java and .NET, it often makes programs consume more RAM than necessary, and is generally not very predictable. Even worse, Delphi code usually does a lot in the destructors, which is not compatible with garbage collection.

C++ objects: This is basically about avoiding the use of pointers. Delphi actually supports these kinds of objects, but then you need to avoid using TObject, and that is not really a good way forward.

There's a third solution: Add a syntax, maybe a keyword, which tells the compiler that the pointer should be freeandnil'ed before the procedure exits. Here is an example where the keyword "local" has been used:

var o:TMyObject;
begin
  o:=local TMyObject.Create;
  o.DoSomething;
end;


Here, the keyword local forces the compiler to deallocate the object before the function finishes. Another example:

(local TMyForm.Create(nil)).ShowModal;


This would create the form, show it modally, and deallocate it again in a deterministic/plannable/non-random way.

Even adapting APIs can be done nicely:

a:=CreateChart (local CustomerStatisticsRetriever.Create (local CustomerRetriever.Create (databaseconnection)));


In this case, the CustomerRetriever provides an API for getting customer data out of the database connection. This is used by the CustomerStatisticsRetriever to provide statistics, which CreateChart() uses to create a chart. After doing this, the pointers are deallocated automatically because of the local keyword.
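For comparison, here is a rough sketch of what that one-liner would correspond to in today's Delphi, with one try..finally per allocation. The two class bodies, the string standing in for the database connection, and the string returned by CreateChart are placeholders invented for this sketch; only the names come from the example above.

```pascal
program LocalExpansion;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$mode delphi}{$ENDIF}

uses
  SysUtils;

type
  // Placeholder stand-ins for the classes named in the post.
  CustomerRetriever = class
  private
    FConnection: string;
  public
    constructor Create(const AConnection: string);
  end;

  CustomerStatisticsRetriever = class
  private
    FRetriever: CustomerRetriever;
  public
    constructor Create(ARetriever: CustomerRetriever);
  end;

constructor CustomerRetriever.Create(const AConnection: string);
begin
  inherited Create;
  FConnection := AConnection;
end;

constructor CustomerStatisticsRetriever.Create(ARetriever: CustomerRetriever);
begin
  inherited Create;
  FRetriever := ARetriever;
end;

// Stub: a real CreateChart would build a chart from the statistics.
function CreateChart(AStats: CustomerStatisticsRetriever): string;
begin
  Result := 'chart';
end;

var
  retriever: CustomerRetriever;
  stats: CustomerStatisticsRetriever;
  a: string;
begin
  // Each "local" allocation becomes a variable plus a try..finally block.
  retriever := CustomerRetriever.Create('databaseconnection');
  try
    stats := CustomerStatisticsRetriever.Create(retriever);
    try
      a := CreateChart(stats);
    finally
      FreeAndNil(stats);
    end;
  finally
    FreeAndNil(retriever);
  end;
  WriteLn(a);
end.
```

The objects are released in reverse order of creation, which is presumably what a compiler implementing the keyword would have to generate as well.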

Possible variations on the topic include:

* Use a different syntax or a different keyword
* Deallocate automatically after the last reference to the pointer, instead of deallocating at the end of the procedure.

9 comments:

ajasja said...

A similar idea has been discussed before (and is also easy to implement without a keyword using interfaces).
http://www.deltics.co.nz/blog/?p=391
http://www.deltics.co.nz/blog/?p=412
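
A minimal sketch of the general interface trick, not taken from the linked posts: a reference-counted guard object owns the instance and frees it when the guard's interface reference goes out of scope. TGuard and Demo are hypothetical names chosen for this illustration.

```pascal
program GuardDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$mode delphi}{$ENDIF}

uses
  Classes;

type
  // Hypothetical helper: frees the guarded object when the guard is released.
  TGuard = class(TInterfacedObject)
  private
    FTarget: TObject;
  public
    constructor Create(ATarget: TObject);
    destructor Destroy; override;
  end;

constructor TGuard.Create(ATarget: TObject);
begin
  inherited Create;
  FTarget := ATarget;
end;

destructor TGuard.Destroy;
begin
  FTarget.Free;
  inherited;
end;

procedure Demo;
var
  sl: TStringList;
  guard: IInterface;
begin
  sl := TStringList.Create;
  // The interface variable keeps the guard alive until Demo exits.
  guard := TGuard.Create(sl);
  sl.Add('hello');
  WriteLn(sl.Count);
end; // guard is released here, which destroys TGuard and frees sl

begin
  Demo;
end.
```

This still needs an extra variable and an extra line per object, so it saves the try..finally block but not the declaration.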

Anonymous said...

Of course this only helps with local variables.

An advantage of Garbage Collection is that you do not (or hardly) have to think about it. With your solution the developer still has to decide if it is safe to do so and mistakes can be made easily.

And most of the code in destructors nowadays is clean-up code. This code can be removed altogether when using Garbage Collection. Of course not all destructors can be removed, like destructors that free 'unmanaged' resources such as a file, database or window handle.

Using the reference counting of Interfaces has the problem of cross references. A references B and B references A. Both are referenced so both do not get cleaned up. They will get cleaned up when using Garbage Collection if no one else is referencing either A or B.
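
The cross-reference problem can be reproduced with a small sketch. INode, TNode, MakePair and the LiveNodes counter are hypothetical names used only for this illustration; the counter tracks how many instances have been created and not yet destroyed.

```pascal
program CycleLeak;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$mode delphi}{$ENDIF}

type
  INode = interface
    procedure SetPeer(const APeer: INode);
  end;

  TNode = class(TInterfacedObject, INode)
  private
    FPeer: INode;
  public
    constructor Create;
    destructor Destroy; override;
    procedure SetPeer(const APeer: INode);
  end;

var
  LiveNodes: Integer = 0;

constructor TNode.Create;
begin
  inherited Create;
  Inc(LiveNodes);
end;

destructor TNode.Destroy;
begin
  Dec(LiveNodes);
  inherited;
end;

procedure TNode.SetPeer(const APeer: INode);
begin
  FPeer := APeer;
end;

procedure MakePair(ALink: Boolean);
var
  a, b: INode;
begin
  a := TNode.Create;
  b := TNode.Create;
  if ALink then
  begin
    // A references B and B references A.
    a.SetPeer(b);
    b.SetPeer(a);
  end;
end; // a and b are released here; a linked pair keeps itself alive

begin
  MakePair(False);
  WriteLn('unlinked pair, live nodes after scope exit: ', LiveNodes);
  MakePair(True);
  WriteLn('linked pair, live nodes after scope exit: ', LiveNodes);
end.
```

With plain reference counting the first line should report 0 and the second 2, because each linked node still holds one reference to the other after the local variables are gone.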

Lars D said...

Thank you very much for the links.

Both links recommend putting the mechanism on the variable.

However, there are several reasons why the mechanisms must be applied to the allocation and not the variable:

1) Local variables are not always initialized.

2) When you want to inspect the source code, to see if memory allocation is done properly, you look at the allocation code, and not at the variable definition.

3) Actions belong between begin...end, and not in the var declaration.

4) Automatic deallocation is also needed for values, which are not assigned to variables.

Lars D said...

Garbage collection has many problems: Almost all Java or .NET programmers that I know have had problems with memory allocation. Typically, their app allocates too much RAM, causing Windows to swap file write buffers to disk and similar stupid stuff.

Robert Love said...

What if I did something like this...

var
  o : TObject;
begin
  o := local TObject.Create;
  GlobalList.Add(o);
end;

The GlobalList now holds a reference to an object that has already been freed, so it is invalid.

It would be very difficult (if not impossible) to walk a syntax tree at compile time and determine if a reference to the object was held longer than the scope of the calling method. So the only other approach would be to have a run time error at the end of the method because a reference was still held to it.

With interfaces you have reference counting, which would solve this problem, as you don't care if something held on to the object as it will be freed when no longer used.

Also, with a new keyword in the way you specify, you could not have a method called "local" or whatever the new keyword was called. It would break existing code.

I understand the hope here, but the suggested solution presents too many problems.

I don't believe the cost to implement is worth what we get out of it.

Personally I think a better solution would be reference counted objects or records, and/or optional garbage collection on specific types such as records.

Lars D said...

I wasn't aware that "local" is a reserved keyword - my Delphi 2009 definitely does not recognize it as being one.

Your example code is buggy: You specify that an object is to be destroyed and then add it to a list that is meant to persist. You can already make that error today.

Reference counting could be another solution, but it does not work with circular references.

The local keyword solution basically removes 4-5 lines of code by adding one word to one line. Since it is 100% bidirectionally mappable to existing Delphi code, it is guaranteed to work. The only question is, whether it solves enough problems.

Jolyon Smith said...

Hi Lars,

My previously suggested variation on this theme (recycling the automated keyword and decorating the variable declaration, rather than the allocation) has some additional benefits:

Primarily it has the benefit of simply gifting an object reference variable behaviour similar to what an interface reference variable already enjoys, i.e. it's easily understood.

Secondly it avoids problems where people might re-use a local variable in a procedure but forget to "declare" the behaviour for a given allocation:

a := local TStringList.Create;

// do work with a, then re-use it

a := TStringList.Create;

// This time you still have to
// remember to free "a" !!


Also since it is declarative it can be applied to member variable declarations too, addressing the "this only works for local variables" problem.

And whilst it's true that local variables are not always initialised, *interface* references are, and decorating a variable declaration in this way would instruct the compiler to treat it in the same way, i.e. initialise it to NIL.

Lars D said...

Initializing local variables to nil/zero/whatever is really needed, because in some cases, the compiler has no idea whether they get initialized or not.

I focus a lot on the ability to visually inspect source code and see that it conforms with nice behavior. It adds an extra layer of security and prevents bugs. In this regard, I actually don't care about having the keyword on the variable or on the value, but there are many, many cases where you would want to avoid creating a variable.

Why create a variable, when all you want is to do a .ShowModal on a dynamically generated form?

procedure TMainForm.OnButtonClick (...);
begin
  (local TMyForm.Create(nil)).ShowModal;
end;

The purpose is to write less, so adding 2 extra lines for variable declaration and initialization seems not to achieve the goal.

Andrea Raimondi said...

I usually use a different approach:

type
  TMemoryManager = class
  private
    FInternalList : TObjectList;
  public
    function CreateSL : TStringList;
    function CreateMemStream : TMemoryStream;
    function CreateObjClass( AClass : TClass) : TObject;
  end;

Then use OwnsObjects = True in the constructor, place an instance on a DataModule (I use heaps of those), create it in OnCreate and free it in OnDestroy.

That's it.

Andrew
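
One possible way to fill in the class outlined above, as a sketch only: the method bodies, the constructor and the destructor are assumptions, and a plain try..finally in a console program stands in for the DataModule's OnCreate and OnDestroy.

```pascal
program OwnerListDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$mode delphi}{$ENDIF}

uses
  Classes, Contnrs;

type
  TMemoryManager = class
  private
    FInternalList: TObjectList;
  public
    constructor Create;
    destructor Destroy; override;
    function CreateSL: TStringList;
    function CreateMemStream: TMemoryStream;
    function CreateObjClass(AClass: TClass): TObject;
  end;

constructor TMemoryManager.Create;
begin
  inherited Create;
  // OwnsObjects = True: the list frees its items when it is destroyed.
  FInternalList := TObjectList.Create(True);
end;

destructor TMemoryManager.Destroy;
begin
  FInternalList.Free;
  inherited;
end;

function TMemoryManager.CreateSL: TStringList;
begin
  Result := TStringList.Create;
  FInternalList.Add(Result);
end;

function TMemoryManager.CreateMemStream: TMemoryStream;
begin
  Result := TMemoryStream.Create;
  FInternalList.Add(Result);
end;

function TMemoryManager.CreateObjClass(AClass: TClass): TObject;
begin
  // Caveat: TObject.Create is not virtual, so constructors declared in
  // descendant classes are not run when creating through a plain TClass.
  Result := AClass.Create;
  FInternalList.Add(Result);
end;

var
  mm: TMemoryManager;
  sl: TStringList;
begin
  mm := TMemoryManager.Create;   // OnCreate in the DataModule version
  try
    sl := mm.CreateSL;
    sl.Add('one');
    WriteLn(sl.Count);
  finally
    mm.Free;                     // OnDestroy: frees every object handed out
  end;
end.
```

Objects created this way live as long as the manager does, so the deallocation is deterministic but tied to the owner's lifetime rather than to the end of the procedure.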