Saturday, October 11, 2008

Delphi 2009: Array is dead. Long live Array

I first learned the concept of an array in high school mathematics.  An array has dimensions: 1D, 2D and so on.  BASIC, the first programming language I learned, also has arrays.  The mathematical concept and the programming concept match perfectly.  However, arrays were always subject to memory restrictions in the 8-bit and 16-bit world.

I started to ignore arrays after I learned object-oriented programming and design patterns.  There were always ready-made classes like TList or TCollection for me to use in the OO world.  For years the OOP concept poisoned me into thinking that I should always code in an OO way.  The TList class can do more than an array, and I almost forgot that Object Pascal still has arrays.

I started migrating my Delphi 2007 code to Delphi 2009 when it was launched.  The biggest headache in the migration is the Unicode conversion, for the reasons below:

  1. Some 3rd party components aren't ready for Delphi 2009 yet and some are still in beta.  Even where the component makers claim they are ready, the releases are still new and I don't have confidence in them yet.
  2. My application's current persistent storage (database or resource) is in ANSI format.  I need some buffer time before I port it to Unicode.

I will stay on Delphi 2007 for a while until I am ready to release an application compiled with Delphi 2009.

At this stage, I will revise all of my code that isn't compatible with Delphi 2009 and amend it so that it compiles with both Delphi 2007 and Delphi 2009.

I have some classes that perform Base16 and Base64 encoding.  These classes use string as internal storage.  That becomes a problem in Delphi 2009, where string is made of WideChar (2 bytes per character), and the effect avalanches because other parts of the source use the encoding classes.  I revised the code to use TBytes (array of byte) as internal storage.

A new problem arose: my lack of solid array knowledge slowed down my day-to-day coding.  My knowledge of arrays had not moved on since high school.  I re-studied the array constructs in the Delphi documentation to strengthen my understanding.  Some of it I already knew, some I didn't.  Below are the results of my study.

Static Array

A static array has a fixed size, with its memory allocated at compile time.  For example,

var A, B: array[1..10] of integer;

defines A and B as integer arrays of 10 elements each.

To initialize a static array:

var C: array[1..3] of integer = (1, 2, 3);

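  • Length(A) returns 10
  • SizeOf(A) returns 40 (10 elements * 4 bytes per integer)
  • Low(A) returns 1
  • High(A) returns 10
  • FillChar(A, SizeOf(A), 0) or
    FillChar(A[1], SizeOf(A), 0)
    will set all elements to 0.  FillChar works on bytes, so the count is SizeOf(A), not Length(A), and a fill value such as 77 sets every byte (not every integer) to 77
  • Move(A, B, SizeOf(A)) or
    Move(A[1], B[1], SizeOf(A))
    will copy all elements from A to B.  Move also counts bytes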
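The static array operations above can be sketched as a small console program.  This is only an illustration under my own assumptions: the program name is made up, Integer is taken to be 4 bytes as in Delphi 2007/2009, and the byte counts are passed as SizeOf(A) because FillChar and Move both work in bytes rather than elements.

```pascal
program StaticArrayDemo;

{$APPTYPE CONSOLE}

var
  A, B: array[1..10] of Integer;
  I: Integer;
begin
  WriteLn('Length(A) = ', Length(A));  // number of elements
  WriteLn('SizeOf(A) = ', SizeOf(A));  // bytes: 40 when Integer is 4 bytes
  WriteLn('Low(A)    = ', Low(A));
  WriteLn('High(A)   = ', High(A));

  // FillChar counts bytes, so SizeOf(A) covers the whole array.
  FillChar(A, SizeOf(A), 0);

  // Storing a non-zero value such as 77 is done element by element,
  // because FillChar would write 77 into every byte instead.
  for I := Low(A) to High(A) do
    A[I] := 77;

  // Move also counts bytes; A and B have the same type and size.
  Move(A, B, SizeOf(A));

  WriteLn('B[1]  = ', B[1]);
  WriteLn('B[10] = ', B[10]);
end.
```

For an array of Byte, Length(A) and SizeOf(A) happen to be equal, which is why passing Length(A) looks correct there but fails for wider element types.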

Dynamic Array

As its name implies, the size of a dynamic array is not fixed at compile time.  It is determined at runtime.  Thus, some operations behave differently from a static array:

var A: array of integer;

A is a pointer to the array's memory storage.  Thus, when applying any operation to a dynamic array, always treat it as a pointer; this reduces confusion with operations like FillChar or Move.

Use SetLength to allocate memory storage for dynamic array:

SetLength(A, 10)

The good news is that the system manages the dynamic array's storage.  There is no need to free the storage allocated via SetLength.

To initialize a dynamic array (undocumented feature):

type
  TDynIntegerArray = array of integer;

var C: TDynIntegerArray;
begin
  C := TDynIntegerArray.Create(1, 2, 3, 4);
end;

  • Length(A) returns 10
  • SizeOf(A) returns 4, same as SizeOf(Pointer).  To get the physical storage size of the array, use Length(A) * SizeOf(Integer)
  • Low(A) returns 0 (a dynamic array always starts from 0)
  • High(A) returns 9
  • FillChar(A[0], Length(A) * SizeOf(Integer), 0)
    will set all elements to 0 but
    FillChar(A, Length(A) * SizeOf(Integer), 0)
    will overwrite the pointer itself and lead to unexpected results.
  • Move(A[0], B[0], Length(A) * SizeOf(Integer))
    will copy all elements from A to B (B being another dynamic array already sized with SetLength) but
    Move(A, B, Length(A) * SizeOf(Integer))
    will lead to unexpected results.
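Here is a minimal sketch of the dynamic array operations above.  The program name and the second array B are my own additions for illustration.  Note that the byte count passed to FillChar and Move is Length(A) * SizeOf(Integer), and that the first element A[0] is passed rather than A, since A itself is only the pointer.

```pascal
program DynArrayDemo;

{$APPTYPE CONSOLE}

type
  TDynIntegerArray = array of Integer;

var
  A, B: TDynIntegerArray;
  I: Integer;
begin
  SetLength(A, 10);
  SetLength(B, Length(A));  // the destination must be allocated before Move

  // Pass the first element, not the array variable, and count in bytes.
  FillChar(A[0], Length(A) * SizeOf(Integer), 0);

  for I := Low(A) to High(A) do
    A[I] := I * I;

  Move(A[0], B[0], Length(A) * SizeOf(Integer));

  WriteLn('SizeOf(A) = ', SizeOf(A));  // size of the reference, not the data
  WriteLn('Length(A) = ', Length(A));
  WriteLn('High(A)   = ', High(A));
  WriteLn('B[9]      = ', B[9]);       // 81, copied from A[9]
end.
```

No cleanup code is needed at the end; the storage of A and B is released by the system.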

Array assignment

Arrays are assignment-compatible only if they are of the same type. Because the Delphi language uses name-equivalence for types, the following code will not compile:

var A: array[1..10] of integer;
    B: array[1..10] of integer;
    C: array of integer;
    D: array of integer;

begin
  B := A;
  D := C;
end;

To make it work, either do this:

var A, B: array[1..10] of integer;
    C, D: array of integer;
begin
  B := A;
  D := C;

end;

or

type TIntegerArray = array[1..10] of integer;
     TDynIntegerArray = array of integer;

var A: TIntegerArray;
    B: TIntegerArray;
    C: TDynIntegerArray;
    D: TDynIntegerArray;
begin
  B := A;
  D := C;
end;

Because a static array has pre-allocated memory storage, a static array assignment copies the data; no sharing or copy-on-write is involved.  In the example above, B receives a copy of every element value from A.  Changing any B[i] has no effect on A.  A and B have two independent storage areas.

Unlike a static array, a dynamic array reference is a pointer.  Thus, for dynamic arrays A and B, B := A makes B point to A's storage area.  A's storage area is now B's storage area as well.  Changing A[i] is reflected immediately in B[i] and vice versa, because both share the same storage.  Dynamic arrays are not copy-on-write, so to give B its own copy of the data, use the Copy function:

B := Copy(A, 0, Length(A));

There is no need to allocate storage for B using SetLength prior to Copy.  Now A and B have two distinct storage areas.  Changing B[i] or A[i] does not affect the other.
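The difference between plain assignment and Copy for dynamic arrays can be seen in a short sketch.  The program name and the third array C are hypothetical, added only to show both behaviours side by side.

```pascal
program DynArrayCopyDemo;

{$APPTYPE CONSOLE}

type
  TDynIntegerArray = array of Integer;

var
  A, B, C: TDynIntegerArray;
begin
  SetLength(A, 3);
  A[0] := 1;
  A[1] := 2;
  A[2] := 3;

  B := A;                      // B refers to the same storage as A
  C := Copy(A, 0, Length(A));  // C gets its own storage, no SetLength needed

  A[0] := 100;

  WriteLn('A[0] = ', A[0]);  // 100
  WriteLn('B[0] = ', B[0]);  // 100, because B shares storage with A
  WriteLn('C[0] = ', C[0]);  // 1, because C is an independent copy
end.
```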

Open Array Parameters

An open array is neither a static nor a dynamic array.  It is used as a parameter type in procedures and functions.

Unfortunately, it has the same syntax as a dynamic array, which often causes confusion.  An open array parameter must always be declared as "array of <baseType>".  If we declare a type for it, it is no longer an open array.

This is an open array parameter:

procedure MyProc(A: array of integer);
begin
end;

This is not an open array parameter (it is a dynamic array parameter):

type
  TIntegerArray = array of integer;

procedure MyProc(A: TIntegerArray);
begin
end;

Open arrays have several rules:

  • They are always zero-based. The first element is 0, the second element is 1, and so forth. The standard Low and High functions return 0 and Length - 1, respectively. The SizeOf function returns the size of the actual array passed to the routine.
  • They can be accessed by element only. Assignments to an entire open array parameter are not allowed.
  • They can be passed to other procedures and functions only as open array parameters or untyped var parameters. They cannot be passed to SetLength.
  • Instead of an array, you can pass a variable of the open array parameter's base type. It will be treated as an array of length 1.

For example,

procedure MyProc(A: array of integer);
begin
  WriteLn('SizeOf(A)=', SizeOf(A));
  WriteLn('Length(A)=', Length(A));
end;

var A: array[0..9] of integer;
begin
  MyProc(A);
end;

The output is

  • SizeOf(A)=40
  • Length(A)=10

We can also pass a single variable to an open array parameter of a routine; it will be treated as a single-element array:

var i: integer;
begin
  MyProc(i);
end;

The output is

  • SizeOf(A)=4
  • Length(A)=1
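To tie the open array rules together, here is a small sketch.  SumOf is a hypothetical helper of my own, not something from the Delphi library.  It walks the parameter from Low to High, so the same routine accepts a static array, a dynamic array, a single variable and an open array constructor.

```pascal
program OpenArrayDemo;

{$APPTYPE CONSOLE}

// SumOf is a made-up helper that adds up every element it receives.
function SumOf(const Values: array of Integer): Integer;
var
  I: Integer;
begin
  Result := 0;
  // Inside the routine the parameter is always zero-based,
  // whatever the index range of the array that was passed in.
  for I := Low(Values) to High(Values) do
    Result := Result + Values[I];
end;

var
  S: array[1..3] of Integer = (10, 20, 30);
  D: array of Integer;
  N: Integer;
begin
  SetLength(D, 2);
  D[0] := 4;
  D[1] := 5;
  N := 7;

  WriteLn(SumOf(S));          // static array: 60
  WriteLn(SumOf(D));          // dynamic array: 9
  WriteLn(SumOf(N));          // single variable, treated as one element: 7
  WriteLn(SumOf([1, 2, 3]));  // open array constructor: 6
end.
```

Using Low and High instead of hard-coded bounds keeps the routine correct even though S is declared with a 1-based index range.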

Conclusion

For simple constructs, using an array is efficient and easy.  It consumes fewer system resources than the collection classes.
