RAD Studio (Common)
Internal Data Formats

The following topics describe the internal formats of Delphi data types.

The format of an integer-type variable depends on its minimum and maximum bounds.

  • If both bounds are within the range -128..127 (Shortint), the variable is stored as a signed byte.
  • If both bounds are within the range 0..255 (Byte), the variable is stored as an unsigned byte.
  • If both bounds are within the range -32768..32767 (Smallint), the variable is stored as a signed word.
  • If both bounds are within the range 0..65535 (Word), the variable is stored as an unsigned word.
  • If both bounds are within the range -2147483648..2147483647 (Longint), the variable is stored as a signed double word.
  • If both bounds are within the range 0..4294967295 (Longword), the variable is stored as an unsigned double word.
  • Otherwise, the variable is stored as a signed quadruple word (Int64).
Note: a "word" occupies two bytes.
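
The storage rules for integer types can be observed with SizeOf. The following is a minimal sketch; the subrange type names are hypothetical and were chosen only to fall into different storage classes.

```pascal
program SubrangeSizes;
{$APPTYPE CONSOLE}

type
  // Hypothetical subrange types, one per storage class of interest.
  TPercent = 0..100;       // both bounds within 0..255
  TOffset  = -100..100;    // both bounds within -128..127
  TYear    = 0..9999;      // both bounds fit in a word
  TBig     = 0..100000;    // needs a double word

begin
  Writeln(SizeOf(TPercent));   // 1
  Writeln(SizeOf(TOffset));    // 1
  Writeln(SizeOf(TYear));      // 2
  Writeln(SizeOf(TBig));       // 4
  Writeln(SizeOf(Int64));      // 8
end.
```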

On the Win32 platform, an AnsiChar, or a subrange of an AnsiChar type, is stored as an unsigned byte. A WideChar is stored as an unsigned word.

A Boolean type is stored as a Byte, a ByteBool is stored as a Byte, a WordBool type is stored as a Word, and a LongBool is stored as a Longint.

A Boolean can assume the values 0 (False) and 1 (True). ByteBool, WordBool, and LongBool types can assume the values 0 (False) or nonzero (True).
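
As an illustration of the sizes and values described above, the sketch below prints the size of each Boolean type and shows that a LongBool holding an arbitrary nonzero value tests as True. The variable typecast on the left side of the assignment is used only to place a raw value into the variable.

```pascal
program BoolSizes;
{$APPTYPE CONSOLE}

var
  L: LongBool;

begin
  Writeln(SizeOf(Boolean), ' ', SizeOf(ByteBool), ' ',
          SizeOf(WordBool), ' ', SizeOf(LongBool));   // 1 1 2 4
  Writeln(Ord(False), ' ', Ord(True));                // 0 1

  Longint(L) := 2;   // store a nonzero value other than 1
  if L then
    Writeln('nonzero LongBool is True');
end.
```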

An enumerated type is stored as an unsigned byte if the enumeration has no more than 256 values and the type was declared in the {$Z1} state (the default). If an enumerated type has more than 256 values, or if the type was declared in the {$Z2} state, it is stored as an unsigned word. If an enumerated type is declared in the {$Z4} state, it is stored as an unsigned double-word.
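
The effect of the $Z directive can be checked with SizeOf. This is a minimal sketch; the enumerated type names are hypothetical, and each type is declared under a different $Z state.

```pascal
program EnumSizes;
{$APPTYPE CONSOLE}

{$Z1}
type
  TSmallEnum = (seA, seB);   // declared in the {$Z1} state
{$Z2}
type
  TWordEnum = (weA, weB);    // declared in the {$Z2} state
{$Z4}
type
  TLongEnum = (leA, leB);    // declared in the {$Z4} state

begin
  Writeln(SizeOf(TSmallEnum), ' ', SizeOf(TWordEnum), ' ',
          SizeOf(TLongEnum));   // 1 2 4
end.
```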

The real types store the binary representation of a sign (+ or -), an exponent, and a significand. A real value has the form

+/- significand * 2^exponent

where the significand has a single bit to the left of the binary point. (That is, 0 <= significand < 2.)

In the figures that follow, the most significant bit is always on the left and the least significant bit on the right. The numbers at the top indicate the width (in bits) of each field, with the left-most items stored at the highest addresses. For example, for a Real48 value, e is stored in the first byte, f in the following five bytes, and s in the most significant bit of the last byte.

The Real48 type

On the Win32 platform, a 6-byte (48-bit) Real48 number is divided into three fields:

Width (bits)   1    39   8
Field          s    f    e

If 0 < e <= 255, the value v of the number is given by

v = (-1)^s * 2^(e-129) * (1.f)

If e = 0, then v = 0. 

The Real48 type cannot store denormals, NaNs, or infinities. Denormals become zero when stored in a Real48, while NaNs and infinities produce an overflow error if an attempt is made to store them in a Real48.
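
The Real48 layout can be inspected by copying a value into a byte array. The sketch below assumes the Win32 platform, where Real48 is available, and reads only the exponent byte and the sign bit. For -1.0 the formula above gives s = 1, e = 129, and f = 0.

```pascal
program Real48Fields;
{$APPTYPE CONSOLE}

var
  R: Real48;
  B: array[0..5] of Byte;   // raw bytes, lowest address first
  E, S: Byte;

begin
  R := -1.0;
  Move(R, B, SizeOf(R));
  E := B[0];          // the exponent e occupies the first byte
  S := B[5] shr 7;    // the sign s is the top bit of the last byte
  Writeln('e = ', E, ', s = ', S);   // e = 129, s = 1
end.
```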

The Single type

A 4-byte (32-bit) Single number is divided into three fields:

Width (bits)   1    8    23
Field          s    e    f

The value v of the number is given by

if 0 < e < 255, then v = (-1)^s * 2^(e-127) * (1.f)

if e = 0 and f <> 0, then v = (-1)^s * 2^(-126) * (0.f)

if e = 0 and f = 0, then v = (-1)^s * 0

if e = 255 and f = 0, then v = (-1)^s * Inf

if e = 255 and f <> 0, then v is a NaN
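
The three fields of a Single can be extracted with shifts and masks. In the sketch below, SplitSingle is a hypothetical helper, not an RTL routine. The value -2.5 equals -1.25 * 2^1, so the expected fields are s = 1, e = 128, and f = 0.25 * 2^23 = 2097152.

```pascal
program SingleFields;
{$APPTYPE CONSOLE}

// Hypothetical helper: splits a Single into sign, exponent, and fraction.
procedure SplitSingle(Value: Single; out S, E, F: Cardinal);
var
  Bits: Cardinal;
begin
  Move(Value, Bits, SizeOf(Bits));   // reinterpret the 4 bytes as an integer
  S := Bits shr 31;                  // 1-bit sign
  E := (Bits shr 23) and $FF;        // 8-bit biased exponent
  F := Bits and $7FFFFF;             // 23-bit fraction
end;

var
  S, E, F: Cardinal;

begin
  SplitSingle(-2.5, S, E, F);
  Writeln('s = ', S, ', e = ', E, ', f = ', F);   // s = 1, e = 128, f = 2097152
end.
```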

The Double type

An 8-byte (64-bit) Double number is divided into three fields:

Width (bits)   1    11   52
Field          s    e    f

The value v of the number is given by

if 0 < e < 2047, then v = (-1)^s * 2^(e-1023) * (1.f)

if e = 0 and f <> 0, then v = (-1)^s * 2^(-1022) * (0.f)

if e = 0 and f = 0, then v = (-1)^s * 0

if e = 2047 and f = 0, then v = (-1)^s * Inf

if e = 2047 and f <> 0, then v is a NaN

The Extended type

A 10-byte (80-bit) Extended number is divided into four fields:

Width (bits)   1    15   1    63
Field          s    e    i    f

The value v of the number is given by

if 0 <= e < 32767, then v = (-1)^s * 2^(e-16383) * (i.f)

if e = 32767 and f = 0, then v = (-1)^s * Inf

if e = 32767 and f <> 0, then v is a NaN

The Comp type

An 8-byte (64-bit) Comp number is stored as a signed 64-bit integer.

The Currency type

An 8-byte (64-bit) Currency number is stored as a scaled and signed 64-bit integer with the four least-significant digits implicitly representing four decimal places.
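
The scaling can be made visible by copying a Currency value into an Int64. This is a minimal sketch: the value 12.3456 is stored as the integer 123456, that is, the value multiplied by 10000.

```pascal
program CurrencyRaw;
{$APPTYPE CONSOLE}

var
  C: Currency;
  Raw: Int64;

begin
  C := 12.3456;
  Move(C, Raw, SizeOf(Raw));   // reinterpret the 8 bytes as a signed integer
  Writeln(Raw);                // 123456
end.
```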

A Pointer type is stored in 4 bytes as a 32-bit address. The pointer value nil is stored as zero.

A short string occupies as many bytes as its maximum length plus one. The first byte contains the current dynamic length of the string, and the following bytes contain the characters of the string.

The length byte and the characters are considered unsigned values. The maximum short string length is 255 characters plus a length byte (string[255]).
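
The short string layout can be checked directly, because element 0 of a short string is its length byte. A minimal sketch:

```pascal
program ShortStringLayout;
{$APPTYPE CONSOLE}

var
  S: string[10];   // maximum length 10, so 11 bytes of storage

begin
  S := 'abc';
  Writeln(SizeOf(S));    // 11: maximum length plus the length byte
  Writeln(Ord(S[0]));    // 3: the length byte holds the current length
  Writeln(Length(S));    // 3
end.
```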

A string variable of type UnicodeString or AnsiString occupies four bytes of memory which contain a pointer to a dynamically allocated string. When a string variable is empty (contains a zero-length string), the string pointer is nil and no dynamic memory is associated with the string variable. For a nonempty string value, the string pointer points to a dynamically allocated block of memory that contains the string value in addition to information describing the string. The table below shows the layout of a long-string memory block.  

String dynamic memory layout (Win32 only)  

Offset                 Contents
-12                    16-bit codepage of string data
-10                    16-bit element size of string data
-8                     32-bit reference-count
-4                     length in characters
0..Length - 1          character string of element sized data
Length*Element Size    NULL character

The NULL character at the end of a string memory block is automatically maintained by the compiler and the built-in string handling routines. This makes it possible to typecast a string directly to a null-terminated string. 

For string constants and literals, the compiler generates a memory block with the same layout as a dynamically allocated string, but with a reference count of -1. When a string variable is assigned a string constant, the string pointer is assigned the address of the memory block generated for the string constant. The built-in string handling routines know not to attempt to modify blocks that have a reference count of -1.
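
For illustration only, the header fields can be read by stepping back from the string pointer. The sketch below assumes the Win32 layout in the table above. TStrHeader and HeaderOf are hypothetical names, not RTL declarations, and production code should not depend on this layout. An AnsiString is used so that the element size is one byte.

```pascal
program StringHeader;
{$APPTYPE CONSOLE}

type
  // Hypothetical record mirroring the 12-byte Win32 header described above.
  PStrHeader = ^TStrHeader;
  TStrHeader = packed record
    CodePage: Word;      // offset -12
    ElemSize: Word;      // offset -10
    RefCount: Longint;   // offset -8
    Len: Longint;        // offset -4
  end;

// Hypothetical helper: S must not be empty (an empty string is a nil pointer).
function HeaderOf(const S: AnsiString): PStrHeader;
begin
  Result := PStrHeader(PAnsiChar(Pointer(S)) - SizeOf(TStrHeader));
end;

var
  S: AnsiString;

begin
  S := 'abc';          // S points at the block generated for the constant
  Writeln('constant ref count = ', HeaderOf(S)^.RefCount);

  UniqueString(S);     // force a private, dynamically allocated copy
  Writeln('copy ref count     = ', HeaderOf(S)^.RefCount);
  Writeln('element size       = ', HeaderOf(S)^.ElemSize);
  Writeln('length             = ', HeaderOf(S)^.Len);
end.
```

Under the stated assumptions, the first line would be expected to report -1, and the copy would be expected to report a reference count of 1, an element size of 1, and a length of 3.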

On Win32, a wide string variable occupies four bytes of memory which contain a pointer to a dynamically allocated string. When a wide string variable is empty (contains a zero-length string), the string pointer is nil and no dynamic memory is associated with the string variable. For a nonempty string value, the string pointer points to a dynamically allocated block of memory that contains the string value in addition to a 32-bit length indicator. The table below shows the layout of a wide string memory block on Windows.  

Wide string dynamic memory layout (Win32 only)  

Offset           Contents
-4               32-bit length indicator (in bytes)
0..Length - 1    character string
Length           NULL character

The string length is the number of bytes, so it is twice the number of wide characters contained in the string. 

The NULL character at the end of a wide string memory block is automatically maintained by the compiler and the built-in string handling routines. This makes it possible to typecast a wide string directly to a null-terminated string.

A set is a bit array where each bit indicates whether an element is in the set or not. The maximum number of elements in a set is 256, so a set never occupies more than 32 bytes. The number of bytes occupied by a particular set is equal to (Max div 8) - (Min div 8) + 1, where Max and Min are the upper and lower bounds of the base type of the set. The byte number of a specific element E is (E div 8) - (Min div 8) and the bit number within that byte is E mod 8, where E denotes the ordinal value of the element. When possible, the compiler stores sets in CPU registers, but a set always resides in memory if it is larger than the generic Integer type or if the program contains code that takes the address of the set.
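
The set formulas can be worked through with plain integer arithmetic. In the sketch below, SetByteSize, ElementByte, and ElementBit are hypothetical helpers that only evaluate the expressions from the text; they do not query the compiler.

```pascal
program SetLayout;
{$APPTYPE CONSOLE}

// Bytes occupied by a set whose base type has the bounds Min..Max.
function SetByteSize(Min, Max: Integer): Integer;
begin
  Result := (Max div 8) - (Min div 8) + 1;
end;

// Byte number, within the set, that holds element E.
function ElementByte(E, Min: Integer): Integer;
begin
  Result := (E div 8) - (Min div 8);
end;

// Bit number of element E within its byte.
function ElementBit(E: Integer): Integer;
begin
  Result := E mod 8;
end;

begin
  Writeln(SetByteSize(0, 255));    // 32
  Writeln(SetByteSize(10, 20));    // 2
  Writeln(ElementByte(17, 10), ' ', ElementBit(17));   // 1 1
end.
```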

On the Win32 platform, a static array is stored as a contiguous sequence of variables of the component type of the array. The components with the lowest indexes are stored at the lowest memory addresses. A multidimensional array is stored with the rightmost dimension increasing first.
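
The storage order of a multidimensional array can be seen by copying it into a flat array. A minimal sketch, in which each element is given a value that encodes its two indexes:

```pascal
program ArrayOrder;
{$APPTYPE CONSOLE}

var
  A: array[0..1, 0..2] of Byte;
  Flat: array[0..5] of Byte;
  I, J: Integer;

begin
  for I := 0 to 1 do
    for J := 0 to 2 do
      A[I, J] := I * 10 + J;    // value encodes the (I, J) position

  Move(A, Flat, SizeOf(A));     // copy the raw bytes in address order
  for I := 0 to 5 do
    Write(Flat[I], ' ');        // 0 1 2 10 11 12
  Writeln;
end.
```

The rightmost index J varies fastest in memory, which matches the description above.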

On the Win32 platform, a dynamic-array variable occupies four bytes of memory which contain a pointer to the dynamically allocated array. When the variable is empty (uninitialized) or holds a zero-length array, the pointer is nil and no dynamic memory is associated with the variable. For a nonempty array, the variable points to a dynamically allocated block of memory that contains the array in addition to a 32-bit length indicator and a 32-bit reference count. The table below shows the layout of a dynamic-array memory block.  

Dynamic array memory layout (Win32 only)  

Offset                                Contents
-8                                    32-bit reference-count
-4                                    32-bit length indicator (number of elements)
0..Length * (size of element) - 1     array elements
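
As an illustration of this layout, the two header fields can be read by stepping back eight bytes from the array pointer. The sketch assumes Win32; TDynArrayHeader is a hypothetical record, not an RTL declaration, and Length and the other built-in routines remain the supported way to obtain this information.

```pascal
program DynArrayHeader;
{$APPTYPE CONSOLE}

type
  // Hypothetical record mirroring the 8-byte Win32 header described above.
  PDynArrayHeader = ^TDynArrayHeader;
  TDynArrayHeader = packed record
    RefCount: Longint;   // offset -8
    Len: Longint;        // offset -4, number of elements
  end;

var
  A: array of Integer;
  H: PDynArrayHeader;

begin
  SetLength(A, 5);       // nonempty, so the variable holds a non-nil pointer
  H := PDynArrayHeader(PAnsiChar(Pointer(A)) - SizeOf(TDynArrayHeader));
  Writeln('ref count = ', H^.RefCount);
  Writeln('length    = ', H^.Len);
end.
```

Under these assumptions, the length field would be expected to match Length(A), which is 5 here.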

When a record type is declared in the {$A+} state (the default), and when the declaration does not include a packed modifier, the type is an unpacked record type, and the fields of the record are aligned for efficient access by the CPU. The alignment is controlled by the type of each field and by whether fields are declared together. Every data type has an inherent alignment, which is automatically computed by the compiler. The alignment can be 1, 2, 4, or 8, and represents the byte boundary that a value of the type must be stored on to provide the most efficient access. The table below lists the alignments for all data types.  

Type alignment masks (Win32 only)  

Type                  Alignment
Ordinal types         size of the type (1, 2, 4, or 8)
Real types            2 for Real48, 4 for Single, 8 for Double and Extended
Short string types    1
Array types           same as the element type of the array
Record types          the largest alignment of the fields in the record
Set types             size of the type if 1, 2, or 4, otherwise 1
All other types       determined by the $A directive

To ensure proper alignment of the fields in an unpacked record type, the compiler inserts an unused byte before fields with an alignment of 2, and up to three unused bytes before fields with an alignment of 4, if required. Finally, the compiler rounds the total size of the record upward to the byte boundary specified by the largest alignment of any of the fields. 

If two fields share a common type specification, they are packed even if the declaration does not include the packed modifier and the record type is not declared in the {$A-} state. Thus, for example, given the following declaration

type
  TMyRecord = record
    A, B: Extended;  
    C: Extended;
  end;

A and B are packed (aligned on byte boundaries) because they share the same type specification. The compiler pads the structure with unused bytes to ensure that C appears on a quadword boundary. 

When a record type is declared in the {$A-} state, or when the declaration includes the packed modifier, the fields of the record are not aligned, but are instead assigned consecutive offsets. The total size of such a packed record is simply the size of all the fields. Because data alignment can change, it's a good idea to pack any record structure that you intend to write to disk or pass in memory to another module compiled using a different version of the compiler.
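
The difference between unpacked and packed records shows up in SizeOf. The sketch below uses two hypothetical record types with identical fields and assumes the default {$A+} state for the unpacked one.

```pascal
program RecordSizes;
{$APPTYPE CONSOLE}
{$A+}

type
  // Unpacked: three unused bytes are inserted so Value is 4-byte aligned.
  TUnpackedRec = record
    Flag: Byte;
    Value: Longint;
  end;

  // Packed: fields are assigned consecutive offsets with no padding.
  TPackedRec = packed record
    Flag: Byte;
    Value: Longint;
  end;

begin
  Writeln(SizeOf(TUnpackedRec));   // 8
  Writeln(SizeOf(TPackedRec));     // 5
end.
```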

On the Win32 platform, file types are represented as records. Typed files and untyped files occupy 592 bytes, which are laid out as follows:

type
  TFileRec = packed record
    Handle: Integer;
    Mode: Word;
    Flags: Word;
    case Byte of
      0: (RecSize: Cardinal);
      1: (BufSize: Cardinal;
          BufPos: Cardinal;
          BufEnd: Cardinal;
          BufPtr: PChar;
          OpenFunc: Pointer;
          InOutFunc: Pointer;
          FlushFunc: Pointer;
          CloseFunc: Pointer;
          UserData: array[1..32] of Byte;
          Name: array[0..259] of Char);
  end;

Text files occupy 848 bytes, which are laid out as follows:

type
  TTextBuf = array[0..127] of Char;
  TTextRec = packed record
    Handle: Integer;
    Mode: Word;
    Flags: Word;
    BufSize: Cardinal;
    BufPos: Cardinal;
    BufEnd: Cardinal;
    BufPtr: PChar;
    OpenFunc: Pointer;
    InOutFunc: Pointer;
    FlushFunc: Pointer;
    CloseFunc: Pointer;
    UserData: array[1..32] of Byte;
    Name: array[0..259] of Char;
    Buffer: TTextBuf;
  end;

Handle contains the file's handle (when the file is open). 

The Mode field can assume one of the values

const
  fmClosed = $D7B0;
  fmInput  = $D7B1;
  fmOutput = $D7B2;
  fmInOut  = $D7B3;

where fmClosed indicates that the file is closed, fmInput and fmOutput indicate a text file that has been reset (fmInput) or rewritten (fmOutput), and fmInOut indicates a typed or untyped file that has been reset or rewritten. Any other value indicates that the file variable is not assigned (and hence not initialized).
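
The Mode field can be examined by typecasting a file variable to the matching record type. The sketch below assumes that TTextRec and the fm constants are visible through the System and SysUtils units; the file name is a hypothetical placeholder, and the file is never opened.

```pascal
program FileModeDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  F: TextFile;

begin
  AssignFile(F, 'example.txt');   // assigns a name only; nothing is opened
  if TTextRec(F).Mode = fmClosed then
    Writeln('file variable is assigned but closed');
end.
```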

The UserData field is available for user-written routines to store data in. 

Name contains the file name, which is a sequence of characters terminated by a null character (#0). 

For typed files and untyped files, RecSize contains the record length in bytes, and the Private field is unused but reserved. 

For text files, BufPtr is a pointer to a buffer of BufSize bytes, BufPos is the index of the next character in the buffer to read or write, and BufEnd is a count of valid characters in the buffer. OpenFunc, InOutFunc, FlushFunc, and CloseFunc are pointers to the I/O routines that control the file; see Device functions. Flags determines the line break style as follows:

bit 0 clear    LF line breaks
bit 0 set      CRLF line breaks

All other Flags bits are reserved for future use.

On the Win32 platform, a procedure pointer is stored as a 32-bit pointer to the entry point of a procedure or function. A method pointer is stored as a 32-bit pointer to the entry point of a method, followed by a 32-bit pointer to an object.
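
The two halves of a method pointer can be inspected through the TMethod record declared in the System unit. In the sketch below, TGreeter and TSimpleMethod are hypothetical types used only for the demonstration.

```pascal
program MethodPointerDemo;
{$APPTYPE CONSOLE}

type
  TGreeter = class
    procedure Hello;
  end;

  TSimpleMethod = procedure of object;

procedure TGreeter.Hello;
begin
  Writeln('hello');
end;

var
  G: TGreeter;
  M: TSimpleMethod;

begin
  G := TGreeter.Create;
  try
    M := G.Hello;
    // Code holds the method's entry point, Data holds the object reference.
    Writeln(TMethod(M).Code = @TGreeter.Hello);   // TRUE
    Writeln(TMethod(M).Data = Pointer(G));        // TRUE
    M;                                            // calls G.Hello
  finally
    G.Free;
  end;
end.
```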

On the Win32 platform, a class-type value is stored as a 32-bit pointer to an instance of the class, which is called an object. The internal data format of an object resembles that of a record. The object's fields are stored in order of declaration as a sequence of contiguous variables. Fields are always aligned, corresponding to an unpacked record type. Any fields inherited from an ancestor class are stored before the new fields defined in the descendant class. 

The first 4-byte field of every object is a pointer to the virtual method table (VMT) of the class. There is exactly one VMT per class (not one per object); distinct class types, no matter how similar, never share a VMT. VMTs are built automatically by the compiler, and are never directly manipulated by a program. Pointers to VMTs, which are automatically stored by constructor methods in the objects they create, are also never directly manipulated by a program.

The layout of a VMT is shown in the following table. At positive offsets, a VMT consists of a list of 32-bit method pointers, one per user-defined virtual method in the class type, in order of declaration. Each slot contains the address of the corresponding virtual method's entry point. This layout is compatible with a C++ v-table and with COM. At negative offsets, a VMT contains a number of fields that are internal to Delphi's implementation. Applications should use the methods defined in TObject to query this information, since the layout is likely to change in future implementations of the Delphi language.

Virtual method table layout (Win32 Only)  

Offset  Type      Description
-76     Pointer   pointer to virtual method table (or nil)
-72     Pointer   pointer to interface table (or nil)
-68     Pointer   pointer to Automation information table (or nil)
-64     Pointer   pointer to instance initialization table (or nil)
-60     Pointer   pointer to type information table (or nil)
-56     Pointer   pointer to field definition table (or nil)
-52     Pointer   pointer to method definition table (or nil)
-48     Pointer   pointer to dynamic method table (or nil)
-44     Pointer   pointer to short string containing class name
-40     Cardinal  instance size in bytes
-36     Pointer   pointer to a pointer to ancestor class (or nil)
-32     Pointer   pointer to entry point of SafecallException method (or nil)
-28     Pointer   entry point of AfterConstruction method
-24     Pointer   entry point of BeforeDestruction method
-20     Pointer   entry point of Dispatch method
-16     Pointer   entry point of DefaultHandler method
-12     Pointer   entry point of NewInstance method
-8      Pointer   entry point of FreeInstance method
-4      Pointer   entry point of Destroy destructor
0       Pointer   entry point of first user-defined virtual method
4       Pointer   entry point of second user-defined virtual method

On the Win32 platform, a class-reference value is stored as a 32-bit pointer to the virtual method table (VMT) of a class.
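
The relationship between an object, its first field, and a class reference can be checked as follows. This is a minimal sketch: it compares the pointer stored at the start of the instance with the class reference, and then queries class information through TObject methods instead of reading negative VMT offsets.

```pascal
program VmtPointerDemo;
{$APPTYPE CONSOLE}

var
  Obj: TObject;
  Ref: TClass;

begin
  Obj := TObject.Create;
  try
    Ref := Obj.ClassType;
    // The first field of the instance is the VMT pointer, and a class
    // reference is a pointer to that same VMT.
    Writeln(PPointer(Obj)^ = Pointer(Ref));   // TRUE

    // Supported way to query the information kept at negative offsets.
    Writeln(Obj.ClassName, ' occupies ', Obj.InstanceSize, ' bytes');
  finally
    Obj.Free;
  end;
end.
```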

The following discussion of the internal layout of variant types applies to the Win32 platform only. Variants rely on boxing and unboxing of data into an object wrapper, as well as Delphi helper classes to implement the variant-related RTL functions.  

On the Win32 platform, a variant is stored as a 16-byte record that contains a type code and a value (or a reference to a value) of the type given by the code. The System and Variants units define constants and types for variants. 

The TVarData type represents the internal structure of a Variant variable (on Windows, this is identical to the Variant type used by COM and the Win32 API). The TVarData type can be used in typecasts of Variant variables to access the internal structure of a variable. The TVarData record contains the following fields:

  • VType contains the type code of the variant in the lower twelve bits (the bits defined by the varTypeMask constant). In addition, the varArray bit may be set to indicate that the variant is an array, and the varByRef bit may be set to indicate that the variant contains a reference as opposed to a value.
  • The Reserved1, Reserved2, and Reserved3 fields are unused.
The contents of the remaining eight bytes of a TVarData record depend on the VType field as follows:
  • If neither the varArray nor the varByRef bits are set, the variant contains a value of the given type.
  • If the varArray bit is set, the variant contains a pointer to a TVarArray structure that defines an array. The type of each array element is given by the varTypeMask bits in the VType field.
  • If the varByRef bit is set, the variant contains a reference to a value of the type given by the varTypeMask and varArray bits in the VType field.
The varString type code is private. Variants containing a varString value should never be passed to a non-Delphi function. On Win32, Delphi's Automation support automatically converts varString variants to varOleStr variants before passing them as parameters to external functions.
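
The TVarData typecast described above can be sketched as follows. The example assigns an Integer variable to a variant, so the type code is expected to be varInteger. The 16-byte size applies to Win32.

```pascal
program VariantLayout;
{$APPTYPE CONSOLE}

uses
  Variants;

var
  I: Integer;
  V: Variant;

begin
  I := 42;
  V := I;   // assigning an Integer variable stores a varInteger value

  Writeln(SizeOf(TVarData));                                   // 16 on Win32
  Writeln((TVarData(V).VType and varTypeMask) = varInteger);   // TRUE
  Writeln((TVarData(V).VType and varArray) <> 0);              // FALSE
  Writeln(TVarData(V).VInteger);                               // 42
end.
```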

Copyright(C) 2009 Embarcadero Technologies, Inc. All Rights Reserved.