RAD Studio (Common)
Understanding Assembler Syntax (Win32 Only)

The inline assembler is available only on the Win32 Delphi compiler. The following material describes the elements of the assembler syntax necessary for proper use.

  • Assembler Statement Syntax
  • Labels
  • Instruction Opcodes
  • Assembly Directives
  • Operands

Assembler Statement Syntax

The syntax of an assembly statement is

Label: Prefix Opcode Operand1, Operand2

where Label is a label, Prefix is an assembly prefix opcode (operation code), Opcode is an assembly instruction opcode or directive, and Operand is an assembly expression. Label and Prefix are optional. Some opcodes take only one operand, and some take none. 

Comments are allowed between assembly statements, but not within them. For example,

     MOV AX,1 {Initial value}              { OK }
     MOV CX,100 {Count}                    { OK }

     MOV {Initial value} AX,1;             { Error! }
     MOV CX, {Count} 100                   { Error! }
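
As a fuller illustration of the statement forms described above, the following sketch uses an opcode with two operands, one operand, and no operands, a prefix, and a label, with comments placed only between statements. The routine name FillBytes is hypothetical, and the sketch assumes the default register calling convention (first three parameters in EAX, EDX, and ECX), which this topic does not describe.

```delphi
// Hypothetical helper: fills Count bytes at Dest with Value.
// Assumes the default register calling convention:
//   EAX = address of Dest, EDX = Count, CL = Value.
procedure FillBytes(var Dest; Count: Integer; Value: Byte);
asm
        PUSH    EDI               { opcode with one operand; EDI must be preserved }
        MOV     EDI, EAX          { opcode with two operands }
        MOV     AL, CL            { byte to store }
        MOV     ECX, EDX          { repeat count }
        TEST    ECX, ECX
        JLE     @Done             { nothing to do when Count <= 0 }
        CLD                       { opcode with no operands }
        REP     STOSB             { prefix followed by an opcode }
@Done:  POP     EDI               { label, then opcode and operand }
end;
```

A call such as FillBytes(Buf, SizeOf(Buf), $FF), where Buf is an array of bytes, would be a typical use.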

Labels

Labels are used in built-in assembly statements as they are in the Delphi language: write the label and a colon before a statement. There is no limit to a label's length. As in Delphi, labels must be declared in a label declaration part in the block containing the asm statement. The one exception to this rule is local labels.

Local labels are labels that start with an at-sign (@). They consist of an at-sign followed by one or more letters, digits, underscores, or at-signs. Use of local labels is restricted to asm statements, and the scope of a local label extends from the asm reserved word to the end of the asm statement that contains it. A local label doesn't have to be declared.
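
The following sketch shows local labels used for a forward and a backward jump without any label declaration. The function name SumTo is hypothetical, and the code assumes the default register calling convention (the parameter arrives in EAX and the Integer result is returned in EAX).

```delphi
// Hypothetical example: returns N + (N-1) + ... + 1, or 0 when N <= 0.
// Assumes the default register calling convention
// (N arrives in EAX; the Integer result is returned in EAX).
function SumTo(N: Integer): Integer;
asm
        MOV     ECX, EAX          // ECX counts down from N
        XOR     EAX, EAX          // running total starts at 0
        TEST    ECX, ECX
        JLE     @Done             // forward reference to a local label
@Loop:
        ADD     EAX, ECX
        DEC     ECX
        JNZ     @Loop             // backward reference to a local label
@Done:
end;
```

Neither @Loop nor @Done appears in a label declaration part, and both are visible only inside this asm statement. Under the stated assumptions, SumTo(4) returns 10.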

Instruction Opcodes

The built-in assembler supports all of the Intel-documented opcodes for general application use. Note that operating system privileged instructions may not be supported. Specifically, the following families of instructions are supported:

  • Pentium family
  • Pentium Pro and Pentium II
  • Pentium III
  • Pentium 4

In addition, the built-in assembler supports the following instruction sets:

  • AMD 3DNow! (from the AMD K6 onwards)
  • AMD Enhanced 3DNow! (from the AMD Athlon onwards)

For a complete description of each instruction, refer to your microprocessor documentation.

RET instruction sizing

The RET instruction opcode always generates a near return.

Automatic jump sizing

Unless otherwise directed, the built-in assembler optimizes jump instructions by automatically selecting the shortest, and therefore most efficient, form of a jump instruction. This automatic jump sizing applies to the unconditional jump instruction (JMP), and to all conditional jump instructions when the target is a label (not a procedure or function). 

For an unconditional jump instruction (JMP), the built-in assembler generates a short jump (one-byte opcode followed by a one-byte displacement) if the distance to the target label is -128 to 127 bytes. Otherwise it generates a near jump (one-byte opcode followed by a two-byte displacement).

For a conditional jump instruction, a short jump (one-byte opcode followed by a one-byte displacement) is generated if the distance to the target label is -128 to 127 bytes. Otherwise, the built-in assembler generates a short jump with the inverse condition, which jumps over a near jump to the target label (five bytes in total). For example, the assembly statement

JC    Stop

where Stop isn't within reach of a short jump, is converted to a machine code sequence that corresponds to this:

     JNC    Skip
     JMP    Stop
     Skip:

Jumps to the entry points of procedures and functions are always near.

Assembly Directives

The built-in assembler supports four assembly define directives: DB (define byte), DW (define word), DD (define double word), and DQ (define quad word). Each generates data corresponding to the comma-separated operands that follow the directive.

Directive   Description
DB          Define byte: generates a sequence of bytes. Each operand can be a constant expression with a value between -128 and 255, or a character string of any length. Constant expressions generate one byte of code, and strings generate a sequence of bytes with values corresponding to the ASCII code of each character.
DW          Define word: generates a sequence of words. Each operand can be a constant expression with a value between -32,768 and 65,535, or an address expression. For an address expression, the built-in assembler generates a near pointer, a word that contains the offset part of the address.
DD          Define double word: generates a sequence of double words. Each operand can be a constant expression with a value between -2,147,483,648 and 4,294,967,295, or an address expression. For an address expression, the built-in assembler generates a far pointer, a word that contains the offset part of the address, followed by a word that contains the segment part of the address.
DQ          Define quad word: defines a quad word for Int64 values.

The data generated by the DB, DW, and DD directives is always stored in the code segment, just like the code generated by other built-in assembly statements. To generate uninitialized or initialized data in the data segment, you should use Delphi var or const declarations. 

Some examples of DB, DW, and DD directives follow.

 asm
     DB      0FFH                          { One byte }
     DB      0,99                          { Two bytes }
     DB      'A'                           { Ord('A') }
     DB      'Hello world...',0DH,0AH      { String followed by CR/LF }
     DB      6,'string'                    { Delphi style string: length byte, then characters }
     DW      0FFFFH                        { One word }
     DW      0,9999                        { Two words }
     DW      'A'                           { Same as DB 'A',0 }
     DW      'BA'                          { Same as DB 'A','B' }
     DW      MyVar                         { Offset of MyVar }
     DW      MyProc                        { Offset of MyProc }
     DD      0FFFFFFFFH                    { One double-word }
     DD      0,999999999                   { Two double-words }
     DD      'A'                           { Same as DB 'A',0,0,0 }
     DD      'DCBA'                        { Same as DB 'A','B','C','D' }
     DD      MyVar                         { Pointer to MyVar }
     DD      MyProc                        { Pointer to MyProc }
 end;
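
Because the generated data is placed inline with the surrounding code, DB can also be used to emit the raw bytes of an instruction. The sketch below is a hypothetical illustration of that idea: it emits the two-byte encoding of RDTSC (0F 31) instead of the mnemonic. It assumes a Pentium-class or later processor and that an Int64 function result is returned in EDX:EAX, which is where RDTSC leaves its value. The function name is invented for the example.

```delphi
// Hypothetical example: emits the machine code of RDTSC with DB.
// Assumes a Pentium-class or later CPU and that Int64 results are
// returned in the EDX:EAX register pair.
function ReadTimeStampCounter: Int64;
asm
        DB      0FH, 31H          { two bytes: the encoding of RDTSC }
end;
```

This technique is mainly of interest for instructions that the assembler does not recognize; RDTSC is used here only because its encoding is short.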

In a standalone assembler such as Turbo Assembler, an identifier that precedes a DB, DW, or DD directive declares a byte-, word-, or double-word-sized variable at the location of the directive. For example, such an assembler allows the following:

     ByteVar    DB      ?
     WordVar    DW      ?
     IntVar     DD      ?
                .
                .
                .
                MOV     AL,ByteVar
                MOV     BX,WordVar
                MOV     ECX,IntVar

The built-in assembler doesn't support such variable declarations. The only kind of symbol that can be defined in an inline assembly statement is a label. All variables must be declared using Delphi syntax; the preceding construction can be replaced by

var
  ByteVar: Byte;
  WordVar: Word;
  IntVar: Integer;
     .
     .
     .
asm
  MOV AL,ByteVar
  MOV BX,WordVar
  MOV ECX,IntVar
end;

SMALL and LARGE can be used to determine the width of a displacement:

MOV EAX, [LARGE $1234]

This instruction generates a 'normal' move with a 32-bit displacement ($00001234).

MOV EAX, [SMALL $1234]

The second instruction will generate a move with an address size override prefix and a 16-bit displacement ($1234). 

SMALL can be used to save space. The following example generates an address size override and a 2-byte address (in total three bytes)

 MOV EAX, [SMALL 123]

as opposed to

 MOV EAX, [123]

which will generate no address size override and a 4-byte address (in total four bytes). 

Two additional directives allow assembly code to access dynamic and virtual methods: VMTOFFSET and DMTINDEX. 

VMTOFFSET retrieves the offset in bytes of the virtual method pointer table entry of the virtual method argument from the beginning of the virtual method table (VMT). This directive needs a fully specified class name with a method name as a parameter (for example, TExample.VirtualMethod), or an interface name and an interface method name. 

DMTINDEX retrieves the dynamic method table index of the passed dynamic method. This directive also needs a fully specified class name with a method name as a parameter, for example, TExample.DynamicMethod. To invoke the dynamic method, call System.@CallDynaInst with the (E)SI register containing the value obtained from DMTINDEX.

Note: Methods with the message directive are implemented as dynamic methods and can also be called using the DMTINDEX technique. For example:

  TMyClass = class
     procedure x(var Msg); message MYMESSAGE;   { a message method takes a single var parameter }
  end;

The following example uses both DMTINDEX and VMTOFFSET to access dynamic and virtual methods:

program Project2;

type
  TExample = class
    procedure DynamicMethod; dynamic;
    procedure VirtualMethod; virtual;
  end;

procedure TExample.DynamicMethod;
begin
end;

procedure TExample.VirtualMethod;
begin
end;

procedure CallDynamicMethod(e: TExample);
asm
  // Save ESI register
  PUSH    ESI

  // Instance pointer needs to be in EAX
  MOV     EAX, e

  // DMT entry index needs to be in (E)SI
  MOV     ESI, DMTINDEX TExample.DynamicMethod

  // Now call the method
  CALL    System.@CallDynaInst

  // Restore ESI register
  POP     ESI
end;

procedure CallVirtualMethod(e: TExample);
asm
  // Instance pointer needs to be in EAX
  MOV     EAX, e

  // Retrieve VMT table entry
  MOV     EDX, [EAX]

  // Now call the method at offset VMTOFFSET
  CALL    DWORD PTR [EDX + VMTOFFSET TExample.VirtualMethod]
end;

var
  e: TExample;
begin
  e := TExample.Create;
  try
    CallDynamicMethod(e);
    CallVirtualMethod(e);
  finally
    e.Free;
  end;
end.

Operands

Inline assembler operands are expressions that consist of constants, registers, symbols, and operators.

Within operands, the following reserved words have predefined meanings:  

Built-in assembler reserved words

AH         AL         AND        AX         BH         BL         BP
BX         BYTE       CH         CL         CS         CX         DH
DI         DL         DMTINDEX   DS         DWORD      DX         EAX
EBP        EBX        ECX        EDI        EDX        EIP        ES
ESI        ESP        FS         GS         HIGH       LARGE      LOW
mm0        mm1        mm2        mm3        mm4        mm5        mm6
mm7        MOD        NOT        OFFSET     OR         PTR        QWORD
SHL        SHR        SI         SMALL      SP         SS         ST
TBYTE      TYPE       VMTOFFSET  WORD       xmm0       xmm1       xmm2
xmm3       xmm4       xmm5       xmm6       xmm7       XOR

Reserved words always take precedence over user-defined identifiers. For example,

var
  Ch: Char;
        .
        .
        .     
asm
  MOV   CH, 1
end;

loads 1 into the CH register, not into the Ch variable. To access a user-defined symbol with the same name as a reserved word, you must use the ampersand (&) override operator:

MOV   &Ch, 1

It is best to avoid user-defined identifiers with the same names as built-in assembler reserved words.
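
The difference between the register and the overridden identifier can be seen in a small, hypothetical sketch. It assumes that a Byte function result is returned in AL and that the compiler sets up the stack frame needed for the local variable; the function name OverrideDemo is invented for the example.

```delphi
// Hypothetical example: a local variable that shares its name with the CH register.
// Assumes a Byte function result is returned in AL.
function OverrideDemo: Byte;
var
  Ch: Byte;                       // same name as the CH register
asm
        MOV     CH, 1             // reserved word wins: loads the CH register
        MOV     &Ch, 2            // ampersand override: stores into the variable Ch
        MOV     AL, &Ch           // return the value of the variable
end;
```

Under these assumptions the function returns 2, the value stored in the variable, not the 1 that was loaded into the register.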

Copyright(C) 2008 CodeGear(TM). All Rights Reserved.