RAD Studio (Common)
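This topic introduces the Delphi language character set and describes the syntax of its fundamental elements: special symbols, identifiers, reserved words, directives, numerals, labels, character strings, and comments.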
The Delphi language uses the Unicode character encoding for its character set, including alphabetic and alphanumeric Unicode characters and the underscore. It is not case-sensitive. The space character and control characters (U+0000 through U+001F including U+000D, the return or end-of-line character) are blanks.
The RAD Studio compiler accepts a file encoded in UCS-2 or UCS-4 if the file contains a byte order mark. Compilation may be slower, however, for formats other than UTF-8. All characters in a UCS-4 encoded source file must be representable in UCS-2 without surrogate pairs. UCS-2 encodings with surrogate pairs (including GB18030) are accepted only if the codepage compiler option is specified.
Fundamental syntactic elements, called tokens, combine to form expressions, declarations, and statements. A statement describes an algorithmic action that can be executed within a program. An expression is a syntactic unit that occurs within a statement and denotes a value. A declaration defines an identifier (such as the name of a function or variable) that can be used in expressions and statements, and, where appropriate, allocates memory for the identifier.
On the simplest level, a program is a sequence of tokens delimited by separators. A token is the smallest meaningful unit of text in a program. A separator is either a blank or a comment. Strictly speaking, it is not always necessary to place a separator between two tokens; for example, the code fragment
Size:=20;Price:=10;
is perfectly legal. Convention and readability, however, dictate that we write this as
Size := 20; Price := 10;
Tokens are categorized as special symbols, identifiers, reserved words, directives, numerals, labels, and character strings. A separator can be part of a token only if the token is a character string. Adjacent identifiers, reserved words, numerals, and labels must have one or more separators between them.
Special symbols are non-alphanumeric characters, or pairs of such characters, that have fixed meanings. The following single characters are special symbols:
# $ & ' ( ) * + , - . / : ; < = > @ [ ] ^ { }
The following character pairs are also special symbols:
(* (. *) .) .. // := <= >= <>
The following table shows equivalent symbols:
Special symbol | Equivalent symbols
[ | (.
] | .)
{ | (*
} | *)
The left bracket [ is equivalent to the character pair of left parenthesis and period (.
The right bracket ] is equivalent to the character pair of period and right parenthesis .)
The left brace { is equivalent to the character pair of left parenthesis and asterisk (*
The right brace } is equivalent to the character pair of asterisk and right parenthesis *)
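The equivalences above can be tried out with a small console program. This is a minimal sketch, assuming a compiler that still accepts the alternative spellings; the program name and the Values array are hypothetical, and spaces are placed around the digraphs so that they cannot be confused with parts of a numeral.

```pascal
program EquivalentSymbols;
{$APPTYPE CONSOLE}

var
  // (. and .) stand in for [ and ]
  Values: array(. 0 .. 2 .) of Integer;
begin
  Values(. 0 .) := 10;              (* same as Values[0] := 10 *)
  Values[1] := Values(. 0 .) + 5;   { both spellings can be mixed }
  Writeln(Values[1]);
end.
```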
Identifiers denote constants, variables, fields, types, properties, procedures, functions, programs, units, libraries, and packages. An identifier can be of any length, but only the first 255 characters are significant. An identifier must begin with an alphabetic character or an underscore (_) and cannot contain spaces; alphanumeric characters, digits, and underscores are allowed after the first character. Reserved words cannot be used as identifiers.
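Because the Delphi language is not case-sensitive, an identifier such as CalculateValue can be written in any of the following ways:
CalculateValue calculateValue calculatevalue CALCULATEVALUE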
Since unit names correspond to file names, inconsistencies in case can sometimes affect compilation. For more information, see the topic, Unit References and the Uses Clause.
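As an illustration of case-insensitive identifiers, the following sketch declares one variable and then refers to it with different capitalizations. The program name is hypothetical; all four spellings denote the same variable.

```pascal
program CaseInsensitive;
{$APPTYPE CONSOLE}

var
  CalculateValue: Integer;
begin
  calculatevalue := 1;                    // same variable as CalculateValue
  CALCULATEVALUE := calculateValue + 1;   // still the same variable
  Writeln(CalculateValue);
end.
```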
When you use an identifier that has been declared in more than one place, it is sometimes necessary to qualify the identifier. The syntax for a qualified identifier is
identifier1.identifier2
where identifier1 qualifies identifier2. For example, if two units each declare a variable called CurrentValue, you can specify that you want to access the CurrentValue in Unit2 by writing
Unit2.CurrentValue
Qualifiers can be iterated. For example,
Form1.Button1.Click
calls the Click method in Button1 of Form1.
If you don't qualify an identifier, its interpretation is determined by the rules of scope described in Blocks and scope.
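The following sketch shows qualification in a single program rather than across two user units. It assumes the standard System unit, which declares the constant MaxInt; the program-level constant of the same name is hypothetical and exists only to create the ambiguity.

```pascal
program QualifiedNames;
{$APPTYPE CONSOLE}

const
  MaxInt = 100;   // hypothetical constant that hides System.MaxInt

begin
  Writeln(MaxInt);          // unqualified: the constant declared in this program
  Writeln(System.MaxInt);   // qualified: the constant declared in the System unit
end.
```

The unqualified name resolves to the innermost declaration in scope, so the qualifier is the only way to reach the hidden one.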
You might encounter identifiers (e.g. types, or methods in a class) having the same name as a Delphi language keyword. For example, a class might have a method called begin. Another example is the CLR class called Type, in the System namespace. Type is a Delphi language keyword, and cannot be used for an identifier name.
If you qualify the identifier with its full namespace specification, then there is no problem. For example, to use the Type class, you must use its fully qualified name:
var
  TMyType : System.Type; // Using the fully qualified namespace
                         // avoids ambiguity with the Delphi language keyword.
As a shorter alternative, the ampersand (&) operator can be used to resolve ambiguities between identifiers and Delphi language keywords. If you encounter a method or type that has the same name as a Delphi keyword, you can omit the namespace specification if you prefix the identifier name with an ampersand. For example, the following code uses the ampersand to disambiguate the CLR Type class from the Delphi keyword type.
var TMyType : &Type; // Prefix with '&' is ok.
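The same prefix can be sketched without the CLR. The example below is an assumption-laden illustration rather than the documented CLR case: it uses a hypothetical record, TToken, whose field is named after the reserved word type, and it assumes a compiler version that supports the ampersand prefix in native code.

```pascal
program AmpersandEscape;
{$APPTYPE CONSOLE}

type
  TToken = record
    &Type: Integer;   // 'type' is reserved; the '&' prefix makes it usable as a name
  end;

var
  Token: TToken;
begin
  Token.&Type := 1;
  Writeln(Token.&Type);
end.
```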
The following reserved words cannot be redefined or used as identifiers.
Reserved Words
add | else | initialization | program | then
and | end | inline | property | threadvar
array | except | interface | raise | to
as | exports | is | record | try
asm | file | label | remove | type
begin | final | library | repeat | unit
case | finalization | mod | resourcestring | unsafe
class | finally | nil | sealed | until
const | for | not | set | uses
constructor | function | object | shl | var
destructor | goto | of | shr | while
dispinterface | if | or | static | with
div | implementation | out | strict private | xor
do | in | packed | strict protected |
downto | inherited | procedure | string |
In addition to the words above, private, protected, public, published, and automated act as reserved words within class type declarations, but are otherwise treated as directives. The words at and on also have special meanings, and should be treated as reserved words.
Directives are words that are sensitive in specific locations within source code. Directives have special meanings in the Delphi language, but, unlike reserved words, appear only in contexts where user-defined identifiers cannot occur. Hence -- although it is inadvisable to do so -- you can define an identifier that looks exactly like a directive.
Directives
absolute | export | name | protected | scopedenums
abstract | external | near | public | stdcall
assembler | far | nodefault | published | stored
automated | forward | overload | read | varargs
cdecl | implements | override | readonly | virtual
contains | index | package | register | write
default | inline | pascal | reintroduce | writeonly
deprecated | library | platform | requires |
dispid | local | pointermath | resident |
dynamic | message | private | safecall |
Integer and real constants can be represented in decimal notation as sequences of digits without commas or spaces, and prefixed with the + or - operator to indicate sign. Values default to positive (so that, for example, 67258 is equivalent to +67258) and must be within the range of the largest predefined real or integer type.
Numerals with decimal points or exponents denote reals, while other numerals denote integers. When the character E or e occurs within a real, it means "times ten to the power of". For example, 7E2 means 7 * 10^2, and 12.25e+6 and 12.25e6 both mean 12.25 * 10^6.
The dollar-sign prefix indicates a hexadecimal numeral, for example, $8F. Hexadecimal numbers without a preceding - unary operator are taken to be positive values. During an assignment, if a hexadecimal value lies outside the range of the receiving type an error is raised, except in the case of the Integer (32-bit integer) where a warning is raised. In this case, values exceeding the positive range for Integer are taken to be negative numbers in a manner consistent with 2's complement integer representation.
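The numeral forms described above can be compared in a short sketch. The constant names are hypothetical; each Writeln prints a Boolean showing whether the two notations denote the same value.

```pascal
program Numerals;
{$APPTYPE CONSOLE}

const
  Hundreds = 7E2;        // real: 7 * 10^2
  WithSign = 12.25e+6;   // real: 12.25 * 10^6
  NoSign   = 12.25e6;    // the same real, exponent sign omitted
  Whole    = 67258;      // integer: no decimal point or exponent
  Hex      = $8F;        // hexadecimal integer

begin
  Writeln(Hundreds = 700.0);
  Writeln(WithSign = NoSign);
  Writeln(Whole = +67258);
  Writeln(Hex = 143);    // $8F is 8 * 16 + 15
end.
```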
For more information about real and integer types, see Data Types. For information about the data types of numerals, see True constants.
A label is a standard Delphi language identifier with the exception that, unlike other identifiers, labels can start with a digit. Numeric labels can include no more than ten digits - that is, a numeral between 0 and 9999999999.
Labels are used in goto statements. For more information about goto statements and labels, see Goto statements.
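As a minimal sketch of both label forms, the program below declares one numeric label and one identifier label and jumps to each with goto. The program, label, and variable names are hypothetical.

```pascal
program LabelDemo;
{$APPTYPE CONSOLE}

label
  10, Done;   // a numeric label and an identifier label

var
  Count: Integer;
begin
  Count := 0;
10:
  Inc(Count);
  if Count < 3 then
    goto 10;               // jump back until Count reaches 3
  goto Done;
  Writeln('never reached');
Done:
  Writeln(Count);
end.
```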
A character string, also called a string literal or string constant, consists of a quoted string, a control string, or a combination of quoted and control strings. Separators can occur only within quoted strings.
A quoted string is a sequence of characters, from an ANSI or multibyte character set, written on one line and enclosed by apostrophes. A quoted string with nothing between the apostrophes is a null string. Two sequential apostrophes in a quoted string denote a single character, namely an apostrophe.
The string is represented internally as a Unicode string encoded as UTF-16. Characters in the Basic Multilingual Plane (BMP) take 2 bytes, and characters not in the BMP require 4 bytes.
For example,
'CodeGear'                                  { CodeGear }
'You''ll see'                               { You'll see }
'アプリケーションを Unicode 対応にする'
''''                                        { ' }
''                                          { null string }
' '                                         { a space }
A control string is a sequence of one or more control characters, each of which consists of the # symbol followed by an unsigned integer constant from 0 to 65,535 (decimal) or from $0 to $FFFF (hexadecimal) in UTF-16 encoding, and denotes the character corresponding to the specified code value. Each such character is represented internally by 2 bytes in the string. This is useful for representing control characters and multibyte characters. The control string
#89#111#117
is equivalent to the quoted string
'You'
You can combine quoted strings with control strings to form larger character strings. For example, you could use
'Line 1'#13#10'Line 2'
to put a carriage-return line-feed between 'Line 1' and 'Line 2'. However, you cannot concatenate two quoted strings in this way, since a pair of sequential apostrophes is interpreted as a single character. (To concatenate quoted strings, use the + operator or simply combine them into a single quoted string.)
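The rules for quoted and control strings can be exercised together in a small sketch. The program name is hypothetical, and console output is assumed only as a convenient way to display the literals.

```pascal
program StringLiterals;
{$APPTYPE CONSOLE}

begin
  Writeln(#89#111#117 = 'You');             // control string compared with a quoted string
  Writeln('Line 1'#13#10'Line 2');          // quoted and control strings combined
  Writeln('You''ll see');                   // doubled apostrophe denotes one apostrophe
  Writeln('Line 1' + ' and ' + 'Line 2');   // quoted strings joined with +
  Writeln(#$41);                            // hexadecimal code value for 'A'
end.
```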
A character string is compatible with any string type and with the PChar type. Since an AnsiString type may contain multibyte characters, a character string with one character, single or multibyte, is compatible with any character type. When extended syntax is enabled (with compiler directive {$X+}), a nonempty character string of length n is compatible with zero-based arrays and packed arrays of n characters. For more information, see Data Types.
Comments are ignored by the compiler, except when they function as separators (delimiting adjacent tokens) or compiler directives.
There are several ways to construct comments:
{ Text between a left brace and a right brace constitutes a comment. }
(* Text between a left-parenthesis-plus-asterisk and an asterisk-plus-right-parenthesis is also a comment *)
// Any text between a double-slash and the end of the line constitutes a comment.
Comments that are alike cannot be nested. For instance, {{}} will not work, but (*{}*) will. This latter form is useful for commenting out sections of code that also contain comments.
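The nesting rule can be sketched as follows: a (* ... *) comment disables a statement that carries its own brace comment. The program name and the messages are hypothetical.

```pascal
program NestedComments;
{$APPTYPE CONSOLE}

begin
  (* This block is commented out, including its inner brace comment:
  Writeln('disabled');  { inner comment of a different kind }
  *)
  Writeln('enabled');   // only this statement is compiled
end.
```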
Here are some recommendations about how and when to use the three types of comment characters:
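A comment that contains a dollar sign ($) immediately after the opening { or (* is a compiler directive. For example,
{$WARNINGS OFF}
tells the compiler not to generate warning messages.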
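A directive of this kind typically brackets the code it should affect. The sketch below assumes that the matching {$WARNINGS ON} form restores warnings afterwards; the function Unfinished is hypothetical and deliberately leaves its result unassigned, which would normally produce a warning.

```pascal
program DirectiveDemo;
{$APPTYPE CONSOLE}

{$WARNINGS OFF}   // compiler directive: suppress warnings from here on
function Unfinished: Integer;
begin
  // Result is never assigned; with warnings on, the compiler would flag this.
end;
{$WARNINGS ON}    // compiler directive: restore warnings

begin
  Writeln('compiled without warnings for Unfinished');
end.
```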
Copyright(C) 2009 Embarcadero Technologies, Inc. All Rights Reserved.