Blackfish SQL
This chapter provides an overview of Blackfish SQL system architecture.
Blackfish SQL for Windows and Blackfish SQL for Java are compatible in these ways:
Compatibility is restricted in the following ways:
The following Blackfish SQL for Java features are not yet supported in the Windows version:
Windows applications can use one or more of the following connectivity solutions to access a Blackfish SQL database programmatically:
See Establishing Connections for instructions and code examples for using these drivers.
Java applications can use one or more of the following connectivity solutions to access a Blackfish SQL database programmatically:
See Establishing Connections for instructions and code examples for using these drivers.
NOTE: This feature is available only with Blackfish SQL for Java.
DataExpress is a set of JavaBean runtime components that provide functionality not addressed by the JDBC standard. JavaBean is an industry-standard component architecture for Java. The JavaBean standard specifies many important aspects of components needed for RAD development environments. JavaBean components can be designed in a visual designer and can be customized with the properties, methods, and events that they expose.
DataExpress is included in the component palette of CodeGear JBuilder Visual Designer. For information on using DataExpress from within JBuilder, see the JBuilder Help.
Because DataExpress is a set of runtime components, you need not use JBuilder to develop and deploy applications that use DataExpress.
Most DataExpress JavaBean components are used to build both server-side and client-side database applications. Client-side applications require high-quality data binding to visual components such as grid controls, as well as support for reading and writing data to a database.
Server-side applications require data access components to help with reading and writing data to a database, but presentation is typically handled by a web page generation system such as JavaServer Pages (JSP). Even though DataExpress has extensive support for client-side data binding to visual component libraries such as dbSwing and JBCL, the DataExpress design still separates the presentation from the data access layer. This allows DataExpress components to be used as a data access layer for other presentation paradigms, such as the JSP/servlet approach employed by JBuilder InternetBeans Express technology.
The DataExpress architecture allows for a pluggable storage interface to cache the data that is read from a data source. Currently, there are two implementations of this interface: MemoryStore (the default) and DataStore. By setting just two properties, a StorageDataSet JavaBean component can directly navigate and edit a Blackfish SQL table. By setting the DataSet property of a dbSwing grid control, the entire contents of large tables can be directly browsed, searched, and edited at high speed. This effectively provides an ISAM-level data access layer for Blackfish SQL tables.
There are many DataExpress components that can be used to automate administrative tasks. Commonly-used components are:
DataExpress Administrative Components
Task                                Component
Custom server start and shutdown    com.borland.datastore.jdbc.DataStoreServer
Database backup, restore, and pack  com.borland.datastore.DataStoreConnection.copyStreams()
Security administration             com.borland.datastore.DataStoreConnection
Transaction management              com.borland.datastore.TxManager, com.borland.datastore.DataStore
JBuilder provides a source code .jar file that includes a large portion of the DataExpress JavaBean components. This enables you to more easily debug your applications and gain a better understanding of the DataExpress JavaBean components.
The primary difference between using local and remote drivers with Blackfish SQL is:
A local Blackfish SQL driver provides the following benefits:
You can use a remote Blackfish SQL driver to execute Blackfish SQL in a separate database server process. However, before the application can use a remote driver, the Blackfish SQL server process must be started. Executing the Blackfish SQL database kernel in a separate database server process provides the following benefits:
Using both the local and remote driver to access the same database can give you the best of both worlds. A Blackfish SQL database file can be held open by only one operating system process. When you connect using the local driver, the process that uses the local driver holds the database file open. When the remote driver makes the connection, the Blackfish SQL server process holds the database file open.
Since the local driver causes the database file to be open in the same process, it prevents connections from the remote driver. However, if the process that uses the local driver also starts a Blackfish SQL server in the same process, then other processes using the remote driver can access the same database as the local driver.
The Blackfish SQL server can be started inside an application by using a single line of Java code that instantiates a DataStoreServer component and executes its start method. The DataStoreServer runs on a separate thread and services connection requests from processes that use the remote driver to access a Blackfish SQL database on the computer on which the DataStoreServer was started.
In addition, the local driver can be used by the application that launched the DataStoreServer for faster in-process driver calls into the Blackfish SQL database engine.
These files are created and used by Blackfish SQL:
A Blackfish SQL database can still be used if the anchor or status log files do not exist.
A non-transactional (read-only) database needs only the .jds database file.
The specifications for Blackfish SQL database file capacity are:
Blackfish SQL Database File Capacity
Specification                    Value
Minimum block size               1 KB
Maximum block size               32 KB
Default block size               4 KB
Maximum database file size       2 billion (2^31) blocks; with the default block size, 8,796,093,022,208 bytes (8 TB)
Maximum rows per table stream    281,474,976,710,656
Maximum row length               1/3 of the block size; long strings, objects, and input streams are stored as Blobs instead of occupying space in the row
Maximum Blob size                2 GB each
Maximum file stream size         2 GB each
A Blackfish SQL database file can contain these types of data streams:
A single Blackfish SQL database can contain all stream types.
Streams are organized in a file system directory. The ability to store both tables and arbitrary files in the same file system allows all of the data for an application to be contained in a single portable, transactional file system. A Blackfish SQL database can also be encrypted and password protected.
The specifications for Blackfish SQL database file systems are:
Blackfish SQL Database File System Specifications
Specification                                Value
Directory separator character for streams    /
Maximum stream name length                   192 bytes
Reserved names                               Stream names that begin with SYS are reserved for Blackfish SQL system tables
Some table streams and all file streams are currently only accessible from Java applications.
If the resolvable property for the table is set, all insert, update, and delete operations made against the table are recorded. This edit tracking feature enables DataExpress components to synchronize changes from a replicated table to the database from which the table was replicated.
File streams are random-access files. File streams can be further broken down into two different categories:
Each stream is identified by a case-sensitive name referred to as a storeName in the API. The name can be up to 192 bytes long. The name is stored along with other information about the stream in the internal directory of the Blackfish SQL database. The forward slash (/) is used as a directory separator in the name to provide a hierarchical directory organization. JdsExplorer uses this structure to display the contents of the directory in a tree.
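Because the forward slash is the only separator and names are limited to 192 bytes, simple string handling suffices for working with hierarchical store names. A brief sketch using only the JDK (the store name shown is hypothetical):

```java
public class StoreNamePaths {
    public static void main(String[] args) {
        String storeName = "reports/2009/q1";  // hypothetical hierarchical store name
        String[] parts = storeName.split("/"); // "/" is the directory separator

        // The limit is 192 bytes, not 192 characters; multi-byte
        // characters in a name consume more than one byte each.
        boolean withinLimit = storeName.getBytes().length <= 192;

        System.out.println(parts.length);             // depth in the directory tree: 3
        System.out.println(parts[parts.length - 1]);  // leaf name: q1
        System.out.println(withinLimit);              // true
    }
}
```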
For the simple persistent storage of arbitrary files and objects, using the Blackfish SQL file system has a number of advantages over using the JDK classes in the java.io package:
Note: Currently, the directory for the Blackfish SQL database can be accessed only from Java applications. Fortunately, most applications do not need to access the directory directly.
The JdsExplorer tree provides a hierarchical view of the Blackfish SQL directory. The Blackfish SQL directory can also be opened programmatically with a DataExpress StorageDataSet component. This provides a tabular view of all streams stored in the Blackfish SQL file system. The directory table has the following structure:
Blackfish SQL Directory Table Columns
#  Name        Constant         Type    Contents
1  State       DIR_STATE        short   Whether the stream is active or deleted
2  DeleteTime  DIR_DEL_TIME     long    If deleted, the time of deletion; otherwise zero
3  StoreName   DIR_STORE_NAME   String  The store name
4  Type        DIR_TYPE         short   Bit fields that indicate the type of stream
5  Id          DIR_ID           int     A unique ID number
6  Properties  DIR_PROPERTIES   String  Properties and events for a DataSet stream
7  ModTime     DIR_MOD_TIME     long    The last time the stream was modified
8  Length      DIR_LENGTH       long    Length of the stream, in bytes
9  BlobLength  DIR_BLOB_LENGTH  long    Length of the Blobs in a table stream, in bytes
You can reference the columns by name or by number. There are constants defined as DataStore class variables for each of the column names. The best way to reference these columns is to use these constants. The constants provide compile-time checking to ensure that you are referencing a valid column. Constants with names ending with the suffix _STATE exist for the different values for the State column. There are also constants for the different values and bit masks for the Type column, with names ending with the suffix _STREAM. See the online help for the DataStore class for a listing of these constants.
Time columns in the Blackfish SQL directory are Coordinated Universal Time (UTC).
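Assuming the long-valued DeleteTime and ModTime columns hold milliseconds since the Unix epoch (an assumption to verify against the DataStore class documentation), java.time renders them directly in UTC:

```java
import java.time.Instant;

public class DirectoryTimes {
    public static void main(String[] args) {
        long modTime = 1230768000000L; // hypothetical ModTime value, assumed UTC milliseconds
        // Instant.toString() always prints in UTC, matching the directory's convention.
        System.out.println(Instant.ofEpochMilli(modTime)); // 2009-01-01T00:00:00Z
    }
}
```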
As with many file systems, when you delete a stream in Blackfish SQL, the space it occupied is marked as available, but the contents and the directory entry that points to it are not immediately reused for new allocations. This means you can sometimes restore a deleted stream if it has not been overwritten.
For more information on deleting and restoring streams, see Deleting Streams, How Blackfish SQL Reuses Blocks, and Restoring Streams.
The Type column indicates whether a stream is a file or table stream, but there are also many internal table stream subtypes (for example, for indices and aggregates). These internal streams are marked with the HIDDEN_STREAM bit to indicate that they should not be displayed. Of course, when you are reading the directory, you can decide whether these streams should be hidden or visible.
These internal streams have the same StoreName as the table stream with which they are associated. This means that the StoreName alone does not always uniquely identify a stream. Some internal stream types can have multiple instances. The ID for each stream is always unique; however, the StoreName is sufficiently unique for the storeName parameter used at the API level. For example, when you delete a table stream, all the streams with that StoreName are deleted.
The directory table is sorted by the first five columns. Because of the values stored in the State column, all active streams are listed first in alphabetical order by name. These are then followed by all deleted streams ordered by their delete time, oldest to most recent.
NOTE: You cannot use a DataSetView to create a different sort order.
Database contents are stored in a single file. If the database has transaction support enabled, there are additional files for transactional logs.
A database file has a block size property that defaults to 4096 bytes. The database block size property is the unit size used for new allocations in the database. This size also determines the maximum storage size of a Blackfish SQL database. The formula for computing the maximum database file size is:
<bytes-per-block> * 2^31
For a block size of 4096 bytes, this is about 8.8 terabytes.
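The arithmetic can be checked directly; this sketch computes the limit for the default and maximum block sizes:

```java
public class MaxFileSize {
    public static void main(String[] args) {
        long maxBlocks = 1L << 31;         // 2^31 addressable blocks

        long defaultBlockSize = 4 * 1024L; // 4 KB default
        long maxBlockSize = 32 * 1024L;    // 32 KB maximum

        System.out.println(defaultBlockSize * maxBlocks); // 8796093022208 (about 8.8 TB)
        System.out.println(maxBlockSize * maxBlocks);     // 70368744177664 (about 70 TB)
    }
}
```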
A Blackfish SQL database file does not automatically shrink as data is deleted or removed from it. However, new allocations reuse the space from deleted allocations. Deleted space in the file system is made available to new allocations in two ways:
On average, partially allocated blocks are kept at least 50 percent full. The file system goes to great lengths to ensure this is true for all stream types in the Blackfish SQL file system. The one exception to this rule occurs when a stream has a small number of blocks allocated.
A Blackfish SQL database file can be compacted to remove all deleted space and to defragment the file system so that blocks for each stream are located in contiguous regions. To compact a database using JdsExplorer, choose Tools > Pack. You can accomplish this programmatically by using the DB_ADMIN.COPY_USERS and DB_ADMIN.COPY_STREAMS methods.
Deleting a stream does not actually overwrite or clear the stream contents. As in most file systems, the space used by the stream is marked as available, and the directory entry that points to that space is marked as deleted. The time at which the stream was deleted is recorded in the directory. Over time, new stream allocations overwrite the space that was formerly occupied by the deleted stream, making the content of the deleted streams unrecoverable.
You can use JdsExplorer to delete streams, or you can delete streams programmatically using the DataStoreConnection.deleteStream() method, which takes as an argument the name of the stream to delete.
Blocks in the Blackfish SQL database file that were formerly occupied by deleted streams are reclaimed according to the following rules:
Blackfish SQL allows deleted streams to be restored if their space has not been consumed by new allocations, as described above. You can restore a stream either by using JdsExplorer or by calling the DataStoreConnection.undeleteStream() method.
Because table streams have multiple streams with the same name, the stream name alone is not sufficient for attempting to restore a stream programmatically. You must use a row from the Blackfish SQL directory. The row contains enough information to uniquely identify a particular stream.
The DataStoreConnection.undeleteStream() method takes such a row as a parameter. You can pass the directory dataset itself. The current row in the directory dataset is used as the row to restore.
If you create a new stream with the name of a deleted stream, you cannot restore that stream while the same name is being used by an active stream.
The lifecycle of a transaction begins with any read or write operation through a connection. Blackfish SQL uses stream locks to control access to resources. To read a stream or modify any part of a stream (e.g., a byte in a file, a row in a table), a connection must acquire a lock on that stream. Once a connection acquires a lock on a stream, it holds the lock until the transaction is committed or rolled back.
In single-connection applications, transactions primarily provide crash recovery and allow an application to undo changes. You may also make a Blackfish SQL database transactional so that it can be accessed through JDBC. If you want to access that Blackfish SQL database using DataExpress, you must deal with transactions.
Blackfish SQL supports all four isolation levels specified by the ANSI/ISO SQL (SQL/92) standards.
The serializable isolation level provides complete transaction isolation. An application can employ a weaker isolation level to improve performance or to avoid lock manager deadlocks. Weaker isolation levels are susceptible to one or more of the following isolation violations:
SQL-92 defines four levels of isolation in terms of the behavior that a transaction running at a particular isolation level is permitted to experience:
SQL Isolation Level Definitions
Isolation Level     Dirty Read     Nonrepeatable Read    Phantom Read
Read uncommitted    Possible       Possible              Possible
Read committed      Not possible   Possible              Possible
Repeatable read     Not possible   Not possible          Possible
Serializable        Not possible   Not possible          Not possible
Guidelines for choosing an isolation level for a connection include:
Isolation Level Guidelines
Read Uncommitted
    This isolation level is suitable for single-user applications and for reports that allow transactionally inconsistent views of the data. It is especially useful when browsing Blackfish SQL tables with dbSwing and DataExpress DataSet components. This isolation level incurs minimal locking overhead.

Read Committed
    This isolation level is commonly used for high-performance applications. It is ideal for data access models that use Optimistic Concurrency Control, in which read operations are generally performed first, and in some cases are performed in a separate transaction from write operations.

Repeatable Read
    This isolation level provides more protection for transactionally consistent data access without the reduced concurrency of TRANSACTION_SERIALIZABLE. However, it results in increased locking overhead, because row locks must be acquired and held for the duration of the transaction.

Serializable
    This isolation level provides complete serializability of transactions at the risk of reduced concurrency and an increased potential for deadlocks. Although row locks can still be used for common operations at this isolation level, some operations cause the Blackfish SQL lock manager to escalate to table locks.
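Because Blackfish SQL implements the four SQL-92 levels, an application selects one through the standard JDBC API. A minimal sketch using only java.sql constants (the setIsolation helper is illustrative; a live Connection would come from the Blackfish SQL driver):

```java
import java.sql.Connection;
import java.sql.SQLException;

public class IsolationChoice {
    // Illustrative helper: applies an isolation level to an open connection.
    static void setIsolation(Connection conn, int level) throws SQLException {
        conn.setTransactionIsolation(level);
    }

    public static void main(String[] args) {
        // The standard JDBC constants correspond to the four SQL-92 levels above.
        System.out.println(Connection.TRANSACTION_READ_UNCOMMITTED); // 1
        System.out.println(Connection.TRANSACTION_READ_COMMITTED);   // 2
        System.out.println(Connection.TRANSACTION_REPEATABLE_READ);  // 4
        System.out.println(Connection.TRANSACTION_SERIALIZABLE);     // 8
    }
}
```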
The locks used by the Blackfish SQL Lock Manager are:
Blackfish SQL Locking
Critical section locks
    Internal locks used to protect internal data structures. They are usually held for a short period of time, and are acquired and released independently of when the transaction commits.

Row locks
    Used to lock a row in a table. Row locks support shared and exclusive lock modes and are released when the transaction commits.

Table locks
    Used to lock an entire table. Table locks support shared and exclusive lock modes and are released when the transaction commits.

DDL table locks
    Acquired when database metadata is created, altered, or dropped. These support shared and exclusive lock modes.
You can specify case-insensitive connection properties to control locking behavior. The property names are:
Case-Insensitive Connection Properties for Controlling Locking Behavior
tableLockTables
    Specifies the tables for which row locking is disabled, as a string of semicolon-separated, case-sensitive table names. Setting this property to "*" disables row locking for all tables.

maxRowLocks
    Specifies the maximum number of row locks per table that a transaction acquires before escalating to a table lock. The default value is 50.

lockWaitTime
    Specifies the maximum number of milliseconds to wait for a lock held by another transaction to be released. When this timeout expires, an appropriate exception is thrown.

readOnlyTxDelay
    Specifies the maximum number of milliseconds to wait before starting a new read-only view of the database. For details, see the discussion of read-only transactions in Tuning Blackfish SQL Concurrency Control Performance.
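These properties travel with the connection request as ordinary key/value pairs. A sketch of assembling them with java.util.Properties (the values shown are illustrative; only the property names come from the table above):

```java
import java.util.Properties;

public class LockingProperties {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("maxRowLocks", "100");    // escalate to a table lock after 100 row locks
        props.setProperty("lockWaitTime", "5000");  // wait up to 5 seconds for a held lock
        props.setProperty("tableLockTables", "*");  // assumed: "*" disables row locking for all tables

        // A live application would pass these to DriverManager.getConnection(url, props),
        // where url is the JDBC URL for the target Blackfish SQL database.
        System.out.println(props.size()); // 3
    }
}
```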
The use of table locks and row locks varies between the different isolation levels. The tableLockTables connection property disables row locking and affects all isolation levels. Critical section and DDL locks are applied in the same manner for all isolation levels.
All isolation levels acquire at least an exclusive row lock for row update, delete, and insert operations. In some lock escalation scenarios, an exclusive table lock occurs instead.
The row locking behavior of each Blackfish SQL connection isolation level is:
Lock Use and Isolation Levels
Read Uncommitted
    This level does not acquire row locks for read operations. It also ignores exclusive row locks held by other connections that have inserted or updated a row.

Read Committed
    This level does not acquire row locks for read operations. A transaction using this isolation level blocks when reading a row that has an exclusive lock held by another transaction for an insert or update operation. The block ends when the writing transaction commits, a deadlock is detected, or the lockWaitTime limit expires.

Repeatable Read
    This level acquires shared row locks for read operations.

Serializable
    This level acquires shared row locks for queries that select a row based on a unique set of column values, such as a primary key or INTERNALROW column. In SQL, the WHERE clause determines whether unique column values are being selected. Exclusive row locks are also used for insert operations and for update/delete operations on rows identified by a unique set of column values. Other operations escalate to a shared table lock.
Note that although lock escalation from row locks to table locks occurs in some situations for Serializable as described above, it also occurs for all isolation levels if the maxRowLocks property is exceeded.
Blackfish SQL database files created with earlier versions of Blackfish SQL continue to use table locking for concurrency control. There are, however, some minor concurrency control improvements for older database files. These include:
In an application that opens and closes many database connections, it is efficient to keep unused Connection objects in a pool for future reuse. This saves the overhead of having to create a new physical connection each time a connection is needed.
Use the TDBXPool delegate driver to provide connection pooling support for DBXClient connections.
By default, the .NET client drivers implement a connection pool that is always in effect. Each ConnectionString has its own pool. There are connection properties that affect the maximum number of connections in a pool, and other attributes. There is also a ConnectionPool property that provides access to all active connection pools.
Blackfish SQL provides several components for dealing with JDBC DataSources, connection pooling, and distributed transaction (XA) support. These features require J2EE. If you are running with a JRE version less than 1.4, download the J2EE.jar file from java.sun.com, and add it to the classpath.
The JdbcConnectionPool object supports pooled XA transactions. This feature allows Blackfish SQL to participate in a distributed transaction as a resource manager. Blackfish SQL provides XA support by implementing these standard interfaces specified in the Java Transaction API (JTA) specification:
To acquire a distributed connection to a Blackfish SQL database from a JdbcConnectionPool, call JdbcConnectionPool.getXAConnection(). The connection returned by this method works only with the Blackfish SQL JDBC driver. XA support is useful only when combined with a distributed transaction manager, such as the one provided by Borland Enterprise Server.
Under normal operation, all global transactions should be committed or rolled back before the associated XAConnection is closed. If a connection is participating in a global transaction that is not yet in a prepared state but is in a successful started or suspended state, the transaction will be rolled back during any crash recovery that may occur.
Blackfish SQL provides heuristicCompletion, an extended JDBC property that allows you to control the behavior when one or more databases fail during a two-phase commit. When XA transactions are prepared but not completed (no commit or rollback has been performed), the behavior is determined by one of these property settings:
The heuristic commit and rollback options allow for more efficient execution, because the locks can be released sooner and less information is written to the transaction log file.
NOTE: This feature is currently available only for Blackfish SQL for Java. It is not currently available in Blackfish SQL for Windows.
One of the most important areas of concern for any database application is eliminating single points of failure. The Blackfish SQL server provides a broad range of capabilities for making a database application fail-safe by avoiding application down time and loss of critical data. The High Availability server uses database mirroring technologies to ensure data availability in the event of software or hardware failure, and to provide a method of routine incremental backup. While a more general database replication scheme could provide similar protection, a mirroring approach provides advantages in terms of simplicity, ease of use, and performance.
A more general data replication solution could be employed to solve many of the same problems that the Blackfish SQL High Availability server addresses. Even though a more general solution would solve a broader variety of synchronization needs, it would do so at a much higher set of costs, including greater complexity and slower performance.
The Blackfish SQL database engine uses its transactional log files to maintain read-only mirror images of a database. The same TCP/IP database connections used for general database access are also used to synchronize mirrored databases.
These mirror types can be used by an application:
The primary mirror is the only mirror type that can accept both read and write operations to the database. Only one primary mirror is allowed at any time.
There can be any number of read-only mirrors. Connections to these databases can only perform read operations. Read-only mirrors provide a transactionally consistent view of the primary mirror database. However, a read-only mirror database might not reflect the most recent write operations against the primary mirror database. Read-only mirrors can be synchronized with changes to the primary mirror instantly, on a scheduled basis, or manually. Instant synchronization is required for automatic failover. Scheduled and manual synchronization can be used for incremental synchronization or backup.
Directory mirror databases mirror only the mirror configuration table and the tables needed for security definition. They do not mirror the actual application tables in the primary mirror.
There can be any number of directory mirrors. Connections to these databases can perform only read operations. The storage requirements for a directory mirror database are very small, since it contains only the mirror table and security tables. Directory mirrors redirect read-only connection requests to read-only mirrors, and writable connection requests to the primary mirror.
Blackfish SQL Engine failover handling capabilities include the following:
Blackfish SQL uses transaction log records to incrementally update mirrored databases. It transmits these log records to mirrors at high speed during synchronization operations. The same mechanism used for crash recovery and rollback is used to apply these changes. Existing code is used for all synchronization. The existing Blackfish SQL support for read-only transactions provides a transactionally consistent view of the mirrored data while the mirror is being synchronized with contents of more recent write transactions from the primary mirror.
When a primary mirror that is configured with two or more automatic failover mirrors fails, one of the read-only mirrors that is configured for automatic failover is promoted to be the primary mirror. The application can be affected in one of two ways:
Unlike automatic failover, manual failover is performed only on request. Any read-only mirror can become the primary mirror. This is useful when the computer the primary server is executing on needs to be taken off line for system maintenance.
The Blackfish SQL High Availability server provides a broad range of benefits, including:
NOTE: This feature is supported only for Blackfish SQL for Java.
The replication support provided by DataExpress for Blackfish SQL is easier to use and deploy than most replication solutions. This replication solution is also heterogeneous, because it uses JDBC for database access.
The replication topology provided is best described as a simple client-server relationship. Blackfish SQL does not require the server-side software or database triggers required for more complex publish-subscribe solutions. Complex multi-level hierarchies and network topologies are not directly supported.
There are three distinct phases in the replication cycle when using DataExpress JavaBean components with Blackfish SQL for a disconnected or mobile computing model:
A StorageDataSet provider implementation initially replicates database contents from the server into a client. The client is always a Blackfish SQL database. The server is typically some server that can be accessed with a JDBC driver. The JDBC provider uses either a SQL query or a stored procedure to provide the data that will be replicated in the client-side Blackfish SQL database. Since there is no server-side software running in this architecture, there is no support for incremental updates from the server to the client. If the client needs to be refreshed, the same SQL query or stored procedure used to provide the initial replication must be re-executed.
A StorageDataSet provider is a pluggable interface. QueryDataSet and ProcedureDataSet are extensions of StorageDataSet, which preconfigure JdbcProviders that can execute SQL queries and stored procedures to populate a StorageDataSet. For memory-constrained clients such as PDAs, a DataSetData component can be used to provide data. The DataSetData component uses Java Serialization to create a data packet that can be easily transmitted between a client and server.
The provide operation for a collection of database tables can be performed in a single transaction, so that a transactionally consistent view of a collection of tables can be replicated.
Once the provide phase is complete, both DataExpress and JDBC APIs can read and write to the Blackfish SQL tables that are replicating the database. All insert/update/delete write operations since the last provide operation are automatically tracked by the Blackfish SQL storage system. Part of the StorageDataSet store interface contract is that all insert/update/delete operations must be recorded if the StorageDataSet property resolvable is set.
DataExpress provides an automatic mechanism for using SQL DML or stored procedures to save all changes made on the client back to a server, via a JDBC driver. An optimistic concurrency approach is used to save the changes back to the server. One or more tables can be resolved in a single transaction. By default, any conflicts, such as two users updating the same row, cause the transaction to roll back. However, there is a SqlResolutionManager JavaBean component that you can use to customize the handling of resolution errors. The SqlResolutionManager has event handlers that enable an application to respond to an error condition with an ignore, retry, abort, log, or other appropriate response.
There are also higher-level DataStorePump and DataStoreSync components that you can use to perform provide and resolve operations for a collection of tables. For details, see Administering Blackfish SQL.
Copyright(C) 2009 Embarcadero Technologies, Inc. All Rights Reserved.