Deadlocks with Alias Files
by
, 21-Aug-2009 at 08:00 AM (3976 Views)
Help, my program hangs on Lock/Reread/Begin_Transaction! That, or some sort of variation is probably the single most revisited topic in the forums over all the years. Probably about 99% of all instances with any problem like this is due to deadlocks with alias files.
This is not related to SQL style deadlocks that are resolved by the server, but an inherent complication from alias files/tables that is mostly only seen with the embedded(DataFlex) database. It's when the application just totally freezes on a Lock, Reread or Begin_Transaction command, it stops responding and eventually gives up with a lock timeout error(if you wait long enough). We've all experienced this problem at one time or another, but exactly how and why this occurs is sometimes a bit of a mystery.
This kind of deadlock occurs because the application is attempting to place a lock on the same physical table multiple times, tripping over itself so to speak, without realizing it's actually the same table. The very definition of alias tables is referencing and opening the same physical table multiple times, with different logical names and table numbers. As far as the application is concerned it's two separate tables, it has no knowledge of the fact that the two tables are actually referring to the same physical table. Now we can see how the potential for this problem can exist.
The obvious solution to this problem is to avoid trying to lock the same physical table twice, afterall, the second lock is not needed since the application already has a lock on the table. This is accomplished by setting the DF_FILE_ALIAS attribute to DF_FILE_IS_ALIAS of all alias tables. The consequence of setting this attribute is similar to setting the file mode to DF_FILEMODE_NO_LOCKS, thus it will not lock the physical table. This means that when you issue a Lock, Reread or Begin_Transaction command, the master table is locked but the alias tables are not locked, thereby effectively avoiding locking the same physical table multiple times, hence no deadlock occurs.
The most common response from anyone who has experienced this problem is "but, there's no alias file involved in this transaction". The common mistake is not realizing that all tables that can be locked are locked on any Lock, Reread or Begin_Transaction command (note that this and most of this article only applies to the embedded(DataFlex) database, locks are handled differently for SQL databases for example). This is a very important and often overlooked consequence of these commands, no matter whether a table is modified or not, no matter whether it's conceptually part of the transaction or not, every logical table that is open and currently not in read-only mode or any other mode that disables locking, will be locked immediately. Thus if the DF_FILE_ALIAS mode is currently not set up correctly, you run the risk of an alias file deadlock, even if the alias table is not used in that particular piece of code.
The second most common response is "but, I have set DF_FILE_ALIAS mode in the corresponding .dd source files". The problem in this case usually occurs before the code for setting the alias attributes is executed. Often this kind of problem occurs in code executing very early in the program, such as handling a login dialog or similar. Sometimes it can be very tough to figure it out, for example, the table could be opened directly with an Open command without even including the .dd source file in the specified program (a common occurrence in reports for example). The bottom line is that somehow when the lock occurs, the alias table is opened but it's not (yet) configured properly, the tricky thing is tracking it down.
Usually the problem is easily rectified by adding the appropriate DF_FILE_ALIAS settings to all corresponding .dd source files, often it was missed in one place or another. Other times it's much easier to move the DF_FILE_ALIAS settings out to a different source file, and then USE'ing that source file very early in the program.
Another common question that comes up is why these problems usually don't occur within DD transactions. And the answer lies in an optimization that is part of the Data Dictionary transaction mechanism, referred to as smart table locking or smart filemode, where only the tables participating in the DD transaction are locked, by manipulating the filemode attributes.