CTS IN SQL

http://www.solidq.com/sqj/Pages/Relational/Microsoft-SQL-Server-2008-R2-Introducing-Change-Tracking-Part-1-of-3.aspx

CHANGE TRACKING SYSTEM
Microsoft SQL Server 2008 R2: Introducing Change Tracking - Part 1 of 3
​Keeping track of the INSERT, DELETE and UPDATE operations performed on data in one or more SQL Server tables is a regular activity in both the development and administration phases of many applications.
For a long time, the absence of a feature that would track changes automatically has forced developers to implement custom solutions which use triggers, timestamps and/or store the primary keys of these changed rows in ad hoc tables.
SQL Server 2008 has a new feature that greatly simplifies life for database developers: Change Tracking (CT).
With this solution, it is now possible to store the changes by using only few simple and effective functions. This makes both the processes and the applications that use them very fast and flexible.
The aim of this, the first of a series of three articles on Change Tracking, is to provide an overview and, through examples, to demonstrate how to enable and disable this feature, while also describing all the properties and objects involved in these activities.

Change Tracking: an overview
Change Tracking automatically detects the information related to all the INSERT, UPDATE and DELETE operations (also known as DML - Data Manipulation Language commands) performed on those tables on which it has been activated. These pieces of information are stored in internal system tables located in the tracked database, and are accessible through special functions provided by SQL Server. Furthermore, they are associated with the tracked tables through their primary key values, in orderto identify the updated, new or deleted rows (with the related information), or to visualize the whole of the changed rows through join operations.
Unlike Change Data Capture (more information on this technology can be found on Microsoft MSDN webpage, at http://msdn.microsoft.com/en-us/library/bb522489.aspx), Change Tracking does not store intermediate changes, nor indicate how many times these "evolve" over time. Instead, for each record, it only stores the information about the last change, overwriting it each time you perform a new DML operation on the same row (except in the case of SELECT). The overwriting operations occur synchronously; this means, in real time with respect to the completion of the tasks. This approach allows us to obtain simpler solutions than the ones that use Change Data Capture. Synchronous operation does not generate significant performance overhead; performance is generally affected little or not at all by Change Tracking.
Because of its potential, Change Tracking can be used in a number of different processes and applications, performing, in some circumstances, a key role. For example, this is the case with:
  1. - ETL (Extract, Transform, Load);
  2. - data replication solutions;
  3. - synchronization between SQL Server databases and applications (also non -SQL Server, in the case of one-way sync mode), both online and offline (Figure 1).
 Figure 1 diagram of a synchronization process based on the Change Tracking (from Microsoft MSDN site).png
Figure 1: diagram of a synchronization process based on the Change Tracking (from Microsoft MSDN site)
In fact, and it’s no accident, Change Tracking is used automatically and transparently by the Sync Framework: http://msdn.microsoft.com/en-us/sync/bb887608.
Generally speaking, Change Tracking is particularly suitable for those processes (or applications) that are executed regularly, since it can be used at the beginning of each run to identify and retrieve, in the various operations, only the data that have changed with respect to the previous execution. This means you do not need to transfer data every time and overwrite unchanged data to the destination. The example in Figure 1 may make it easier to understand the above scenario. By only including the new values in a process, we can make substantial savings in both execution time and resources.
The fact that it is fast and “light” is not the only advantage of using Change Tracking. Other benefits include:
  1. - development time is greatly reduced, as there is no longer any need to implement custom solutions
  2.   involving the use of temporary tables to store the deleted rows, triggers and timestamps; to record
  3.   and display all the necessary information, it’s sufficient to use the built-in functions and the internal
  4.   tables;
  5. - the structure of the database doesn't change, for the reasons given above;
  6. - the configuration and management of Change Tracking are standardized, thanks to the tools (like
  7.   SQL Server Management Studio) and the objects (views, functions, etc..) which are available in SQL
  8.   Server 2008;
  9. - there is an automatic and configurable mechanism for cleaning up the tracked changes; this, again,
  10.   avoids the need to  implement custom solutions, and contributes to reducing the development time;
  11. - there is an  overhead for DML operations, but it is lower than the one generated by any custom
  12.   solutions, especially when triggers are present;
  13. - through the use of transactions that, in background mode, intercept and handle DML operations; the
  14.   order in which the tracked changes are saved is determined by the COMMIT time. This feature
  15.   contributes to reducing development time by removing the need to, for example, add timestamps to
  16.   the tables;
  17. - in ETL processes, the execution time is reduced, because only the changed data are involved in these
  18.   activities (this type of loading is called incremental);
  19. - in ETL processes, you can transfer data at a higher rate as a result of the reduced loading time.
  20.   Thus, one can have data warehouses that are updated more frequently and more efficiently.

The sample database: CTExampleDB
All the scripts used as examples in the subsequent sections are based on a sample database created exclusively for this article: CTExampleDB. The code to implement it is shown below:
-- Creation of CTExampleDB database and setting of the Primary (.Mdf) and Log
--(.Ldf) files
CREATE DATABASE CTExampleDB
ON PRIMARY (
      NAME        = 'CTExampleDB_mdf'
,     FILENAME    = 'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\CTExampleDB_Sys.mdf'
,     SIZE        = 10MB
,     MAXSIZE     = 50MB
,     FILEGROWTH  = 10MB
)
LOG ON (
      NAME        = 'CTExampleDB_log'
,     FILENAME    = 'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\CTExampleDB_Log.ldf'
,     SIZE        = 10MB
,     MAXSIZE     = 500MB
,     FILEGROWTH  = 100MB
);

-- Adding USERTABLES Filegroup
ALTER DATABASE    CTExampleDB
ADD FILEGROUP     USERTABLES;

-- Adding file for the example tables
ALTER DATABASE    CTExampleDB
ADD FILE (
      NAME        = 'CTExampleDB_userTables'
,     FILENAME    = 'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\CTExampleDB_userTables.ndf'
,     SIZE        = 10MB
,     MAXSIZE     = 100MB
,     FILEGROWTH  = 10MB
)
TO FILEGROUP USERTABLES;

We create the Employees table in the database, on which Change Tracking will be enabled:
USE [CTExampleDB];
-- Creation of the table dbo.Employees and definition of its primary key,
-- essential for the CT in order to work
CREATE TABLE dbo.Employees (
      EmployeeID  int               NOT NULL
,     LastName    nvarchar(20)      NOT NULL
,     FirstName   nvarchar(10)      NOT NULL

,     CONSTRAINT  PK_Employees
      PRIMARY KEY (EmployeeID)
      ON USERTABLES
) ON USERTABLES

Enabling Change Tracking
Activating Change Tracking takes place in two phases, in this precise order:
  1. 1. Enabling it at the database level
  2. 2. Enabling it on each table for which you want to track changes
If you reverse these two steps, you will receive the following error message:
Change tracking must be enabled on database 'CTExampleDB' before it can be enabled on table 'Employees'.

Enabling at the Database Level
The enabling at database level can be achieved in two ways:
  1. 1. By using the ALTER DATABASE statement
  2. 2. By using the Database Properties dialog box (on Change Tracking page)
Enabling via the ALTER DATABASE statement
Execute the following code:
ALTER DATABASE CTExampleDB
 SET CHANGE_TRACKING = ON      (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
Through the CHANGE_RETENTION and AUTO_CLEANUP options (both optional), you can set the minimum amount of time for which the information on the changes will be stored in the database before it is automatically deleted.
In detail:
  1. • AUTO_CLEANUP: if set to ON (default), the mechanism to auto clean the information will be
  2.    activated; otherwise, the information will always be available, unless the cleanup is set to ON at a
  3.    later time or  Change Tracking is disabled;
  4. • CHANGE_RETENTION: this indicates the minimum amount of time that the information on the
  5.    changes will be available (it is expressed in days, hours or minutes). For example, if you set
  6.    CHANGE_RETENTION = 2 DAYS (corresponding to the default value) the information will be kept
  7.    for at least two days and not, as you might expect, for two days exactly. At the end of the second
  8.    day, the system will take into account the availability of resources, and then determine whether or
  9.    not to delete the information. You can specify days, hours and minutes, respectively, using the
  10.    keywords DAYS, HOURS and MINUTES.
  1.    For example:
  2. CHANGE_RETENTION = 2 DAYS (2 days)
    CHANGE_RETENTION = 30 HOURS (30 hours)
    CHANGE_RETENTION = 25 MINUTES (25 minutes)
  1.    You can assign a value between 1 and 2,147,483,647 minutes to these units of time. This means
  2.    that:
    1. • DAYS accepts a value between 1 and 1491308;
    2. • HOURS accepts a value between 1 and 35791394;
    3. • MINUTES accepts a value between 1 and 2147483647.
  1.    If you go outside these ranges, you will receive the following error message:
  2. The value for change tracking option 'change_retention' is not valid. The value must be between 1 and 2147483647 minutes. (Microsoft SQL Server, Error: 5092)
  1.    The amount of time set by this option takes effect only if the option AUTO_CLEANUP is set to ON; if
  2.    AUTO_CLEANUP = OFF, it will completely ignore any attempt to set it (as if it had never been
  3.    declared!). This behavior, however, does not produce any error messages, warnings or anomalies.
As previously mentioned, the options AUTO_CLEANUP and CHANGE_RETENTION are not mandatory; therefore, to enable Change Tracking, it is enough to simply execute the following command:
ALTER DATABASE CTExampleDB
SET CHANGE_TRACKING = ON;
In this case, the default values (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON) will be assigned to the options.
Finally, a very important consideration: to determine the exact retention period for the changes, we must take into account how frequently the processes or applications that use Change Tracking are executed. The specified retention period must not be less than the maximum amount of elapsed time between each execution. Otherwise, information will be lost before the next "launch", and the results produced will be erroneous.
Enabling via Dialog Box
Another alternative is to enable Change Tracking through SQL Server Management Studio:
  1. 1. Go to the Object Explorer window, expand the Databases folder and right-click on the desired
  2.     database; then select Properties;
  3. 2. Choose the Change Tracking page. Now, in the dialog box, you can configure the following options
  4.     (Figure 2):
    1. • Change Tracking: if set to True, Change Tracking is enabled; the default value is False;
    2. • Retention Period: indicates how long, at a minimum, the information on changes will remain
    3.    available in the database. The default value is 2. If the Auto Cleanup option is set to False,
    4.    the Retention Period, even if fixed, will be completely ignored. The acceptable values for
    5.    hours, days and minutes are the same as those allowed for enabling through the ALTER
    6.    DATABASE command;
    7. • Retention Period Units: indicates the time unit to be associated with the specified values in
    8.    the Retention Period option. You can choose between Days, Hours and Minutes. The default
    9.    value is Days;
    10. • Auto Cleanup: If set to True (default), indicates that the change tracking information will be
    11.    automatically removed at the end of the Retention Period. Every time you re-assign the value
    12.    True, the Retention Period and Retention Period Units options are automatically reset back to
    13.    their default values (respectively 2 and Days).
 Figure 2 Dialog box to activate Change Tracking at database level.png
Figure 2: Dialog box to activate Change Tracking at database level
Enabling Change Tracking on tables
It can be enabled in two ways:
  1. 1. By using the ALTER TABLE statement
  2. 2. Through the Table Properties Dialog Box (Change Tracking page)
Remember that to enable Change Tracking on a table, the primary key must have been defined. Otherwise, you will receive the following error message:
Cannot enable change tracking on table 'Employees'. Change tracking requires a primary key on the table. Create a primary key on the table before enabling change tracking.
Enabling via the ALTER DATABASE statement
To accomplish this task, just run the following code:
ALTER TABLE dbo.Employees
ENABLE CHANGE_TRACKING
      WITH (TRACK_COLUMNS_UPDATED = ON);

TRACK_COLUMNS_UPDATED property is optional. If set to ON, it is possible to find out which individual columns will be affected by each UPDATE operation: if omitted, its value is OFF (the default); be careful because, once Change Tracking is enabled, you cannot change the value of this option. If you do wish to change it, you must disable CT on the table and repeat the enabling procedure.
Furthermore, by enabling this feature, you may experience a slight drop in performance levels.
If you want to enable Change Tracking while omitting the TRACK_COLUMNS_UPDATED option, just run the following script:
ALTER TABLE dbo.Employees
ENABLE CHANGE_TRACKING;

Enabling via Dialog Box
This operation can be performed by SQL Server Management Studio:
  1. 1. Go to the Object Explorer window, expand the Databases folder, then the relevant database (in this
  2.     case CTExampleDB) and the Tables subfolder; at this point, right-click on the desired table
  3.     (dbo.Employees) and select Properties;
  4. 2. Choose Change Tracking page. Through the dialog box you can configure the following options
  5.     (Figure 3):
    1. • Change Tracking: the default value is False; if set to True, Change Tracking will be enabled on
    2.    the table;
    3. • Track Columns Updated: If set to True, further information will be stored that allows you to
    4.    know which columns are changed after each UPDATE. The default value is False.
 Figure 3 Dialog box to activate Change Tracking on the table.png
Figure 3: Dialog box to activate Change Tracking on the table
Where is the Change Tracking enabled? The System Views
 
The system views sys.change_tracking_databases and sys.change_tracking_tables are present in each user database and let you know where Change Tracking has been enabled.
The sys.change_tracking_databases view (Figure 4) provides a list of all the databases of the SQL Server instance which have CT enabled. Each row of this view corresponds to one database and consists of the following fields (their data types are indicated in brackets):
  1. • database_id (int): ID of the database;
  2. • is_auto_cleanup_on (bit): indicates whether the auto cleanup functionality is enabled (1) or
  3.    disabled (0);
  4. • retention_period (int): indicates how many days, hours or minutes, at a minimum, the information
  5.    on changes will be stored in the database;
  6. • retention_period_units (tinyint): indicates the time unit associated with the retention_period
  7.    field: 1 (minutes), 2 (hours), 3 (days);
  8. • retention_period_units_desc (nvarchar (60)): contains the description of the selected time unit
  9.    (retention_period_units): MINUTES, HOURS, DAYS;
  10. • max_cleanup_version (bigint): this column is not documented; for internal use at Microsoft.
 Figure 4 Content of the sys change_tracking_databases system view.png
Figure 4: Content of the sys.change_tracking_databases system view; the row having database_id equal to 19 contains the information for the CTExampleDB database.
The sys.change_tracking_tables view (Figure 5) provides a list of all the tables in the database which have CT enabled. Each line corresponds to one table and contains the following information (the data type of each field is given in brackets):
  1. • object_id (int): ID of the table having a change register; notice that the change register can be
  2.    present even if Change Tracking is disabled;
  3. • is_track_columns_updated_on (bit): indicates whether the detection of changed columns after
  4.    each UPDATE is enabled (1) or disabled (0);
  5. • begin_version (bigint): a counter stores the total number of changes made on the tracked tables of
  6.    the database. This value corresponds to the Change Tracking version. The purpose of this field is to
  7.    indicate the current CT version at the moment of its activation on a table. Notice that this value is
  8.    reset each time you run a TRUNCATE TABLE;
  9. • cleanup_version (bigint): indicates which version of the CT auto cleanup may have removed the
  10.    information on the changes. This information may, therefore, no longer be accessible;
  11. • min_valid_version (bigint): specifies the version from which information on changes is available to
  12.    this table:
Figure 5 Content of the sys change_tracking_tables system view.png 
Figure 5: Content of the sys.change_tracking_tables system view; the single row shown refers to the
dbo.Employees table (object_id 2105058535)
The Internal Table: where the information is “hidden”
For each table on which you activate Change Tracking, a new internal table is created automatically in the database that holds the table: this is the place where all the information on the changes will be saved. The internal tables are hidden and their content is accessible via the CT functions. Their existence can be verified by querying the sys.internal_tables system view: as shown in Figure 6, the tables in question are those identified by the name change_tracking_ [object_id of the tracked table] and by the values ‘CHANGE_TRACKING' and 209, which are contained, respectively, in the internal_type_desc and internal_type fields.
In our example, the object_id of the dbo.Employees table is 2105058535 (reported in the parent_object_id field, see figure 6); the internal table name will therefore be called change_tracking_2105058535. Each internal table of Change Tracking type is composed of the following fields:
  1. • sys_change_xdes_id: ID of the transaction that changed the rows in the table;
  2. • sys_change_xdes_id_seq: number that identifies the sequence of internal operations with respect
  3.    to the transaction;
  4. • SYS_CHANGE_OPERATION: specifies the type of the operation performed: I (INSERT), U (UPDATE), D
  5.    (DELETE);
 
 Figure 6 some fields in  the sys internal_tables system view.png
Figure 6: some fields in  the sys.internal_tables system view; the data related to the internal table for the dbo.Employees table is in red
  1. • sys_change_columns: If the Track Columns Updated option is enabled, for each UPDATE operation
  2.    performed, the field will contain a reference indicating which column has changed. In case of
  3.    INSERT and DELETE, the value is NULL;
  4. • sys_change_context: if DML operations are performed using the WITH CHANGE_TRACKING_CONTEXT
  5.    function, this field will contain a reference to the context in which it occurred; otherwise, its value
  6.    is NULL;
  7. • k_ [key column name] _ [ordinal position in key]: each column that makes up the primary key of
  8.    the tracked table has one field of this type associated with it; [key column name] indicates the
  9.    column name of the primary key, while [ordinal position in key] indicates the ordinal position that it
  10.    occupies in the primary key. In our example, the key for the dbo.Employees table is the field
  11.    EmployeeID and it takes the first position in this key (it cannot be otherwise, because it is the only
  12.    field making up the key). The internal table change_tracking_2105058535 will have, therefore, only
  13.    one field of this type, called k_EmployeeID_00000001 (Figure 7).
NOTE: because this is an internal table, the content of the change_tracking_2105058535 table is visible only through a DAC connection (see figure 7). For more information go to: http://msdn.microsoft.com/en-us/library/ms178068.aspx). This situation, however, is not inherent to Change Tracking, and it shouldn’t be considered as a solution to meet every kind of need arising from Change Tracking.
Figure 7 contents of internal table change_tracking_2105058535.png 
Figure 7: contents of internal table change_tracking_2105058535
Disabling Change Tracking
Disabling at Database Level
To disable Change Tracking at the database level, it is necessary to firstly deactivate it from all the tables on which it is active.
Otherwise, SQL Server will return the following error message:
Change tracking is enabled for one or more tables in database 'CTExampleDB'. Disable change tracking on each table before disabling it for the database. Use the sys.change_tracking_tables catalog view to obtain a list of tables for which change tracking is enabled.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.
As with enabling, this task can be accomplished either by the ALTER DATABASE statement or the Database Properties Dialog Box (page Change Tracking).
Disabling via the command ALTER DATABASE
You can do this by executing the following code:
ALTER DATABASE CTExampleDB
SET CHANGE_TRACKING = OFF

Disabling via the Dialog Box
From SQL Server Management Studio:
  1. 1. Go to the Object Explorer window, expand the Databases folder, right-click on the desired database
  2.     and select Properties;
  3. 2. Choose the Change Tracking page and, for the option Change tracking, select False.
Disabling from the tables
Both the ALTER DATABASE command and the Database Properties Dialog Box (Change Tracking page) can be used to disable CT from the tables.
Disabling via the command ALTER DATABASE
Just run the following code:
ALTER TABLE dbo.Employees
DISABLE CHANGE_TRACKING

Comments