SQL Server as Source - DSS 6 | Data Source Solutions Documentation

Documentation: SQL Server as Source - DSS 6 | Data Source Solutions Documentation

SQL Server as Source

Capture

Data Source Solutions DSS allows you to Capture changes from SQL Server database. This section describes the configuration requirements for capturing changes from SQL Server location. For the list of supported SQL Server versions from which DSS can capture changes, see Capture changes from location in Capabilities.

DSS uses the Microsoft ODBC Driver for SQL Server to capture changes from an SQL Server location.

Table Types

DSS supports capture from the following table types in SQL Server:

  • clustered (row/page compressed and uncompressed)
  • heap (row/page compressed and uncompressed)

DSS does not support capture from memory optimized tables and trigger-based capture from temporal tables.

Sparse Columns

DSS supports capture from SQL Server sparse columns starting from version 6.2.5/7. While capture from sparse columns may appear to work in earlier DSS versions, it can result in data integrity issues.

To ensure data integrity when replicating tables with sparse columns, you must perform an index rebuild if any columns have been dropped or altered since the last index rebuild.

DSS will prompt for an index rebuild if it detects that sparse column IDs cannot be reliably mapped. This happens when either of the following has occurred since the last index rebuild:

  • Columns were dropped.
  • Columns were altered in a way that causes SQL Server to internally drop and recreate them (e.g., changing column nullability).

If you are upgrading the DSS Hub and DSS Agent from a version earlier than 6.2.5/7, you must perform Activate Replication with the Table Enrollment option selected, or run dssactivate with the -oe option.

Capture Methods

DSS allows the following methods for capturing (Capture) changes from SQL Server:

  • Direct Transaction Log Access
    Capture changes directly from SQL Server's logging files. This capture method is supported on:

    • The primary database server with a connection to the active database.
    • A secondary node of an AlwaysOn cluster connected to the local read-only database.
  • Archive Only
    Capture changes directly from SQL Server's transaction log backup files (similar to the Direct Transaction Log Access method). This capture method is supported on any server with access to the log backups.

  • SQL Access
    Capture changes over an SQL connection. This capture method is supported on the primary active database or on a read-only secondary database of an AlwaysOn cluster.

  • Database Triggers <b>Since</b> v6.1.0/3
    Capture changes from tables using database triggers.

The trigger-based capture method (Capture_Method=DB_TRIGGER) has been deprecated since 6.2.0/0.

Grants for Log-Based Capture

This section lists the grants/permissions required for capturing changes from SQL Server database. Based on your requirement, either of the permissions listed in this section can be used.

For simplicity, we have categorized the required grants into the following three permission models:

  • SysAdmin: The easiest to configure and offers broad permissions and authority over the entire SQL Server instance. While easy to set up, it carries potential security risks.
  • DbOwner: Provides extensive permissions but is restricted to a specific database. It requires more setup effort than the SysAdmin model.
  • Minimal: Requires the least permissions and does not need sysadmin or db_owner rights. However, it is the most complex to configure.
  • Until version 6.1.0/12, only the SysAdmin model is supported for the Direct Transaction Log Access capture method. Since version 6.1.0/13, all three permission models are supported for the Direct Transaction Log Access capture method.
  • All three permission models are supported for the Archive Only and SQL Access capture methods.
  • The DbOwner and Minimal models are only available for SQL Server 2012 and above. For the older versions of SQL Server, the SysAdmin model must be used.

SysAdmin

In this permission model, the DSS database user must be granted sysadmin role.

DbOwner

In this permission model, the DSS database user must be granted the db_owner role. However, a user with sysadmin privileges is required for certain configurations detailed in this section, such as executing specific SQL statements, creating procedures, or granting special permissions to the DSS database user.

Configure the DbOwner permission model

DSS must already be installed before performing the following steps.

  • Create a special 'wrapper' SQL procedure:

This step is needed once when DSS is installed for an SQL Server instance. But if Always On AG is installed, then this step is needed on each Always On AG node.

A user with **sysadmin** privilege must create special 'wrapper' SQL procedures called **sp_dss_dblog** (replaced with **sp_dss_dblog2** in `v6.1.5/5` and later), **sp_dss_dbtable**, and **sp_dss_loginfo**, to enable DSS to call the SQL Server's read-only function **fn_dump_dblog**. This must be done inside the SQL Server database's special database **msdb**, not the actual capture database. The SQL query to create these procedures is available in the file **dsscapsysadmin.sql** in the **DSS_HOME\\dbms\\sqlserver** directory. The DSS database user must then be allowed to execute this procedure. For this, the DSS database user must be added to the special **msdb** database and the following grants must be provided:

<div class="callout callout-important">

Replace sp_dss_dblog with sp_dss_dblog2 for v6.1.5/5 and later.

```shell
USE msdb;
CREATE USER <em>username</em> FOR LOGIN <em>username</em>;
GRANT EXECUTE ON sp_dss_dblog TO <em>username</em>;
GRANT EXECUTE ON sp_dss_dbtable TO <em>username</em>;
GRANT EXECUTE ON sp_dss_loginfo TO <em>username</em>;
```

This step is needed once when DSS is installed for an SQL Server instance. But if Always On AG is installed, then this step is needed on each Always On AG node.

A **sysadmin** user must grant the DSS user login a special read-only privilege in the **master** database.

```shell
USE master;
GRANT VIEW SERVER STATE TO <em>username</em>;
```

In SQL Server 2022, the VIEW SERVER STATE permission was split into two permissions: VIEW SERVER PERFORMANCE STATE and VIEW SERVER SECURITY STATE. When capturing from SQL Server 2022, the user only needs VIEW SERVER PERFORMANCE STATE.


Minimal

In this permission model, the DSS database user is not granted, nor does it require, sysadmin or db_owner roles at runtime. However, a user with sysadmin or db_owner privileges is required for certain configurations detailed in this section, such as executing the required SQL statements, creating procedures, and granting the necessary permissions to the DSS database user.

Configure the Minimal permission model

DSS must already be installed before performing the following steps.

This step is needed once when DSS is installed for an SQL Server instance. But if Always On AG is installed, then this step is needed on each Always On AG node.

A user with **sysadmin** privilege must create special 'wrapper' SQL procedures called **sp_dss_dblog** (replaced with **sp_dss_dblog2** in `v6.1.5/5` and later), **sp_dss_dbtable**, and **sp_dss_loginfo** to enable DSS to call the SQL Server's read-only function **fn_dump_dblog**. This must be done inside the SQL Server database's special database **msdb**, not the actual capture database. The SQL query to create these procedures is available in the file **dsscapsysadmin.sql** in the **DSS_HOME\\dbms\\sqlserver** directory. The DSS database user must then be allowed to execute this procedure. For this, the DSS database user must be added to the special **msdb** database and the following grants must be provided:

<div class="callout callout-important">

Replace sp_dss_dblog with sp_dss_dblog2 for v6.1.5/5 and later.

```shell
USE msdb;
CREATE USER <em>username</em> FOR LOGIN <em>username</em>;
GRANT EXECUTE ON sp_dss_dblog TO <em>username</em>;
GRANT EXECUTE ON sp_dss_dbtable TO <em>username</em>;
GRANT EXECUTE ON sp_dss_loginfo TO <em>username</em>;
```

This step is needed once when DSS is installed for an SQL Server instance. But if Always On AG is installed, then this step is needed on each Always On AG node.

A **sysadmin** user must grant the DSS user login a special read-only privilege in the **master** database.

```shell
USE master;
GRANT VIEW SERVER STATE TO <em>username</em>;
```

In SQL Server 2022, the VIEW SERVER STATE permission was split into two permissions: VIEW SERVER PERFORMANCE STATE and VIEW SERVER SECURITY STATE. When capturing from SQL Server 2022, the user only needs VIEW SERVER PERFORMANCE STATE.

This step is needed once when each new source database is being set up.

A user with **db_owner** (or **sysadmin**) privilege must create 'wrapper' SQL procedures in each capture database so that DSS can call the SQL Server's read-only procedures **sp_helppublication**, **sp_helparticle**, and **fn_dblog**. The SQL query to create these five read-only procedures is available in the file **dsscapdbowner.sql** in the **DSS_HOME\\dbms\\sqlserver** directory. The DSS database user must then be allowed to execute these procedures.

The following grants must be granted to the DSS database user inside each capture database:

 <div class="callout callout-important">

Replace sp_dss_dblog with sp_dss_dblog2 for v6.1.5/5 and later.

```shell
USE <em>source_database</em>;
GRANT EXECUTE ON sp_dss_check_publication TO <em>username</em>;
GRANT EXECUTE ON sp_dss_check_article TO <em>username</em>;
GRANT EXECUTE ON sp_dss_dblog TO <em>username</em>;
GRANT EXECUTE ON sp_dss_repldone TO <em>username</em>;
GRANT EXECUTE ON sp_dss_repltrans TO <em>username</em>;
```

This step is needed once when each new source database is being set up.

A user with **db_owner** (or **sysadmin**) privilege must grant the DSS database user a read-only privilege:

```shell
USE <em>source_database</em>;
ALTER ROLE db_datareader ADD MEMBER <em>username</em>;
```

Supplemental Logging

Supplemental logging is required to ensure table updates can be replicated correctly using SQL statements on a target database.

Transaction log files are generally used for database recovery and normally SQL Server only logs data that is needed for recovery. For example, full row image (all columns) are usually not written into the transaction log but only the fragments that are actually changed are written. For DSS to perform log-based capture, SQL Server needs to write full row images as well as write some additional information into the transaction log, which is referred to as "Supplemental Logging" in DSS.

SQL Server does not allow enabling supplemental logging directly. However, you can enable it by using the native features of SQL Server such as 'CDC tables' or 'replication articles'. To do log-based capture, DSS requires either a CDC table or a replication article to be created for each source table.

When performing Activate Replication (dssactivate -ol), if option Supplemental Logging is selected/enabled, DSS will check whether supplemental logging is enabled in the database. If supplemental logging is not enabled in the database, then DSS will create either a CDC table or a replication article depending on the option selected/defined in the location property Supplemental_Logging.

The location property Supplemental_Logging allows you to select what action should be performed to enable supplemental logging for tables.

When supplemental logging is enabled and if a user attempts DDL such as

Available options for SUPPLEMENTAL LOGGING are:

If existing articles are used and there are no subscribers while the @allow_anonymous property is set to false, replication may not work.

Configuration/Installation Required for Using Articles for Supplemental Logging

To use articles for supplemental logging, the following steps must be performed (once when DSS is installed for an SQL Server instance).

  1. Install SQL Server Replication Components:
    For log-based capture from the SQL Server Enterprise Edition or Developer Edition, if articles are used (see above), DSS requires that the SQL Server Replication Components are installed.
  2. Create a distribution database:
    • If articles are used (see above), a user with sysadmin privilege must create a distribution database for the SQL Server instance, unless one already exists. To do this, a user with sysadmin privilege should run SQL Server wizard ConfigureDistributionWizard, which can be run by clicking Replication Configure Distribution... Any database name can be supplied (click Next Next Next).
    • If Always On AG is installed and articles are used, then only one distribution database should be configured. Either this can be set up inside the first node and the other nodes get a distributor that points to it, or the distribution database can be located outside the Always On AG cluster and each node gets a distributor that points to it there.

Dropping the Source Database

Depending on the location property SUPPLEMENTAL LOGGING (Supplemental_Logging) defined, DSS will use some of SQL Server's own 'replication components' or it will use SQL Server's Change Data Capture (CDC) feature. Based on this, DSS may enable the 'publish' replication option for the source database. When this replication database option is enabled, attempts to drop the database will give an SQL Server error.

When performing Deactivate Replication (dssactivate -d -ol), if the option Supplemental Logging is selected/enabled, DSS will check if there are any replication articles remaining in the database. If none are remaining, it will also disable the 'publish' replication database option (in SQL Server database). The database can then be dropped.

If articles are used for supplemental logging to drop the database immediately (without running Deactivate Replication first), the sysadmin must run the following SQL statement:

EXEC sp_replicationdboption '<em>source_database</em>', 'publish', 'false';

Log Truncater

The LOG TRUNCATER option in the DSS UI enables DSS to manage SQL Server transaction log truncation. This option specifies the method used to advance the SQL Server transaction log truncation point (i.e., truncate the log) and provides seamless, automated log management. You can select this option when creating a new SQL Server location or by editing the existing location's source and target properties.

For more information, see our FAQ page How to Manage Truncation Points in SQL Server.

The equivalent location property for this option is Log_Truncater.

The following are the supported log truncater methods:

CAP_JOB

This method is used to indicate that the capture job regularly calls sp_repldone to unconditionally release the hold of the truncation point for replication. When this option is selected and Activate Replication (dssactivate) is run, depending on the value of the location property SUPPLEMENTAL LOGGING, DSS will also drop/disable the SQL Server agent jobs that are created when CDC tables are created through the CDC stored procedures and the agent jobs related to data distribution. As a result, the additional impact of auxiliary database objects to enable supplemental logging is minimized. For example, the CDC tables are not populated (and the space allocation increased) because the agent job to do this is dropped. Multiple Capture jobs can be set up on a single database with option CAP_JOB selected. However, note that if no Capture job is running with the CDC tables and/or Articles in place, the transaction log will grow because the truncation point for replication is not released. Do not set this option if there is another data replication solution or the database uses CDC tables.

CAP_JOB_RETAIN

This method should be used when capturing from a SQL Server database with the recovery mode set to Simple Recovery. The capture job moves the truncation point of the transaction log forward by calling the stored procedure sp_repldone at the end of each sub-cycle. Only part of the transaction log that has already been processed (captured) is marked for truncation (this is different from the CAP_JOB mode, where all records in the transaction log are marked for truncation, including those that have not been captured yet). This value is not compatible with multi-capture and does not allow for coexistence with a third party replication solution. This setting will also result in SQL Server's agent jobs being dropped/disabled, so the transaction log will grow when the capture job is not running and CDC tables and/or Articles are still in place. Do not set this option if another data replication solution is in place or CDC tables are used in the database.

LOGRELEASE_TASK

This method should be used if a separate job/task is created to release the truncation point for replication. For example, schedule a separate SQL Server Agent job to unconditionally call sp_repldone at an interval. Choosing the option LOGRELEASE_TASK will also result in SQL Server's agent jobs being dropped/disabled. However, as long as the scheduled log release task runs, the truncation point for replication is released, even if the capture job(s) is(are) not running. This option should only be used in conjunction with another replication or CDC solution if the log release task that is scheduled is aware of the requirements of the other solution.

NATIVE_DBMS_AGENT

This method should be used if native replication and/or CDC tables are used on the database. With this option, DSS will not drop/disable the native SQL Server agent jobs that are created when CDC tables and/or Articles are created. DSS will also not interfere with the release of the truncation point for replication. If CDC tables are used to enable supplemental logging it may cause I/O overhead (SQL Server jobs copy each change to a CDC table, which no-one uses).

Change Archive Directory

When using the Direct Transaction Log Access and SQL Access capture method, you can define the directory for transaction log backup files. Normally, DSS locates the transaction log backup files by querying the backup history table in the msdb database. Defining this parameter tells DSS to search for the log backup files in the directory specified in this property/field. It is recommended to define this parameter when using Always On Availability Groups.

Backup Directory

DSS will search for the log archives in the directory specified in this field/property.

When ARCHIVE WATCH DIRECTORY (Archive_Log_Path) is defined, it is mandatory to define FILE NAME FORMAT (Archive_Log_Format).

This field/property describes the filename format (template) of the transaction log archive files stored in the ARCHIVE WATCH DIRECTORY (Archive_Log_Path).

This parameter has no default, so it must be defined if ARCHIVE WATCH DIRECTORY (Archive_Log_Path) is defined.

This field/property accepts the following format variables:

Capturing from SQL Server Always On availability groups

DSS allows you to capture from SQL Server Always On availability groups (AG) - a technology that provides High-Availability and Disaster-Recovery solutions in SQL Server.

All capture methods are supported for capturing changes from SQL Server Always On AG.

DSS does not support capturing from secondary nodes when using SQL Server Standard Edition because secondary nodes in this edition are non-readable.

Requirements for Capturing from Always On AG secondary

When you activate replication against a read-only secondary node, DSS cannot enable supplemental logging (create CDC tables) because this can only be done when connected to a primary node. In this case, DSS will generate a script named supp_log_add.sql on the source machine. This script contains commands to enable supplemental logging on the source database. The script must be executed manually on the primary node by a user with sysadmin or db_owner privileges.

Prior to version 6.1.0/30, DSS did not automatically create the supp_log_add.sql script for the above scenario. Hence, for log-based setup, we recommend activating replication using the Minimal permission model to generate the script. This will make it easy to manage supplemental logging.

The CDC tables will not be used by Microsoft or contain any data. The only reason they are created is so that Microsoft could log the primary key during updates. This will not cause any additional overhead on the primary node nor is it similar to running Microsoft's native CDC replication.

If you are not using CDC tables or any other type of replication, then you will need to create a job on the primary node to call the sp_repldone procedure on a regular basis. The sp_repldone procedure will move the truncation point within the transaction log so that the log does not continue to grow. This can be executed every time you perform the transaction log backup.

Configuration steps
  1. Create a channel and location to connect to the SQL Server Always On AG standby node.

  2. Configure location property Log_Truncater to be set to LOGRELEASE_TASK.

  3. Activate replication in the channel with SQL Server Always On AG. The activation will fail and produce a script called supp_log_add.sql stored in the DSS_CONFIG\hubs\hub_name\channels\channel_name\locs\location_name\initsql directory.

  4. Manually execute the supp_log_add.sql script on the primary node. Initially, a user with sysadmin privilege must run the script. Subsequently, a user with db_owner privilege can perform this action.

  5. Create a job to run the sp_repldone procedure to move the truncation point.

    EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time = 0, @reset = 1.
    

The sp_repldone procedure does not truncate anything, the backup does that. The sp_repldone procedure just moves the truncation point.

  1. Activate replication with option Supplemental Logging selected. The activation should complete without any errors.

Configuring Failover for Connections to SQL Server Always On AG

If DSS is connecting using DSS Agent to a database inside an SQL Server Always On AG, the following is required:

  1. Create an DSS Agent Listener on each node.
  2. Inside Failover Cluster Manager, configure a Role with a static IP address that controls the DSS Agent Listener service on all nodes.
    1. Run Configure Role Wizard
    2. Click Next Generic Service DSS Agent Listener Name Next Next Next Finish.
    3. To configure a static IP address, click on Resources Name IP address then change it to an available static address.
  3. Configure a Group Listener inside the Availability Group.
    1. Start SQL Server Management Studio on the primary node.
    2. Click AlwaysOn High Availability Availability Groups Name Availability Group Listeners Add Listener ….
  4. For Always On AG inside Azure,
    1. Configure an Internal or External Azure load balancer for the DSS Agent Listener using Powershell. Then attach each Azure node to this load balancer.
    2. In DSS UI, enter the IP address of the Azure load balancer into the Agent Host field available in the Agent Connection section while creating a location or editing an existing location's agent connection properties.

DSS can now connect to Node as configured in step 1 or step 4 and Server as configured in Step 3.

Configuring Backup Mode and Transaction Archive Retention

DSS log-based capture requires that the source database is in Full recovery model and a full database backup has been done since this was enabled. Normally DSS reads changes from the 'online' transaction log file, but if DSS is interrupted (say for 2 hours) then it must be able to read from transaction log backup files to capture the older changes. DSS is not interested in full or incremental backups; it only reads transaction log backup files.

Transaction log (archive) retention: If a backup process has already moved these files to tape and deleted them, then DSS capture will give an error and a Refresh will be needed before replication can be restarted. The amount of 'retention' needed (in hours or days) depends on organization factors (how real-time must it be?) and practical issues (does a refresh take 1 hour or 24 hours?).

DSS normally locates the transaction log backup files by querying the backup history tables in the msdb database. But if Always On AG is configured then this information source is not available on all nodes. So when DSS is used with Always On AG, the transaction log backups must be made on a directory which is both accessible from all Always On AG nodes and also from the machine where the DSS capture process is running (if this is different) via the same path name. DSS should be configured to find these files by defining the following location properties ARCHIVE WATCH DIRECTORY (Archive_Log_Path) and FILENAME FORMAT (Archive_Log_Format).

Capturing from SAP Source

DSS allows you to capture changes from a SQL Server database which is used by an SAP ECC system. To enable capture using SAP dictionary, the location property SAP Source (SAP_Source_Schema) must be defined while creating a location or by editing the existing location's source and target properties. Then while adding tables to a channel, the Table Selection dialog will display the SAP tables defined in the SAP dictionaries.

The SAP Unpack license is required for adding the pool, cluster, and long text (STXL) tables to a channel using the Table Selection dialog.

When SAP pool, cluster, and long text (STXL) tables are added to a channel using the Table Selection dialog, the following actions are automatically defined:

Irrespective of the number of tables, only a single action Transform is defined.

SAP columns are non-nullable by default. They will be described as nullable in the hub's repository and thus as nullable in the target. This is valid for the non-key columns. Key columns will remain non-nullable.

SAP Data Types Conversion

<b>Since</b> v6.1.0/7

This option enables conversion/mapping of SAP specific data types (available in SAP dictionary meta-data) in source location to corresponding data type in the target location. The SAP specific data type will be localized with the source DB's data type and then mapped to DSS's Repository data type. For example, if you have an SAP system on SQL Server, the DATS data type will be localized as SQL Server's Date type, and then it is mapped to DSS Repository type ansidate.

This feature is supported for the following SAP specific data types:

If the SAP Data Types Conversion option is NOT selected, SAP specific data types are mapped to various other DSS Repositry data types. For more information, see Data Type Mapping for SAP NetWeaver (or SAP dictionary).

If the SAP Data Types Conversion option is selected during location creation, DSS will automatically define action TableProperties with parameters CoerceErrorPolicy and CoerceErrorType.

If the SAP Source (SAP_Source_Schema) location property/option is selected during location creation, by default, the SAP Data Types Conversion option also gets selected.

However, to enable SAP data type conversion for an existing location, select the SAP Data Types Conversion option by editing the location's source and target properties and then manually define action TableProperties with parameters CoerceErrorPolicy and CoerceErrorType.

Capture Limitations

Compare and Refresh from SQL Server

DSS allows you to perform only Compare and Refresh from SQL Server database (without using Capture). This section describes the configuration requirements for performing only Compare and Refresh from SQL Server database.

Grants for Compare and Refresh from SQL Server

This section lists the grants required for performing only Compare and Refresh from SQL Server database. Based on your requirement, the DSS database user must be granted either of the permission models listed below.

If the DSS database user needs to select from tables in another schema (for example if action TableProperties is defined with parameter Schema), then the following <b>select</b> privileges should be granted.

GRANT SELECT TO <em>username</em>; -- Let DSS read all tables

GRANT SELECT ON SCHEMA::dbo TO <em>username</em>; -- Let DSS only read DBO tables

See Also

Other related topics: