AgentPlugin - DSS 6 | Data Source Solutions Documentation

AgentPlugin

An agent plugin is a block of user-supplied logic that is executed by Data Source Solutions DSS during replication. An agent plugin can be an operating system command or a database procedure. Each time DSS executes an agent plugin, it passes parameters that indicate which stage the job has reached (for example, start of capture or end of integration). If action AgentPlugin is defined on a specific table, it still affects the entire job, including data from other tables for that location.

By default, DSS will only execute binaries and scripts available inside DSS_CONFIG/plugin/agent, DSS_CONFIG/plugin/transform, DSS_CONFIG/plugin/authentication, and DSS_CONFIG/plugin/rewrite. These directories are not created by default, and must be manually created if required. It is recommended to save custom scripts/agent plugins in these directories. DSS can also execute binaries and scripts available inside other directories if they are safelisted. Directories can be safelisted by defining the property Allowed_Plugin_Paths in file DSS_CONFIG/etc/dssosaccess.conf. For reference, the sample configuration file dssosaccess.conf_example can be found in DSS_HOME/etc/dssosaccess.conf_example.
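Because the plugin directories are not created automatically, a small setup step can create them. The following is a minimal sketch, not part of DSS: the function name create_plugin_dirs is hypothetical, and the caller is assumed to know the path of DSS_CONFIG.

```python
import os

# Plugin subdirectories named in the text above, relative to DSS_CONFIG.
PLUGIN_SUBDIRS = ("agent", "transform", "authentication", "rewrite")


def create_plugin_dirs(dss_config):
    """Create DSS_CONFIG/plugin/<subdir> for each plugin type; return the paths."""
    created = []
    for sub in PLUGIN_SUBDIRS:
        path = os.path.join(dss_config, "plugin", sub)
        os.makedirs(path, exist_ok=True)  # no error if the directory exists
        created.append(path)
    return created
```

Calling create_plugin_dirs with the DSS_CONFIG path is safe to repeat, since existing directories are left untouched.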

Parameters

This section describes the parameters available for action AgentPlugin.

Action parameters can be defined in this dialog using either of two tabs:

Regular: Allows you to define the required parameters by using UI elements such as checkboxes and text fields.
Text: Allows you to define the required parameters by specifying them in the text field. You can also copy-paste the action definitions from DSS documentation, emails, or demo notes.

Command

Argument: path

Description: Name of the agent plugin command. This can be a script or an executable.

A script can be a shell script on Unix, a batch script on Windows, or a file beginning with a 'magic line' (shebang) that names the interpreter for the script, e.g. #!perl.

Argument path can be an absolute or a relative pathname. If a relative pathname is supplied, the agent plugin must be located in DSS_HOME/plugin/agent, DSS_HOME/plugin/transform, DSS_CONFIG/plugin/agent, DSS_CONFIG/plugin/transform, DSS_CONFIG/plugin/authentication, DSS_CONFIG/plugin/rewrite, or in a safelisted directory.

This field is disabled when parameter DbProc is selected.

DbProc

Argument: dbproc

Description: Call database procedure dbproc during replication jobs.

The database procedure is called in a new transaction, except for Burst integrate with BurstCommitFrequency set to CYCLE (the default), where it runs as part of the transaction that updates the destination table. During dssrefresh, it is executed after all tables are refreshed and committed, but before their indexes are recreated.

DbProc cannot be used with parameters Command and ExecOnHub.

This field is disabled when parameter Command is selected.

UserArgument

Argument: userarg

Description: Pass extra argument userarg to each agent plugin execution.

ExecOnHub

Description: Execute the agent plugin on the hub machine instead of the location's machine.

This field is disabled when parameter DbProc is selected.

Order

Argument: int

Description: Order of executing the agent plugin.

Context

Argument: context

Description: Action AgentPlugin is effective/applied only if the context matches the context defined in Compare or Refresh. For more information about using Context, see our concept page Refresh or Compare context.

The value should be a context name, specified as a lowercase identifier. It can also have the form !context, which means that the action is effective unless the matching context is enabled for Compare or Refresh.

One or more contexts can be enabled for Compare and Refresh.
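The matching rule for context and !context can be illustrated with a short sketch. This only models the rule as described above and is not DSS's implementation; the function name action_applies is hypothetical, and treating an action without Context as always effective is an assumption.

```python
def action_applies(action_context, enabled_contexts):
    """Return True if an action with the given Context value is effective.

    action_context   -- value of parameter Context, e.g. "nightly" or "!nightly";
                        None or "" means no Context is defined.
    enabled_contexts -- context names enabled for Compare or Refresh.
    """
    if not action_context:
        return True  # assumption: no Context means the action always applies
    enabled = set(enabled_contexts)
    if action_context.startswith("!"):
        # Negated form: effective unless the named context is enabled.
        return action_context[1:] not in enabled
    return action_context in enabled
```

For example, with contexts nightly and full enabled, an action with Context nightly is effective while an action with Context !nightly is not.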

Agent Plugin Arguments

If an agent plugin is defined, it is called several times at different points of the replication job. On execution, the first argument that is passed indicates the position in the job, for example cap_begin for when the agent plugin is called before capture.

Argument mode is either cap_begin, cap_end, integ_begin, integ_end, refr_read_begin, refr_read_end, refr_write_begin or refr_write_end depending on the position in the replication job where the agent plugin was called. Agent plugins are not called during Compare.

Modes cap_end and integ_end are passed information about whether data was actually replicated.

To obtain this information, command agent plugins can use $DSS_TBL_NAMES or $DSS_FILE_NAMES, and database procedure agent plugins can use parameter dss_changed_tables. An exception is when an integrate job is interrupted: the next time it runs, it no longer knows which tables were changed, so it sets these variables to an empty string or -1.

Command agent plugins (specified in parameter Command) are called as follows:

 agent mode chn_name loc_name userarg
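Inside a script, these positional arguments can be mapped to named fields before any further processing. The following is a minimal sketch; the names AgentArgs and parse_agent_args are hypothetical, and padding a missing userarg with an empty string is an assumption about what happens when UserArgument is not defined.

```python
from collections import namedtuple

# Field order follows the call syntax: agent mode chn_name loc_name userarg
AgentArgs = namedtuple("AgentArgs", "mode chn_name loc_name userarg")


def parse_agent_args(argv):
    """Map the positional arguments of a command agent plugin to named fields.

    argv is expected in sys.argv form: [agent, mode, chn_name, loc_name, userarg].
    """
    args = list(argv[1:5])
    # Assumption: trailing arguments may be absent, so pad with empty strings.
    args += [""] * (4 - len(args))
    return AgentArgs(*args)
```

A script would typically call parse_agent_args(sys.argv) once at startup and then branch on the mode field.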

Database procedure agent plugins (specified in parameter DbProc) are called as follows:

In Ingres,

 execute procedure agent (dss_agent_mode='mode', dss_chn_name='chn_name', dss_loc_name='loc_name', dss_agent_arg='userarg', dss_changed_tables=N);

In Oracle,

 agent (dss_agent_mode$=>'mode', dss_chn_name$=>'chn_name', dss_loc_name$=>'loc_name', dss_agent_arg$=>'userarg', dss_changed_tables$=>N);

In SQL Server,

 execute agent @dss_agent_mode='mode', @dss_chn_name='chn_name', @dss_loc_name='loc_name', @dss_agent_arg='userarg', @dss_changed_tables=N;

The parameter dss_changed_tables specifies the number (N) of tables that were changed.

Agent Plugin Interpreter

If the agent plugin is a script, DSS will consider its shebang line to execute it with an interpreter. It is recommended that only the interpreter program name is specified here (for example, #!perl or #!python). It is not required to specify the absolute path in the shebang line. DSS will automatically determine the path for the specified interpreter using the environment variable PATH.
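To check in advance which interpreter a shebang line such as #!perl would resolve to on a given machine, a lookup against PATH can be sketched as follows. This is an illustration only and not how DSS itself performs the lookup; the function name resolve_interpreter is hypothetical.

```python
import shutil


def resolve_interpreter(script_path):
    """Return the interpreter path for a script's shebang line, or None.

    Reads the first line and, if it begins with '#!', looks up the
    interpreter name using the environment variable PATH.
    """
    with open(script_path, "r") as f:
        first_line = f.readline().strip()
    if not first_line.startswith("#!"):
        return None  # not a shebang script
    parts = first_line[2:].split()
    if not parts:
        return None  # '#!' with nothing after it
    return shutil.which(parts[0])
```

A result of None for a script that does have a shebang line suggests that the interpreter is not on the PATH of the process running the check, which may differ from the PATH seen by the agent plugin.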

Agent Plugin Environment

An agent plugin inherits the environment of its parent process. On the hub, the parent of the agent plugin's parent process is the Scheduler. On a remote Unix machine it is the inetd daemon. On a remote Windows machine it is the DSS Remote Listener service. Differences with the environment of the parent process are as follows:

Environment variable $DSS_TBL_NAMES is set to a colon-separated list of tables for which the job is replicating (for example DSS_TBL_NAMES=tbl1:tbl2:tbl3). Also variable $DSS_BASE_NAMES is set to a colon-separated list of table 'base names', which are prefixed by a schema name if action TableProperties is defined with parameter Schema (for example DSS_BASE_NAMES=base1:sch2.base2:base3).
For modes cap_end and integ_end these variables are restricted to only the tables actually processed. Environment variables $DSS_TBL_KEYS and $DSS_TBL_KEYS_BASE are colon-separated lists of keys for each table specified in $DSS_TBL_NAMES (e.g. k1,k2:k:k3,k4). The column list is specified in $DSS_COL_NAMES and $DSS_COL_NAMES_BASE.
Environment variable $DSS_CONTEXTS is defined with a comma-separated list of contexts defined with DSS Refresh or Compare (option -Cctx).
Environment variables $DSS_VAR_XXX are defined for each context variable supplied to DSS Refresh or Compare (option -Vxxx=val).
For database locations, environment variables $DSS_LOC_DB_NAME and $DSS_LOC_DB_USER are set (unless no value is necessary).
For Oracle locations, the environment variables $DSS_LOC_DB_USER, $ORACLE_HOME and $ORACLE_SID are set and $ORACLE_HOME/bin is added to the path.
For Ingres locations the environment variable $II_SYSTEM is set and $II_SYSTEM/ingres/bin is added to the path.
For SQL Server locations, the environment variables $DSS_LOC_DB_SERVER, $DSS_LOC_DB_NAME, $DSS_LOC_DB_USER and $DSS_LOC_DB_PWD are set (unless no value is necessary).
For file locations variables $DSS_FILE_LOC and $DSS_LOC_STATEDIR are set to the file location's top and state directory respectively.
For modes cap_end and integ_end, variable $DSS_FILE_NAMES is set to a colon-separated list of replicated files, unless this information is not available because of recovery.
For mode integ_end, the following environment variables are also set: $DSS_FILE_NROWS contains a colon-separated list of the number of rows per file for each file specified in $DSS_FILE_NAMES (for example DSS_FILE_NROWS=1005:1053:1033); $DSS_TBL_NROWS contains a colon-separated list of the number of rows per table for each table specified in $DSS_TBL_NAMES; $DSS_TBL_CAP_TSTAMP contains a colon-separated list of the first row's capture timestamp for each table specified in $DSS_TBL_NAMES; $DSS_TBL_OPS contains a colon-separated list of comma-separated dss_op=count pairs for each table specified in $DSS_TBL_NAMES (for example DSS_TBL_OPS=1=50,2=52:1=75,2=26:1=256). If the number of files or tables replicated is extremely large, these values are abbreviated and suffixed with "..."; the actual values are then available through $DSS_LONG_ENVIRONMENT.
Environment variables whose values are too long for the operating system are abbreviated and suffixed with "...". When this happens, DSS creates a temporary file containing the original values of these environment variables. This temporary file is a JSON map of key-value pairs, and its absolute path is set in $DSS_LONG_ENVIRONMENT.
Any variable defined by action Environment is also set in the agent plugin's environment.
The current working directory for local file locations (not FTP, SFTP, SharePoint/WebDAV, HDFS or S3) is the top directory of the file location. For other locations (e.g. database locations) it is DSS_TMP, or DSS_CONFIG/tmp if this is not defined.
stdin is closed and stdout and stderr are redirected (via network pipes) to the job's logfiles.
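The list-valued variables described above share a simple layout: colons separate per-table entries and commas separate items within an entry. The following sketch shows one way to parse them. The function names are hypothetical, and the code assumes the values are not abbreviated; a value ending in "..." should be read from the file named in $DSS_LONG_ENVIRONMENT instead.

```python
def split_list(value):
    """Split a colon-separated DSS list; an empty or missing value gives []."""
    return value.split(":") if value else []


def parse_tbl_keys(tbl_names, tbl_keys):
    """Pair each table in DSS_TBL_NAMES with its key columns from DSS_TBL_KEYS."""
    names = split_list(tbl_names)
    keys = [k.split(",") if k else [] for k in split_list(tbl_keys)]
    return dict(zip(names, keys))


def parse_tbl_ops(tbl_names, tbl_ops):
    """Pair each table with a {dss_op: count} dict parsed from DSS_TBL_OPS."""
    result = {}
    for name, ops in zip(split_list(tbl_names), split_list(tbl_ops)):
        counts = {}
        for pair in ops.split(","):
            if pair:
                op, count = pair.split("=", 1)
                counts[int(op)] = int(count)
        result[name] = counts
    return result
```

In an agent plugin these functions would be fed from os.environ, for example parse_tbl_keys(os.environ.get("DSS_TBL_NAMES"), os.environ.get("DSS_TBL_KEYS")). Returning an empty result for a missing or empty value also covers the case of an interrupted integrate job, where the variables are set to an empty string.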

If a command agent plugin encounters a problem it should write an error message and return with exit code 1, which will cause the replication job to fail. If the agent does not want to do anything for a mode or does not recognize the mode (new modes may be added in future DSS versions) then the agent should return exit code 2, without writing an error message.
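The exit code convention can be kept in one place by dispatching on the mode argument. The sketch below is one possible layout, not a prescribed structure: the names run_agent, HANDLERS and on_integ_end are hypothetical, and the handler body is a placeholder.

```python
import sys

EXIT_OK, EXIT_ERROR, EXIT_IGNORE = 0, 1, 2


def on_integ_end(chn_name, loc_name, userarg):
    """Hypothetical handler: report that an integrate cycle finished."""
    print("integrate finished for channel %s at location %s" % (chn_name, loc_name))


# Modes this plugin handles; every other mode is ignored with exit code 2.
HANDLERS = {"integ_end": on_integ_end}


def run_agent(argv, handlers=HANDLERS):
    """Dispatch on the mode argument and return the exit code to use."""
    mode = argv[1] if len(argv) > 1 else ""
    handler = handlers.get(mode)
    if handler is None:
        return EXIT_IGNORE  # unknown or unhandled mode: no error message
    # chn_name, loc_name, userarg; pad with "" if trailing arguments are absent
    rest = (list(argv[2:5]) + ["", "", ""])[:3]
    try:
        handler(*rest)
    except Exception as exc:
        sys.stderr.write("agent plugin failed in mode %s: %s\n" % (mode, exc))
        return EXIT_ERROR  # causes the replication job to fail
    return EXIT_OK
```

The script would end with sys.exit(run_agent(sys.argv)). Because unknown modes fall through to exit code 2, the plugin keeps working if new modes are added in future DSS versions.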

Examples

This section lists a few examples of agent plugin scripts:

Example 1: An agent plugin script (in Perl), which prints "hello world"

 #!perl
 # Exit codes: 0=success, 1=error, 2=ignore_mode
 if ($ARGV[0] eq "cap_begin") {
     print "Hello World\n";
     exit 0;
 } else {
     exit 2;
 }

Example 2: An agent plugin script (in Perl), which prints out arguments and environment at the end of every integrate cycle

 #!perl
 require 5;

 if ($ARGV[0] eq "integ_end") {
     print "DEMO INTEGRATE END AGENT (";
     foreach $arg (@ARGV) {
         print "$arg ";
     }
     print ")\n";

     # print current working directory
     use Cwd;
     printf("cwd=%s\n", cwd());

     # print (part of the) environment
     printf("DSS_FILE_NAMES=$ENV{DSS_FILE_NAMES}\n");
     printf("DSS_FILE_LOC=$ENV{DSS_FILE_LOC}\n");
     printf("DSS_LOC_STATEDIR=$ENV{DSS_LOC_STATEDIR}\n");
     printf("DSS_TBL_NAMES=$ENV{DSS_TBL_NAMES}\n");
     printf("DSS_BASE_NAMES=$ENV{DSS_BASE_NAMES}\n");
     printf("DSS_TBL_KEYS=$ENV{DSS_TBL_KEYS}\n");
     printf("DSS_TBL_KEYS_BASE=$ENV{DSS_TBL_KEYS_BASE}\n");
     printf("DSS_COL_NAMES=$ENV{DSS_COL_NAMES}\n");
     printf("DSS_COL_NAMES_BASE=$ENV{DSS_COL_NAMES_BASE}\n");
     printf("PATH=$ENV{PATH}\n");
     exit 0;  # Success
 } else {
     exit 2;  # Ignore mode
 }

Example 3: An agent plugin script (in Python), which utilizes $DSS_LONG_ENVIRONMENT to print environment variables at the end of every integrate cycle

 #!python

 import os
 import sys
 import json

 if __name__ == "__main__":
     if len(sys.argv) > 1 and sys.argv[1] == 'integ_end':
         # Load the unabbreviated values, if DSS wrote them to a file
         if 'DSS_LONG_ENVIRONMENT' in os.environ:
             with open(os.environ['DSS_LONG_ENVIRONMENT'], 'r') as f:
                 long_environment = json.load(f)
         else:
             long_environment = {}  # empty dict

         # print (part of the) environment, preferring the unabbreviated value
         for name in ('DSS_FILE_NAMES', 'DSS_TBL_NAMES', 'DSS_BASE_NAMES'):
             if name in long_environment:
                 print('{0}={1}'.format(name, long_environment[name]))
             elif name in os.environ:
                 print('{0}={1}'.format(name, os.environ[name]))
             else:
                 print('{0}=<not set>'.format(name))

         sys.exit(0)  # Success
     else:
         sys.exit(2)  # Ignore mode

Example 4: A database procedure agent plugin that populates table order_line after a refresh.

 create procedure [dbo].[refr_agent] (
    @dss_agent_mode varchar(20),
    @dss_chn_name varchar(20),
    @dss_loc_name varchar(20),
    @dss_agent_arg varchar(20),
    @dss_changed_tables numeric
 ) as
 begin
    if @dss_agent_mode = 'refr_write_begin'
    begin
        begin try
            delete from order_line
        end try
        begin catch
           -- ignore errors; nothing to delete
        end catch
    end
    else if @dss_agent_mode = 'refr_write_end'
    begin
        insert into order_line
            SELECT i.item_id,
              i.item_no,
              i.description,
              i.attribute,
              i.item_type,
              p.id,
              p.date_set,
              p.price
              FROM items i, price p
    end
 end