Thursday, July 19, 2012

Row Level Error Logging in Informatica


Row Error Logging in Informatica

Informatica Inbuilt Error Logging feature can be utilized to implement Row Error logging in a central location.  When a row level error occurs, the Integration service logs the error information which can be used to determine the cause and source of the error.

These row level errors can be captured either in a relational table or in a flat file. When the error logging is enabled, the Integration service creates the error table or error log file the first time when it runs the session.  If the error table or error log file exists already, then the error data will be appended.

Following are the activities that need to be performed to implement the Informatica Row Error Logging:

1.  In the “Config object” tab of “Error Handling” option , set the “Error Log type” attribute to  “Relational database” or “Flat File”.  By Default error logging is disabled.

2.  SET Stop On Errors = 1

3.  If the  Error Log Type is set to “Relational”, specify the Database connection & Table Name Prefix



Following are the tables which will be created by Integration service and which will be populated as and when the error occurs.

PMERR_DATA
Stores data and metadata about a transformation row error and its corresponding source row.

PMERR_MSG
Stores metadata about an error and the error message.

PMERR_SESS
Stores metadata about the session.

PMERR_TRANS
Stores metadata about the source and transformation ports, such as name and datatype, when a transformation error occurs.

4.  If the  Error Log Type is set to “Flatfile”, specify the “Error log file directory” and “Error log file name”
Database Error Messages and the Error messages that Integration service writes to Bad File/ Reject file can also be captured and stored in the Error log tables / Flat files.
Following are the few database error messages which will be logged in the Error Log Tables / Flat files.


Error Messages
Cannot Insert the value NULL into column ‘<<Column name>>’, table ‘<<Table_name>>’
Violation of PRIMARY KEY constraint ‘<<Primary key constraint name>>’
Violation of UNIQUE KEY constraint ‘<<Unique Key Constraint>>’
Cannot Insert Duplicate key in object ‘<<Table_name>>’


Row Error Logging Implementation
Advantages
Since the Informatica Inbuilt feature is leveraged, the Error log information would be very accurate with very minimal development effort.

Drawback
Enabling Error logging will have an impact to performance, since the integration service processes one row at a time instead of block of rows i.e. row-level processing as opposed to block processing. In order to overcome this there is a another approach which I will explain in another post.

Additional Info:

DDL for the creating Informatica Error Tables:
__________________________________________________________
PMERR_MSG:

CREATE TABLE PMERR_MSG
(
  REPOSITORY_GID VARCHAR2(240 ),
WORKFLOW_RUN_ID NUMBER(15, 0) NOT NULL,
WORKLET_RUN_ID NUMBER(15, 0),
SESS_INST_ID NUMBER(15, 0) NOT NULL,
TRANS_MAPPLET_INST VARCHAR2(240 ),
TRANS_NAME VARCHAR2(240 ) NOT NULL,
TRANS_GROUP VARCHAR2(240 ),
TRANS_PART_INDEX NUMBER(15, 0),
TRANS_ROW_ID NUMBER(15, 0) NOT NULL,
ERROR_SEQ_NUM NUMBER(15, 0) NOT NULL,
ERROR_TIMESTAMP TIMESTAMP(6),
ERROR_UTC_TIME NUMBER(15, 0),
ERROR_CODE NUMBER(15, 0),
ERROR_MSG VARCHAR2(2000 ),
ERROR_TYPE NUMBER(15, 0),
LINE_NO NUMBER(15, 0) NOT NULL,
CONSTRAINT PMERR_MSG_PK PRIMARY KEY
  (
    WORKFLOW_RUN_ID,
SESS_INST_ID,
TRANS_NAME,
TRANS_ROW_ID,
ERROR_SEQ_NUM,
LINE_NO
  )
)
__________________________________________________________

PMERR_SESS:

CREATE TABLE PMERR_SESS
(
  REPOSITORY_GID VARCHAR2(240 ),
WORKFLOW_RUN_ID NUMBER(15, 0) NOT NULL,
WORKLET_RUN_ID NUMBER(15, 0),
SESS_INST_ID NUMBER(15, 0) NOT NULL,
SESS_START_TIME TIMESTAMP(6),
SESS_UTC_TIME NUMBER(15, 0),
REPOSITORY_NAME VARCHAR2(240 ),
FOLDER_NAME VARCHAR2(240 ),
WORKFLOW_NAME VARCHAR2(240 ),
TASK_INST_PATH VARCHAR2(240 ),
MAPPING_NAME VARCHAR2(240 ),
LINE_NO NUMBER(15, 0) NOT NULL,
CONSTRAINT PMERR_SESS_PK PRIMARY KEY
  (
    WORKFLOW_RUN_ID,
    SESS_INST_ID,
LINE_NO
  )
)
__________________________________________________________ 

PMERR_TRANS:

CREATE TABLE PMERR_TRANS
(
  REPOSITORY_GID VARCHAR2(240 ),
WORKFLOW_RUN_ID NUMBER(15, 0) NOT NULL,
WORKLET_RUN_ID NUMBER(15, 0),
SESS_INST_ID NUMBER(15, 0) NOT NULL,
TRANS_MAPPLET_INST VARCHAR2(240 ),
TRANS_NAME VARCHAR2(240 ) NOT NULL,
TRANS_GROUP VARCHAR2(240 ),
TRANS_ATTR VARCHAR2(2000 ),
SRC_MAPPLET_INST VARCHAR2(240 ),
SOURCE_NAME VARCHAR2(240 ),
SOURCE_ATTR VARCHAR2(2000 ),
LINE_NO NUMBER(15, 0) NOT NULL,
CONSTRAINT PMERR_TRANS_PK PRIMARY KEY
  (
    WORKFLOW_RUN_ID,
SESS_INST_ID,
TRANS_NAME,
LINE_NO
  )
)
__________________________________________________________

PMERR_DATA:

CREATE TABLE PMERR_DATA
(
  REPOSITORY_GID VARCHAR2(240 ),
WORKFLOW_RUN_ID NUMBER(15, 0) NOT NULL,
WORKLET_RUN_ID NUMBER(15, 0),
SESS_INST_ID NUMBER(15, 0) NOT NULL,
TRANS_MAPPLET_INST VARCHAR2(240 ),
TRANS_NAME VARCHAR2(240 ) NOT NULL,
TRANS_GROUP VARCHAR2(240 ),
TRANS_PART_INDEX NUMBER(15, 0),
TRANS_ROW_ID NUMBER(15, 0) NOT NULL,
TRANS_ROWDATA VARCHAR2(2000 ),
SOURCE_ROW_ID NUMBER(15, 0),
SOURCE_ROW_TYPE NUMBER(15, 0),
SOURCE_ROW_DATA VARCHAR2(2000 ),
LINE_NO NUMBER(15, 0) NOT NULL,
CONSTRAINT PMERR_DATA_PK PRIMARY KEY
  (
    WORKFLOW_RUN_ID,
SESS_INST_ID,
TRANS_NAME,
TRANS_ROW_ID,
LINE_NO
  )
)
__________________________________________________________


Thursday, July 5, 2012

Informatica Post Session Email Task

Since three years, I have been working in Informatica and I came across scenarios where we need to send Email on successful completion of Sessions or Workflows. But in some cases we not only send the successful status but we need to send session statistics also i.e. total rows loaded, total rows rejected etc.
Steps to create Reusable Email Task.
1) Go to components tag of session properties
2) Select reusable in the type field of Success or failure email
3) Click the Open button in the Value column to select the reusable Email task.
4) Note: If you edit the Email task, the edits only apply to this session.
Sample Post session Email:
Session: %s completed successfully..
Email Text -
Session Name– %s
Mapping – %m
Repo – %d
%l
%r
%e
%b
%c
Sample output of the above email task:
Session: s_sample completed successfully..

Session Name– s_sample
Mapping – m_sample
Repo – repo_dev
Total Rows Loaded = 1000
Total Rows Rejected = 0
Completed
Start Time: Tue Nov 30 08:26:00 2008
Completion Time: Tue Nov 30 09:27:11 2008

Following are the email variables that can be used with the email task:
Variable
Description
%a
Attach the named file. The file must be local to the Integration Service. The following file names are valid: %a or %a. The email does not display the full path for the file. Only the attachment file name appears in the email.
Note: The file name cannot include the greater than character (>) or a line break.
%b
Session start time.
%c
Session completion time.
%d
Name of the repository containing the session.
%e
Session status.
%g
Attach the session log to the message. The Integration Service attaches a session log if you configure the session to create a log file. If you do not configure the session to create a log file or if you run a session on a grid, the Integration Service creates a temporary file in the PowerCenter Services installation directory and attaches the file. If the Integration Service does not use operating system profiles, verify that the user that starts Informatica Services has permissions on PowerCenter Services installation directory to create a temporary log file. If the Integration Service uses operating system profiles, verify that the operating system user of the operating system profile has permissions on PowerCenter Services installation directory to create a temporary log file.
%i
Session elapsed time (session completion time-session start time).
%l
Total rows loaded.
%m
Name of the mapping used in the session.
%n
Name of the folder containing the session.
%r
Total rows rejected.
%s
Session name.
%t
Source and target table details, including read throughput in bytes per second and write throughput in rows per second. The Integration Service includes all information displayed in the session detail dialog box.
%u
Repository user name.
%v
Integration Service name.
%w
Workflow name.
%y
Session run mode (normal or recovery).
%z
Workflow run instance name.