Saturday, 7 March 2015

How to use Custom SQL in the Lookup_Ext Function?

I recently saw people asking how to use Custom SQL in Lookup_Ext. I searched online but could not find much written on this subject, so this article briefly explains how to use this feature of the Lookup_Ext function. It is little known and uncommon among DS developers (at least I have not seen many developers using it), but it can be very effective and can improve performance in some scenarios.

Purpose:
The purpose of Custom SQL in the Lookup_Ext function is to improve performance by limiting the data retrieved into memory when the lookup uses a pre-load cache.

Example:
Suppose you have a DIM_DATE table with 100 years of data, but your transactions requiring a lookup cover only the previous week. You can add Custom SQL to select only the rows from DIM_DATE covering the past week, reducing the size of the lookup table in memory from millions of rows to a handful of rows.
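As a minimal sketch, the Custom SQL for this example might look as below. The DATE_KEY and CALENDAR_DATE column names are assumptions, and the date arithmetic is SQL Server syntax; adjust both to your schema and database.

-- Cache only the last week of DIM_DATE instead of all 100 years.
SELECT DATE_KEY, CALENDAR_DATE
FROM DIM_DATE
WHERE CALENDAR_DATE >= DATEADD(DAY, -7, GETDATE())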

You can also use aggregate functions and perform calculations in the Custom SQL. If something looks complex to implement in DS but is easy at the database level, this feature can be useful.

However, you have to ensure that the SQL selects all the columns you need in the lookup: the output columns as well as every column referenced in the lookup condition.

Implementation:
The example below uses an aggregate function in Custom SQL to calculate the sum of salary for a particular employee and department.
Code.PNG
SQL Code inside Custom SQL:

SQL.PNG
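In case the screenshot is hard to read, the Custom SQL might look like the sketch below. The EMPLOYEE table and its EMP_ID, DEPT_ID and SALARY columns are assumptions for illustration only.

-- Pre-aggregate the salary at database level so the lookup cache
-- holds one row per employee and department.
SELECT EMP_ID, DEPT_ID, SUM(SALARY) AS TOTAL_SALARY
FROM EMPLOYEE
GROUP BY EMP_ID, DEPT_ID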
You can observe that we still need to map the columns and select the lookup table even though we are writing the query ourselves in Custom SQL. This is needed so that DS knows which data source to use and which generated code to override.

How to identify whether Custom SQL is being used in your Lookup_Ext?
A big exclamation mark appears on your Lookup_Ext screen. If you are not using Custom SQL, this exclamation mark does not appear.
Code1.PNG

Limitations:
Custom SQL is inflexible: there is no way to pass runtime variables into it, so the SQL is fixed at development time.

Hope this helps.

How to Create System Configuration in Data Services

Why do we need a system configuration in the first place? The advantage of having a system configuration is that you can use it for the lifetime of a project. In general, all projects have multiple environments to load data into as the project progresses over time, for example DEV, Quality, and Production.

There are two ways to execute your Jobs in multiple environments:
  • Edit the datastore's configuration manually for executing Jobs in a different environment, and default it to the latest environment.
  • Create the system configuration once and select the appropriate environment from the 'Execution Properties' window while executing the Job. We are going to discuss this option in this blog.

The following are the steps to create a system configuration in Data Services.

Prerequisite to setup the System Configuration:
  • You need at least two configurations ready in any of your datastores, pointing to two different databases; for example, one for staged data and another for target data. This can be done easily by editing the datastore: right-click the datastore and select 'Edit'.

Step 1: Execute any existing Job to check whether your repository already has a system configuration. The dialog box below appears once you execute a Job. Do not click the OK button to execute; this is just to check the execution properties.

If you look at the below dialog box, there is no system configuration to select.

1.png
Step 2: Cancel the above Job execution, open the Tools menu as shown below, and select System Configurations.
2.png

Step 3: You can now see the dialog box below. Click the icon (red circle) shown in the dialog box to 'Create New Configuration'. This dialog box shows all the datastores available in your repository.
3.png
Step 4: Once you click the above button, the dialog box below appears with default configuration details for all datastores. You can now rename the system configuration (by default it is System_Config_1, System_Config_2, etc.).

Select an appropriate configuration name against each datastore for your system configuration. I have taken the DEV and History DB as examples. Note that these configurations must already exist in your datastores.

See in the dialog box below how it is selected. You can create more than one configuration (say, one for DEV and another for History).

Once done, click the OK button. Your system configuration is now ready to use.

4.png
Step 5: Now execute any existing Job again. You can see that System Configuration has been added to the 'Execution Properties' window, which was not available before. From the drop-down list you can select the appropriate environment in which to execute your Job.

5.png
Do let me know if you find it useful, and feel free to reach out in case you face any issue while configuring. Hope this helps.

How to capture error log in a table in BODS

I will walk you through, step by step, how to capture error messages if any dataflow fails in a Job. I have taken a simple example with a few columns to demonstrate.

Step 1: Create a Job and name it as ‘ERROR_LOG_JOB’

Step 2: Declare the following four global variables at the Job level. Refer to the screenshot below for names and data types.
GV.png
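If the screenshot is unreadable, the four global variables might be declared as under; the names are assumptions chosen to mirror the ERROR_LOG columns created in Step 8.

# Hypothetical Job-level global variable declarations:
# $GV_ERROR_NUMBER     int
# $GV_ERROR_CONTEXT    varchar(512)
# $GV_ERROR_MESSAGE    varchar(512)
# $GV_ERROR_TIMESTAMP  varchar(512)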
Step 3: Drag a Try block, a Dataflow, and a Catch block into the work area and connect them as shown in the diagram below. Inside the dataflow you can drag any existing table in your repository as a source and populate a few columns into a target table. Make sure the target table is a permanent table. This is just for the demo.
Job.png

Step 4: Open the Catch block, drag one script inside it, and name it as shown in the diagram below.
Catch Inside.png

Step 5: Open the script and write the code below inside it, as shown in the diagram.
Script Msg.png
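As a sketch of that script, assuming the global variable names from Step 2 and CF_ERROR_LOG as the name of the custom function we create in Step 6 (both names are assumptions):

# Populate the global variables using the built-in catch functions.
$GV_ERROR_NUMBER    = error_number();
$GV_ERROR_CONTEXT   = error_context();
$GV_ERROR_MESSAGE   = error_message();
$GV_ERROR_TIMESTAMP = to_char(error_timestamp(), 'YYYY.MM.DD HH24:MI:SS');

# Log the error into the permanent table via the custom function.
CF_ERROR_LOG($GV_ERROR_NUMBER, $GV_ERROR_CONTEXT, $GV_ERROR_MESSAGE, $GV_ERROR_TIMESTAMP);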

The above script populates the global variables using some built-in BODS functions and then calls a custom function to log the errors into a permanent table. This function does not exist yet; we will create it in the steps below.

Step 6: Go to the Custom Functions section of your repository, create a new custom function, and name it as under.

Fun sec.png

Step 7: Click Next in the above dialog box and write the code below inside the function. You need to declare parameters and local variables as shown in the editor below. Keep the data types of these parameters and local variables the same as those of the global variables in Step 2. Validate the function and save it.

Fun.png
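If the screenshot is hard to read, the function body might look like the sketch below. The parameter names are assumptions; ETL_CTRL is the datastore mentioned in Step 8. In DS scripts, [$var] substitutes a variable's value and {$var} substitutes the value wrapped in quotes.

# Hypothetical function body: insert one row into the ERROR_LOG table.
sql('ETL_CTRL', 'INSERT INTO ERROR_LOG (ERROR_NUMBER, ERROR_CONTEXT, ERROR_MESSAGE, ERROR_TIMESTAMP) VALUES ([$P_ERROR_NUMBER], {$P_ERROR_CONTEXT}, {$P_ERROR_MESSAGE}, {$P_ERROR_TIMESTAMP})');
Return 0;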

Step 8: Now your function is ready to use. Assuming you have a SQL Server database where you want to capture these errors, create a table to store the information.

CREATE TABLE [dbo].[ERROR_LOG](
      [SEQ_NO] [int] IDENTITY(1,1) NOT NULL,
      [ERROR_NUMBER] [int] NULL,
      [ERROR_CONTEXT] [varchar](512) NULL,
      [ERROR_MESSAGE] [varchar](512) NULL,
      [ERROR_TIMESTAMP] [varchar](512) NULL
)

You may change the datastore as per your requirements. I have used ETL_CTRL as the datastore in the above function; it connects to the SQL Server database where the above table is created.

Step 9: Just to make sure the dataflow fails, we will force it to throw an error at run time. Inside your dataflow, use a permanent target table. Double-click the target table and add one line of text below the existing comment under the Load Triggers tab. Refer to the screenshot below. This is one way to throw an error in a dataflow at run time.

Tbl Error.png
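For reference, the trick is simply to corrupt the SQL that DS generates, so any invalid text works. A hypothetical example of the line you might add under the Load Triggers tab:

-- existing auto-generated comment (leave as is)
THIS LINE IS NOT VALID SQL AND FORCES A RUNTIME ERROR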

Step 10: Now your Job is ready to execute. Save and execute your Job. You should see an error message in the monitor log. Open the table in your database and check whether the error log information is populated. The error log shall look as shown below.
Monitor Log.png

The ERROR_LOG table shall capture the same error message, as under.

Error Result.png

Hope this helps. In case you face any issue, do let me know. 

Custom function in BODS to remove special characters from a string

Below is a step-by-step procedure to write a custom function in BODS to remove special characters from a string using ASCII values.

Step 1: Create a custom function in BODS and name it as 'CF_REMOVE_SPECIAL_CHARS'

Step 2: Use the below code in your function.


# This function removes special characters from a string. It only
# retains letters and digits.

$L_String = $P_Input_Field;
$L_String_Length = length($L_String);
$L_Counter = 1;
$L_String_final = null;

# Walk through the input one character at a time.
while ($L_String_Length > 0)
begin
    $L_Char = substr($L_String, $L_Counter, 1);

    # Keep the character only if it is 0-9 (ASCII 48-57),
    # A-Z (ASCII 65-90) or a-z (ASCII 97-122).
    if ((ascii($L_Char) >= 48 and ascii($L_Char) <= 57) or (ascii($L_Char) >= 65 and ascii($L_Char) <= 90) or (ascii($L_Char) >= 97 and ascii($L_Char) <= 122))
    begin
        $L_String_final = $L_String_final || $L_Char;
        $L_Counter = $L_Counter + 1;
        $L_String_Length = $L_String_Length - 1;
    end
    else
    begin
        # Skip the special character.
        $L_Counter = $L_Counter + 1;
        $L_String_Length = $L_String_Length - 1;
    end
end

# Trim trailing blanks and collapse any double spaces before returning.
Return replace_substr(replace_substr(rtrim_blanks(rtrim_blanks($L_String_final)), '  ', ' '), '  ', ' ');


Your code in the editor would look as under:
fun.png

Step 3: Declare the parameters and local variables as shown in the left pane of the function editor above.

$P_Input_Field - input parameter (data type varchar(255))
$L_Char - datatype varchar(255)
$L_Counter - datatype int
$L_String - datatype varchar(255)
$L_String_final - datatype varchar(255)
$L_String_Length - datatype int

Note: Change the function's return type to varchar(255). By default the return type is int.

Step 4: Save this function.

Step 5: Call this function while mapping any field in Query Editor where you want to remove special characters.

Ex: CF_REMOVE_SPECIAL_CHARS(Table1.INPUT_VAL)

The above function call shall remove all special characters from the INPUT_VAL field of Table1, and the output value shall look like the data in the table below.

TBL.png
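For illustration (based purely on the function's logic), an input value like 'Rahul@#123!' would come out as 'Rahul123', since only letters and digits are retained.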

Substitution parameters in SAP DS

What is substitution parameter?

  • Substitution parameters store constant values and are defined at the repository level.
  • Substitution parameters are accessible to all jobs in a repository.
  • Substitution parameters are useful when you want to export and run a job containing constant values in a specific environment.

Scenario to use Substitution Parameters:

For instance, suppose the jobs in a repository reference a directory on your local computer to read source files. Instead of creating a global variable in each job to store this path, you can use a substitution parameter. After creating a substitution parameter value for the directory in the new environment, you can run the job there and all the objects that reference the original directory will automatically use the new value. This means you only need to change the constant value (the original directory name) in one place (the substitution parameter), and the change propagates to all objects in the job when it runs in the new environment.

Key difference between substitution parameters and global variables:

  • Use a global variable when you do not know the value prior to execution and it needs to be calculated in the job.
  • Use a substitution parameter for constants that do not change during execution. Using a substitution parameter means you do not need to define a global variable in each job to parameterize a constant value. The sketch after the comparison table below contrasts the two.

Global Variables                          | Substitution Parameters
------------------------------------------|------------------------------------------------
Defined at Job level                      | Defined at Repository level
Cannot be shared across Jobs              | Available to all Jobs in a repository
Data-type specific                        | No data type (all strings)
Value can change during job execution     | Fixed value set prior to execution of the Job (constants)
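As a small sketch of the difference in practice (the variable and parameter names are illustrative, and $GV_LOAD_DATE is assumed to be declared at the Job level):

# Global variable: value computed at run time inside the job.
$GV_LOAD_DATE = sysdate();
Print('Load date: [$GV_LOAD_DATE]');

# Substitution parameter: fixed constant defined at repository level.
Print('Source files path: [$$SourceFilesPath]');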

How to define the Substitution Parameters?
  
Open the Substitution Parameter Editor from the Designer by selecting Tools > Substitution Parameter Configurations....
  • You can either add another substitution parameter to an existing configuration, or add a new configuration by clicking the Create New Substitution Parameter Configuration icon in the toolbar.
  • The name prefix is two dollar signs, $$ (global variables are prefixed with one dollar sign). When adding new substitution parameters in the Substitution Parameter Editor, the editor automatically adds the prefix.
  • The maximum length of a name is 64 characters.

In the following example, the substitution parameter $$SourceFilesPath has the value D:/Data/Staging in the configuration named Dev_Subst_Param_Conf and the value C:/data/staging in the Quality_Subst_Param_Conf configuration.

subst.png

This substitution parameter can be used in more than one Job in a repository. You can use substitution parameters everywhere global variables are supported, such as Query transform WHERE clauses, scripts, mappings, the SQL transform, flat-file options, and Address Cleanse transform options. The script line below prints the source files path defined above.

Print ('Source Files Path: [$$SourceFilesPath]');
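Similarly, in a flat-file format you can point the root directory at the parameter instead of hard-coding the path (the file name below is illustrative):

# In the file format editor, set for example:
#   Root directory: $$SourceFilesPath
#   File name(s):   customers_*.csv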

Associating a substitution parameter configuration with a system configuration:
  
A system configuration groups together a set of datastore configurations and a substitution parameter configuration. For example, you might create one system configuration for your DEV environment and a different one for the Quality environment. Depending on your environment, both system configurations might point to the same substitution parameter configuration, or each might require a different one. In the example below, we use a different substitution parameter configuration for the DEV and Quality systems.

To associate a substitution parameter configuration with a new or existing system configuration:

In the Designer, open the System Configuration Editor by selecting Tools > System Configurations.
You may refer to the 'How to Create System Configuration in Data Services' section above for creating a system configuration.

The following example shows two system configurations, DEV and Quality. In this case, there are substitution parameter configurations for each environment. Each substitution parameter configuration defines where the data source files are located. Select the appropriate substitution parameter configuration and datastore configurations for each system configuration.
system.png
At job execution time, you can set the system configuration and the job will execute with the values for the associated substitution parameter configuration.

Exporting and importing substitution parameters:
  
Substitution parameters are stored in a local repository along with their configured values. DS does not include substitution parameters as part of a regular export. Therefore, to move substitution parameters and configurations to other repositories, you export them to a file and then import the file into the other repository.

Exporting substitution parameters
  1. Right-click in the local object library and select Repository > Export Substitution Parameter Configurations.
  2. Select the check box in the Export column for the substitution parameter configurations to export.
  3. Save the file.
The software saves it as a text file with an .atl extension.
Export.png
Importing substitution parameters
The substitution parameters must have first been exported to an ATL file.

  1. In the Designer, right-click in the object library and select Repository > Import from File.
  2. Browse to the file to import.
  3. Click OK.