Saturday, 7 March 2015

Advantage of Join Ranks in BODS

What is Join Rank?

You can use join rank to control the order in which sources (tables or files) are joined in a dataflow. The highest ranked source is accessed first to construct the join.

Best Practices for Join Ranks:
  • Define the join rank in the Query editor.
  • For an inner join between two tables, in the Query editor assign a higher join rank value to the larger table and, if possible, cache the smaller table.

Default, Max and Min values in Join Rank:
The default join rank value is 0, and any non-negative number can be assigned. For example, if tables T1, T2 and T3 have join ranks of 10, 20 and 30 respectively, then T3 has the highest join rank and will therefore act as the driving table.

Performance Improvement:

Controlling join order can often have a huge effect on the performance of producing the join result. Join ordering is relevant only in cases where the Data Services engine performs the join. In cases where the code is pushed down to the database, the database server determines how a join is performed.

Where should Join Rank be used?

When the code cannot be fully pushed down and the sources contain a huge number of records, join rank may be considered. The Data Services Optimizer considers join rank and uses the source with the highest join rank as the left source. Join rank is especially useful when the optimizer is not able to work out the most efficient execution plan on its own. A higher join rank value means that the table drives the join.

You can print a trace message to the Monitor log file which allows you to see the order in which the Data Services Optimizer performs the joins. This information may help you to identify ways to improve the performance. To add the trace, select Optimized Data Flow in the Trace tab of the Execution Properties dialog.

This article will be continued soon with a real-time example of join rank.

How to export and import Users and Groups in Information Steward

Last week I was searching for how to export Users and Groups in IS but unfortunately could not find the answer, so I created a thread where one of the members answered it. I believe that sharing this solution will help other developers avoid struggling for the answer in the future. Hope you will find it useful.

So, here is the step-by-step procedure to export and import Users and Groups in Information Steward:

Scenario:
Suppose you want to promote Users and Groups from the DEV environment to the Production environment. You simply don't want to re-create everything in the new environment and would rather reuse the Users and Groups defined in DEV.

Step 1: Log in to the DEV environment's Central Management Console, from which you want to export custom Users and Groups, and go to the Promotion Management tab as shown below.

[Screenshot: Promotion Management tab in the CMC]
Step 2: Right-click on the Promotion Jobs folder and select New Job as shown below.
[Screenshot: New Job option under the Promotion Jobs folder]

Step 3: Name the job (e.g. Test_Promotion_Job) and then select 'Login to a New CMS' as the source, as shown below. You will need to enter the credentials of your DEV CMC: select the source system and enter your user ID and password.

[Screenshot: naming the job and selecting the source CMS]

Step 4: You will see a green check mark adjacent to the source name if you entered the credentials correctly.

Then, select 'Output to LCMBIAR File' as the destination as shown below, and click the Create button.
[Screenshot: 'Output to LCMBIAR File' selected as the destination]

Step 5: It will load all the objects which can be promoted. Go to User Groups as shown below and check all the groups you want to promote to the Production environment. Then click the 'Add and Close' button.

[Screenshot: User Groups selected for promotion]

Step 6: Now the User Groups are ready to promote. Click the Promote button as shown below.
[Screenshot: Promote button]
Step 7: Select the source system and enter the CMS credentials. Then select 'Output to LCMBIAR file', click the Export button as shown below, and save the LCMBIAR file to your local machine. You have now successfully exported the User Groups.

[Screenshot: exporting to an LCMBIAR file]
Step 8: Now, log in to the CMC of your Production environment, into which you want to import these User Groups, and click on the Promotion Management tab.
[Screenshot: Promotion Management tab]

Step 9: Right-click on the Promotion Jobs folder and then select Import file as shown below. Select the LCMBIAR file which you saved in the earlier step from your local machine.
[Screenshot: Import file option]

Step 10: Then, as the destination, select 'Login to a New CMS' and enter your system name, user and password. Then click the Create button as shown below.
[Screenshot: destination CMS login details]
Step 11: You are now ready to promote the User Groups from DEV to Production. Click the Promote button as shown below.
[Screenshot: Promote button]

So that's it. You can go to the CMC home page and click on Users and Groups to verify that everything was successfully promoted to the new environment.

This is how you can export existing User Groups from one environment to a new one.

Quick Tips for Job Performance Optimization in BODS

  • Ensure that most of the dataflows are optimized. Maximize push-down operations to the database as much as possible. You can check the optimized SQL using the option shown below inside a dataflow; for a full pushdown, the SQL should start with an INSERT INTO ... SELECT statement.
[Screenshot: Display Optimized SQL option inside the dataflow]
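
For illustration, here is a rough sketch of what the optimized SQL of a fully pushed-down dataflow looks like (the table and column names are hypothetical):

INSERT INTO ADMIN.CUSTOMER_SUMMARY (CUST_ID, TOTAL_AMOUNT)
SELECT CUST_ID, SUM(AMOUNT)
FROM ADMIN.SALES_ORDER
GROUP BY CUST_ID

If the optimized SQL shows only a SELECT statement, the dataflow is only partially pushed down and the Data Services engine performs the remaining work.
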
  • Split complex logic in a single dataflow into multiple dataflows if possible. This is much easier to maintain in the future, and most of the dataflows can then be pushed down.

  • If full pushdown is not possible in a dataflow, then enable the bulk loader on the target table. Double-click the target table to enable the bulk loader as shown in the diagram below. Bulk loading is much faster than a direct load.

[Screenshot: bulk loader options on the target table]
  • Right-click the datastore, select Edit, then go to the Advanced options and edit them. Change 'Ifthenelse support' to 'Yes'. Note that by default this is set to 'No' in BODS. This will push down all the decode and ifthenelse functions used in the job, as sketched below.
[Screenshot: Ifthenelse support option in the datastore's advanced settings]
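
As a sketch (the column names are hypothetical), with 'Ifthenelse support' enabled, a Query transform mapping such as the following can be translated by the optimizer into a CASE expression in the generated SQL instead of being evaluated row by row in the engine:

# Mapping expression in the Query transform
ifthenelse(STATUS_CODE = 'A', 'Active', 'Inactive')

# Roughly the pushed-down equivalent (sketch):
# CASE WHEN STATUS_CODE = 'A' THEN 'Active' ELSE 'Inactive' END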

  • Index creation on key columns: If you are joining more than one table, ensure that the tables have indexes created on the columns used in the WHERE clause; this drastically improves performance (see the example below). Define primary keys while creating the target tables in DS. In most databases, indexes are created automatically if you define the keys in your Query transforms. Therefore, define primary keys in the Query transform itself when you first create the target table; this way you can avoid manual index creation on the table.
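
For example, assuming a hypothetical ORDERS table joined on CUST_ID and filtered on ORDER_DATE, the index would be created on the source database itself:

CREATE INDEX IDX_ORDERS_CUST_DATE ON ADMIN.ORDERS (CUST_ID, ORDER_DATE);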

  • Select Distinct: In BODS, 'Select Distinct' is not always pushed down. It is pushed down only when you check the 'Select Distinct' option in the Query transform just before the target table. So if you need a distinct result set, apply it in the last Query transform.

  • Order By and Group By are not always pushed down in BODS. They are pushed down only when there is a single Query transform in the dataflow.

  • Avoid data type conversions, as they prevent a full pushdown. Validate the dataflow and ensure there are no warnings.

  • Parallel execution of dataflows or workflows: Ensure that workflows and dataflows are not executing in sequence unnecessarily. Run them in parallel wherever possible.

  • Avoid parallel execution of Query transforms in a dataflow, as it prevents a full pushdown. If the same set of data is required from a source table, use another instance of the same table as a source.

  • Join rank: Assign a higher join rank value to the larger table. Open the Query editor where the tables are joined. In the diagram below, the second table has millions of records, so it has been assigned the higher join rank; the highest number gets the highest rank. This improves performance.

[Screenshot: join rank values assigned in the Query editor]
  • Database links and linked datastores: Create database links if you are using more than one database for source and target tables (multiple datastores), or if you are using different database servers. You can refer to my other article on how to create a DB link.

  • Use of joins in place of lookup functions: Use the lookup table as a source table and set an outer join in the dataflow instead of using lookup functions. This technique has an advantage over the lookup functions, as it pushes the execution of the join down to the underlying database; it also makes the dataflow much easier to maintain. See the sketch below.
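
As a sketch (hypothetical table and column names), when the lookup table is joined as a source with an outer join, a full pushdown can generate a single statement such as:

SELECT O.ORDER_ID, O.CUST_ID, C.CUST_NAME
FROM ADMIN.SALES_ORDER O
LEFT OUTER JOIN ADMIN.CUSTOMER C ON O.CUST_ID = C.CUST_ID

whereas a lookup function is evaluated by the Data Services engine and prevents this part of the dataflow from being pushed down.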

Hope this will be useful.

Custom function to get database name from a datastore in DS

Getting the database name from a datastore can be useful if you have configured more than one datastore in your project. It helps you avoid code changes while migrating objects from one environment to another during code promotion.

Below is the step-by-step procedure to create a custom function that gets the database name from a datastore, and to call it in a batch job.

1) Go to your Local Object Library, choose Custom Functions, then right-click on Custom Function and select New.

2) Enter the name of the function as 'CF_DATABASE_NAME'.

3) Enter the below line of code inside the editor.

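The original post shows the function body only as a screenshot; a minimal sketch of the likely one-line body follows, assuming the built-in db_database_name() function (which returns the database name of the datastore's current configuration, for databases such as SQL Server or Sybase ASE):

# Return the database name configured for the datastore whose name is passed in
Return db_database_name($P_Datastore);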

Then declare an input parameter named $P_Datastore of type varchar(64). Validate the function and, if no errors are found, save it.

4) Create a global variable at the job level in any test batch job you have created, and name this global variable $G_Database of type varchar(64).

5) Call this function in a script in your batch job and use this global variable wherever required in your job.

You can call the function in a script as shown below. You simply need to pass the name of your datastore in single quotes.
$G_Database = CF_DATABASE_NAME('datastore_name');

Example of a practical use:

sql('CONV_DS', 'TRUNCATE TABLE ' || $G_Database || '.ADMIN.CONV_CUSTOMER_ADDRESS');

The above code reads the database name at run time from the global variable and truncates the table before the job loads it. It can be used unchanged in any environment when you promote your code; if the database name were hard-coded instead, you would end up updating the code in every new environment.

Some cool options in SAP DS

I have found a couple of cool options in SAP DS and apply them in almost all the projects I work on. You may also give them a try if you have not already; hope you like them. You can see all these options in the Designer.

Monitor Sample Rate:
Right-click the job > click Properties > then click Execution Options.

You can change the value of the monitor sample rate here, and every time you execute the job it will pick up the latest value set.

Setting this value to a higher number improves performance, and you need not enter the value each time you execute the job. The frequency at which the Monitor log refreshes the statistics is based on this monitor sample rate: with a higher rate, Data Services collects more data before calling the operating system to write to the log file, reducing the number of OS calls and improving performance. The default value is 5; the maximum value you can set is 64000.

Refer to the screenshot below.

[Screenshot: Monitor sample rate in the job's Execution Options]


Click on the Designer menu bar and select Tools > Options (see the diagram below). There are a couple of cool options available here which can be used in your project. Note that if you change any option from here, it will apply to the whole environment.

[Screenshot: Tools > Options menu in the Designer]

Once selected, go to:
Designer > General > View data sampling size (rows)
Refer to the screenshot below. You can increase this value if you want to see a larger number of records while viewing data in BODS. The sample size can be controlled from here.

[Screenshot: View data sampling size option]
Designer > General > Perform complete validation before job execution
Refer to the screenshot below. I prefer to set this option here, as I then need not worry about validating the job manually before executing it. If you are testing a job and there is a chance of syntax errors, I recommend setting this beforehand; it will save some time. Check this option if you want to enable it.

[Screenshot: Perform complete validation before job execution option]

Designer > General > Show dialog when job is completed
Refer to the screenshot below. This is also one of the cool options available in the Designer: it makes the program open a dialog box when a job completes, so you need not check the monitor log manually for each job. I love this option.
[Screenshot: Show dialog when job is completed option]

Designer > Graphics
Refer to the screenshot below. Using these options you can change the line type to your liking. I personally like Horizontal/Vertical, as all the transforms look cleaner inside the dataflow. You can also change the color scheme, background, etc.

[Screenshot: Graphics options]

Designer > Fonts
See the dialog box below. Using this option, you can change the font size.

[Screenshot: Fonts options]

Do feel free to add to this list if you have come across more cool stuff in SAP DS.