Oracle Data Integrator 11g Performance Bottlenecks

Blog | Apr 4, 2017




Introduction:

This blog aims to help Oracle Data Integrator (ODI) designers, administrators, and system teams analyze and address performance bottlenecks in ODI execution plans. Please note that however fast the execution plans may be running, the end user will always want them to be faster and more agile. Let’s take a look at the reasons for the performance bottlenecks.

Performance bottlenecks can occur for eight primary reasons, listed in order of priority below:

  1. Network Latency
  2. Temp memory allocation
  3. Application memory allocation
  4. Sequential vs Parallel processing of Fact Groups
  5. Source DB Connection Type
  6. Optimal usage of Indexes
  7. Reduce Data Movement
  8. Execution location of transformations
     

Network Latency

The Batch Update/Array Fetch configuration is a compromise between network overhead and agent overhead. With a fast, highly available network, you can keep low values (fewer than 30, for example). With a poor network, use larger values (100 or more) to reduce the number of round trips.

Two parameters help tune this data flow (see the sketch after this list):

  1. Array fetch in the source data server definition
  2. Batch update in the target data server definition
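
These two settings correspond to standard JDBC fetch and batch sizes. The sketch below (plain JDBC in Java, with hypothetical hosts, credentials, tables, and columns) illustrates the trade-off the agent makes when it applies an array fetch size on the source and a batch update size on the target; it is an illustration only, not ODI agent code.

  // Illustration only: connection details, tables and columns are hypothetical.
  import java.sql.*;

  public class FetchBatchSketch {
      public static void main(String[] args) throws SQLException {
          try (Connection src = DriverManager.getConnection(
                   "jdbc:oracle:thin:@//src-host:1521/SRCPDB", "src_user", "src_pwd");
               Connection tgt = DriverManager.getConnection(
                   "jdbc:oracle:thin:@//tgt-host:1521/TGTPDB", "tgt_user", "tgt_pwd")) {

              tgt.setAutoCommit(false);

              // "Array fetch": rows pulled from the source per network round trip.
              PreparedStatement sel = src.prepareStatement(
                  "SELECT order_id, amount FROM src_orders");
              sel.setFetchSize(100);   // larger value = fewer round trips on a slow network

              // "Batch update": rows sent to the target per execution.
              PreparedStatement ins = tgt.prepareStatement(
                  "INSERT INTO stg_orders (order_id, amount) VALUES (?, ?)");
              int batchSize = 100, pending = 0;

              try (ResultSet rs = sel.executeQuery()) {
                  while (rs.next()) {
                      ins.setLong(1, rs.getLong(1));
                      ins.setBigDecimal(2, rs.getBigDecimal(2));
                      ins.addBatch();
                      if (++pending % batchSize == 0) {
                          ins.executeBatch();   // flush one batch to the target
                      }
                  }
                  ins.executeBatch();           // flush the remaining rows
                  tgt.commit();
              }
          }
      }
  }

Small values keep agent memory low and the flow agile; large values cut the number of network round trips.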

Temp Memory Allocation

Temp space is used when aggregating data, running lookup queries, or executing sessions in parallel.

For optimized performance, the temp space should be equal in size to the tablespace.

Application Memory Allocation

Application memory allocation refers to WebLogic memory. If data flows through the agent, it should not be installed on a machine with no spare resources. In Oracle Data Integrator 11g, you can adjust the ODI_INIT_HEAP and ODI_MAX_HEAP parameters in the odiparams configuration file to define the agent's JVM initial and maximum heap size.


Default values:

  ODI_INIT_HEAP: 32 MB
  ODI_MAX_HEAP: 256 MB
  Max no. of connections: 10

Recommended values:

  ODI_INIT_HEAP: 512 MB
  ODI_MAX_HEAP: 4096 MB
  Max no. of connections: 20

ODI_MAX_HEAP can be increased to approximately half the size of the RAM available on the server on which the ODI agent is running.
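
For reference, the recommended heap values above would be set roughly as follows in odiparams.sh (odiparams.bat on Windows uses set commands instead). Treat this as a sketch: the exact layout of the file varies between ODI 11g releases.

  # Excerpt from odiparams.sh (sketch); surrounding content differs by ODI 11g release.
  ODI_INIT_HEAP=512m     # initial JVM heap for the standalone agent
  ODI_MAX_HEAP=4096m     # maximum JVM heap; at most about half the server RAM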

Sequential vs Parallel processing of Fact Groups

When a load plan runs with the serial steps for fact groups, the individual task run times are lower, but the total time for load plan completion is higher. If the load plan is instead executed with the out-of-the-box (OOTB) parallel steps for fact groups, the individual task run times are higher but the total time is lower. With the number of concurrent connections for the schema user increased, the total load time is lower with parallel steps than with serial steps, as the toy sketch below illustrates.
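
The trade-off can be seen with a toy Java sketch (not ODI or load plan code): each simulated fact-group task takes about the same time, yet the wall-clock time for the whole set drops when the tasks run in parallel, assuming the database accepts enough concurrent connections.

  // Toy illustration of serial vs parallel execution of independent tasks.
  import java.util.*;
  import java.util.concurrent.*;

  public class SerialVsParallelSketch {
      // Pretend each fact group load takes about one second.
      static void loadFactGroup(String name) {
          try { Thread.sleep(1000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
          System.out.println("Loaded " + name + " on " + Thread.currentThread().getName());
      }

      public static void main(String[] args) throws Exception {
          List<String> factGroups = Arrays.asList("SALES", "INVENTORY", "AR", "AP");

          long t0 = System.currentTimeMillis();
          factGroups.forEach(SerialVsParallelSketch::loadFactGroup);   // serial: roughly 4 s total
          System.out.println("Serial total ms: " + (System.currentTimeMillis() - t0));

          ExecutorService pool = Executors.newFixedThreadPool(4);      // parallel: roughly 1 s total
          long t1 = System.currentTimeMillis();
          List<Future<?>> futures = new ArrayList<>();
          for (String fg : factGroups) {
              futures.add(pool.submit(() -> loadFactGroup(fg)));
          }
          for (Future<?> f : futures) f.get();                         // wait for all tasks
          System.out.println("Parallel total ms: " + (System.currentTimeMillis() - t1));
          pool.shutdown();
      }
  }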

Source DB Connection Type

ODI has two options for connecting to the source database for SDE loads: JDBC connections and DB links. A comparative study of SDE load execution times for JDBC connections and DB links showed that DB links provide better performance.
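
One reason DB links tend to win for SDE loads is that the data can move in a single set-based SQL statement executed inside the database, instead of being streamed row by row through the agent over JDBC. A hedged sketch follows; the DB link, tables, and connection details are hypothetical, and this is not the SQL that ODI itself generates.

  // Illustration only: SRC_DBLINK, the tables and the connection details are hypothetical.
  import java.sql.*;

  public class DbLinkSketch {
      public static void main(String[] args) throws SQLException {
          try (Connection tgt = DriverManager.getConnection(
                   "jdbc:oracle:thin:@//tgt-host:1521/TGTPDB", "tgt_user", "tgt_pwd");
               Statement st = tgt.createStatement()) {
              // The whole extract runs inside the target database, pulling rows over the DB link;
              // no data flows through the ODI agent's JVM.
              int rows = st.executeUpdate(
                  "INSERT INTO stg_orders (order_id, amount) " +
                  "SELECT order_id, amount FROM src_orders@SRC_DBLINK");
              System.out.println("Rows moved over the DB link: " + rows);
          }
      }
  }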
 

Optimal usage of Indexes

Optimizing source indexes may help enhance load performance.

  1. Analyze the explain plan for SDE queries (see the sketch after this list)
  2. Check whether indexes on the source tables are being used
  3. Look for optimization opportunities (creation or modification of indexes)
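
A quick way to check whether an SDE query uses the expected indexes is to look at its explain plan. The sketch below runs EXPLAIN PLAN and prints the result via DBMS_XPLAN; the query, table, and connection details are hypothetical stand-ins for the SQL generated by the SDE interface.

  // Illustration only: the SDE-style query and connection details are hypothetical.
  import java.sql.*;

  public class ExplainPlanSketch {
      public static void main(String[] args) throws SQLException {
          try (Connection src = DriverManager.getConnection(
                   "jdbc:oracle:thin:@//src-host:1521/SRCPDB", "src_user", "src_pwd");
               Statement st = src.createStatement()) {

              // Generate the plan for a representative incremental-extract query.
              st.execute("EXPLAIN PLAN FOR " +
                         "SELECT order_id, amount FROM src_orders " +
                         "WHERE last_update_date >= SYSDATE - 1");

              // Print the plan; look for full table scans where an index range scan was expected.
              try (ResultSet rs = st.executeQuery(
                       "SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY())")) {
                  while (rs.next()) {
                      System.out.println(rs.getString(1));
                  }
              }
          }
      }
  }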

 

Reduce Data Movement 

If a mapping aggregates a large amount of source data to generate a small data flow to a remote target, consider locating the staging area on the source side.


Execution Location of Transformations 

When filtering source data, execute the filters on the source servers to reduce the data flow from the source to the staging area.

When joining source tables:

  1. If the source set expected from the join is smaller than the sum of the two sources, the join should be performed on the source.
  2. If the source set expected from the join is larger than the sum of the two sources (for example, in the case of a cross join), the join should be performed on the staging area.

Other Performances Tuning Tips 

Pay attention when setting up the physical architecture: if the staging and target areas are on the same server, make sure you select the correct data schema and work schema for the tasks.

Enable archival logging on the system and regularly purge logs and archive logs. This helps clean up the metadata tables that consume the most storage, thereby reducing the start time and execution time of scenarios.

Conclusion:
I hope you found this blog valuable and that it helped clarify the techniques for addressing performance bottlenecks. Feel free to reach out to me with ideas or suggestions by clicking below. You can also leave a comment in the field below.

Ask Sivarama