Ssis Interview Questions

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 82

1. In BIDS, the SSIS project contain 10 packages.

But How to deploy only 5


packages in Destination machine eventhough Manifest file contains all 10
packages after the Build?
-Open the Manifest file in any editor like BIDS or Notepad, keep the required 5 packages,
remove remaining 5 packages.
-Save and Close the Manifest file.
-Double click on Manifest file to deploy the required 5 package.

2. How would you pass a Parent variable value to Child Package?


We can pass the parent variable to child package by using Package configuration:
Parent Package Variable.
2.1 Create parent variable
FilePath - Parent - String - C:\RK\file.txt
2.2 Drag an 'Execute Package Task' in Control Flow and configure it to start
child
package.

2.3 Go to Child Package and create a variable with same name as your parent package
variable.
2.4 Add Package configurations
2.5 "Enable Package Configuration", choose Configuration type as "Parent Package
Variable" and type the name of the variable.

2.6 click 'Next' button and select the 'Value' property of the child package variable.

2.7 click 'Next' and 'OK' buttons


2.8 To test the package, I added sample Script Task with a messageBox to show the
value of the parent package.
3. My Source Table data as follows:

Output Should be as follows:

How to Implement?
Designed SSIS package like:

The script component code:

4.My Source Table Data as follows:

Output should be as follows:

How to Implement?

5. The data in the Flat File as follows:


"132","Ramakrishna"," " ,"Hyderabad"
"132","Radhika","17","Vangara"
How to remove double quotes from the file to process the data.
In the Flat File Connection Manager Editor, Enter double quotes in Text Qualifier text box:

6. Will trigger fire when inserting data through SSIS package?


1.In the data flow task, go to the Advanced Editor of OLEDB Destination, and there
should be a property "FastLoadOptions". Specify FIRE_TRIGGERS as an additional
option.

2. SQL Destination Editor:

3. Bulk Insert Task Editor:

7. What are the different types of Transformations you have worked


AGGEGATE -The Aggregate transformation applies aggregate functions to column values
and copies the results to the transformation output. Besides aggregate functions, the
transformation provides the GROUP BY clause, which you can use to specify groups to
aggregate across.
The Aggregate Transformation supports following operations:
Group By, Sum, Average, Count, Count Distinct, Minimum, Maximum
AUDIT - Adds Package and Task level Metadata - such as Machine Name, Execution
Instance, Package Name, Package ID, etc..
CHARACTER MAP - When it comes to string formatting in SSIS, Character Map
transformation is very useful, used to convert data lower case, upper case.
CONDITIONAL SPLIT used to split the input source data based on condition.
COPY COLUMN - Add a copy of column to the output, we can later transform the copy
keeping the original for auditing.
DATA CONVERSION - Converts columns data types from one to another type. It stands
for Explicit Column Conversion.
DATA MINING QUERY Used to perform data mining query against analysis services and
manage Predictions Graphs and Controls.
DERIVED COLUMN - Create a new (computed) column from given expressions.
EXPORT COLUMN Used to export a Image specific column from the database to a flat
file.
FUZZY GROUPING Groups the rows in the dataset that contain similar values.
FUZZY LOOKUP - Used for Pattern Matching and Ranking based on fuzzy logic.

IMPORT COLUMN - Reads image specific column from database onto a flat file.
LOOKUP - Performs the lookup (searching) of a given reference object set against a data
source. It is used to find exact matches only.
MERGE - Merges two sorted data sets of same column structure into a single output.
MERGE JOIN - Merges two sorted data sets into a single dataset using a join.
MULTI CAST - is used to create/distribute exact copies of the source dataset to one or
more destination datasets.
ROW COUNT - Stores the resulting row count from the data flow / transformation into a
variable.
ROW SAMPLING - Captures sample data by using a row count of the total rows in
dataflow specified by rows or percentage.
UNION ALL - Merge multiple data sets into a single dataset.
PIVOT Used for Normalization of data sources to reduce analomolies by converting
rows into columns
UNPIVOT Used for demoralizing the data structure by converts columns into rows
incase of building Data Warehouses.
8. What are the different types of Transaction Options
Required: If a transaction already exists at the upper level, the current executable will
join the transaction. If No transaction at the upper level, a new transaction is created
automatically.
Supported:In any executable, if there is a transaction at upper level, the executable join
the transaction else do not create a new transaction.
Not Supported:The executable of the package do not honour any transaction ie do not
join other transaction nor creates new transaction.
9. Explain about Checkpoints with properties
Checkpoint is used to restart the package execution from the point of failure rather than
from initial start.
Set the following Properties:
CheckpointFileName: Specifies the name of the checkpoint file.
CheckpointUsage: Never, IfExists, Always
SaveCheckpoints: indicates whether the package needs to save checkpoints. This
property must be set to True to restart a package from a point of failure.
FailPackageOnFailure: property needs to be set to True for enabling the task in the
checkpoint.
Checkpoint mechanism uses a Text File to mark the point of package failure.
These checkpoint files are automatically created at a given location upon the package
failure and automatically deleted once the package ends up with success.

10. How to execute SSIS Package from Stored Procedure.


using xp_cmdshell command

11. How to enable Xp_CmdShell in Sql Server?


We can enable through either T-Sql or SQL Server Surface Area Configuration.
-- To allow advanced options to be changed.
EXEC sp_configure 'show advanced options', 1
GO-- To update the currently configured value for advanced options.RECONFIGURE
GO
-- To enable the feature.
EXEC sp_configure 'xp_cmdshell', 1
GO
-- To update the currently configured value for this feature.
RECONFIGURE
GO
12. Package configuration? Different types of Configuration Files
The package can be transferred across various environments like development and unit
testing, system testing, UAT and production. Most packages will have environment
specific variables like connection string to a database or path to a flat file, or user
defined variables etc. that would be impacted while moving the package across
environments as part of deployment process. Hence, it is mandatory to change these
environment dependent variables when the package is transferred across environments.
Package configurations help in managing such changes without actually opening and
editing the SSIS package in Business Intelligence Development Studio (BIDS).
After deploying the package to a different machine (using SQL Server or file system
deployment mode) it is mandatory to copy the related package configuration files on to
that machine. If the package is scheduled to run through a SQL Agent job,
the configuration file should be added while creating the job so that package will read
the information from the configuration file. While executing the job, SQL Agent will take
the design time values for connection strings if the package configuration file is not
supplied.
There are 5 configuration types available with package configurations.

13. Logging. Different types of Logging files


Logging is used to log the information during the execution of package.
A log provider can be a text file, the SQL Server Profiler, a SQL Server relational
database, a Windows event log, or an XML file. If necessary, you can define a custom log
provider (e.g., a proprietary file format).

14. How to debug a package


For debugging a package, we have 3 options:
-Setting breakpoints in a package, container or Task
-Using Data Viewer in the Data flow task
-Setting Error redirection in the Data Flow task
15. What is the LoggingMode property?
SSIS packages, tasks and containers have a property called LoggingMode. This property
accepts 3 possible values:
Enabled - to enable logging of that component
Disabled - to disable logging of that component
UseParentSetting - to use parent's setting of that component to decide whether or not to
log the data.
16. Error handling in SSIS package
I have created a package like below:

Select 'Load to sql Table' Data flow Task. Navigate to 'Even Handlers' Tab.

Drag and Drop 'Execute Sql Task'. Open the Execute Sql Task Editor and in Parameter
Mapping' section, select the system variables as follows:

Create a table in Sql Server Database with Columns as: PackageID, PackageName,
TaskID, TaskName, ErrorCode, ErrorDescription.

The package will be failed during the execution.

The error information is inserted into Table.

17. How to configure Error Output in SSIS


We have 3 options to configure error output in ssis.
1. Ignore Failure
2. Redirect Row
3. Fail Component
For Example:

I have comma-separated value in a flat file with two columns (code,


Name). Code is an integer value and name is a varchar(20) data type
configured in the flat file connection manager. Some of the codes in the
flat files are characters. So, flat file reader component will fail reading the
character value. But, I want to redirect the error data to separate table.

18. What is Ignore Failure option in SSIS?


In Ignore Failure option, the error will be ignored and the data row will be directed to
continue on the next transformation. Lets say you have some JUNK data(wrong type of
data ) flowing from source, then using this option in SSIS we can REDIRECT the junk
data records to another transformation instead of FAILING the package. This helps to
MOVE only valid data to destination and JUNK can be captured into separate file.
19. Incremental Load in SSIS
Using Slowly Changing Dimension
Using Lookup and Cache Transformation
20. How to migrate Sql server 2005 Package to 2008 version
1. In BIDS, by right click on the "SSIS Packages" folder of an SSIS project and
selecting "Upgrade All Packages".
2. Running "ssisupgrade.exe" from the command line (default physical location
C:\Program Files\Microsoft SQL Server\100\DTS\Bin folder).
3. If you open a SSIS 2005 project in BIDS 2008, it will automatically launch the SSIS

package upgrade wizard.


21. Difference between Synchronous and Asynchronous Transformation
Synchronous T/F process the input rows and passes them onto the data flow one row at
a time.
When the output buffer of Transformation created a new buffer, then it is Asynchronous
transformation. Output buffer or output rows are not sync with input buffer.
22. What are Row Transformations, Partially Blocking Transformation, Fully
Blocking Transformation with examples.
In Row Transformation, each value is manipulated individually. In this transformation,
the buffers can be re-used for other purposes like following:
OLEDB Datasource,
OLEDB Data Destinations
Other Row transformation within the package,
Other partially blocking transformations
within the package.
examples of Row Transformations: Copy Column, Audit, Character Map
Partially Blocking Transformation:
These can re-use the buffer space allocated for available Row transformation and get
new buffer space allocated exclusively for Transformation.
examples: Merge, Conditional Split, Multicast, Lookup, Import, Export Column
Fully Blocking Transformation:
It will make use of their own reserve buffer and will not share buffer space from other
transformation or connection manager.
examples: Sort, Aggregate, Cache Transformation
23. Difference between FileSystem and Sql server Deployment
File System Deployment: We can save the package on a physical location on hard drive
or any shared folder with this option, and we should provide a full qualified path to
stored package in the FileSystem option.
Sql Server Deployment: SSIS packages will be stored in the sysssispackages table of
MSDB Database.
24. Difference between Lookup and Fuzzy Lookup transformation
Lookup Transformation finds the exact match.
Fuzzy Lookup transformation matches input table with reference table. It finds the
closest match and indicates the quality of the match.

25. Difference between Full Cache and Partial Cache


Partial Cache: The lookup cache starts off empty at the beginning of the data flow. When
a new row comes in, the lookup transform checks its cache for the matching values. If
no match is found, it queries the database. If the match is found at the database, the
values are cached so they can be used the next time a matching row comes in.
Full Cache: The default cache mode for lookup is Full cache. The database is queried
once during the pre-execute phase of the data flow. The entire reference set is pulled
into memory. This approach uses most of the memory. Caching takes place before any
rows are read from the data flow source. Lookup operations will be very fast during
execution.

26. Cache Transformation


Cache Transformation: to cache the data used in the Lookup transform.
When to use Full cache mode:
When you're accessing a large portion of your reference set
When you have a small reference table
When your database is remote or under heavy load, and you want to reduce the number
of queries sent to the server
When to use Partial cache mode:
When you're processing a small number of rows and it's not worth the time to charge the
full cache.
When you have a large reference table.
When your data flow is adding new rows to your reference table.
When you want to limit the size of your reference table by modifying query with
parameters from the data flow.
27. Explain Slowly Changing Dimesnion
The SCD concept is basically about how the data modifications are absorbed and
maintained in a Dimension Table.
The new (modified) record and the old record(s) are identified using some kind of a flag
like say IsActive, IsDeleted etc. or using Start and End Date fields to
indicate the validity of the record.
Types:
Type 1 - update the columns in the dimension row without preserving any change
history.
Type 2 - preserve the change history in the dimension table and create a new row
when there are changes.
Type 3 - some combination of Type 1 and Type 2, usually maintaining multiple instances
of a column in the dimension row; e.g. a current value and one or more previous values.
28. Different types of File Enumerators

Foreach ADO:
The ADO Enumerator enumerates rows in a table. For example, we can get the rows in
the ADO records.The variable must be of Object data type.
Foreach ADO.NET Schema Rowset:
The ADO.Net Enumerator enumerates the schema information. For example, we can get
the table from the database.

Foreach File:
The File Enumerator enumerates files in a folder. For example, we can get all the files
which have the *.txt extension in a windows folder and its sub folders.
Foreach From Variable:
The Variable Enumerator enumerates objects that specified variables contain. Here
enumerator objects are nothing but an array or data table.
Foreach Item:
The Item Enumerator enumerates the collections. For example, we can enumerate the
names of executables and working directories that an Execute Process task uses.
Foreach Nodelist:
The Node List Enumerator enumerates the result of an XPath expression.
Foreach SMO:
The SMO Enumerator enumerates SQL Server Management Objects (SMO). For example,
we can get the list of functions or views in a SQL Server database.
29. How to execute the package from .NET?
We need a reference to Microsoft.SqlServer.ManagedDts.dll to call a package.
using Microsoft.SqlServer.Dts.Runtime
Application app = new Application();
Package package = null;
package = app.LoadPackage(@"C:\Program Files\Microsoft SQL
Server\100\DTS\Packages\Integration Services Project2\Package.dtsx", null);
Microsoft.SqlServer.Dts.Runtime.DTSExecResult results = package.Execute();
30. How to schedule a package (Role of Sql Server Agent)
In order for the job to run successfully, the SQL Server agent should be running on the
target machine.
We can start the SQL Server Agent Services in numerous ways like:

Starting SQL Server Agent Service from Command Line


Starting SQL Server Agent Service from Services.MSC console
Starting SQL Server Agent Service using SQL Server Configuration Manager
Starting SQL Server Agent Service using SQL Server Management Studio (SSMS)
31. What are containers? (For loop, Sequence Container)
SSIS Containers are controls that provide structure to SSIS packages. Containers
support repeating control flows in packages and they group tasks and containers into
meaningful units of work. Containers can include other containers in addition to tasks.
Container
Type

Container Description

Purpose of SSIS Container


To repeat tasks for each element in a
collection, for example retrieve files from
a folder, running T-SQL statements that
reside in multiple files, or running a
command for multiple objects.

Foreach
Loop
Container

This container runs a Control


Flow repeatedly using an
enumerator.

For Loop
Container

This container runs a Control


To repeat tasks until a specified
Flow repeatedly by checking
expression evaluates to false. For
conditional expression (same as example, a package can send a different
For Loop in programming
e-mail message seven times, one time

Sequence
Container

language).

for every day of the week.

Groups tasks as well as


containers into Control Flows
that are subsets of the package
Control Flow.

This container group tasks and containers


that must succeed or fail as a unit. For
example, a package can group tasks that
delete and add rows in a database table,
and then commit or roll back all the tasks
when one fails.

32. What are precedence constraints


A task will only execute if the condition that is set by the precedence
constraint preceding the task is met. By using these constraints,it will choose different
execution paths depending on the success or failure of other tasks.
Success Workflow will proceed when the preceding container executes successfully.
Indicated in control flow by a solid green line.
Failure Workflow will proceed when the preceding containers execution results in a
failure. Indicated in control flow by a solid red line.
Completion Workflow will proceed when the preceding containers execution completes,
regardless of success or failure. Indicated in control flow by a solid blue line.
Expression/Constraint with Logical AND Workflow will proceed when specified
expression and constraints evaluate to true. Indicated in control flow by a solid color line
along with a small fx icon next to it. Color of line depends on logical constraint chosen
(e.g. success=green, completion=blue).
Expression/Constraint with Logical OR Workflow will proceed when either the specified
expression or the logical constraint (success/failure/completion) evaluates to true.
Indicated in control flow by a dotted color line along with a small fx icon next to it.
Color of line depends on logical constraint chosen (e.g. success=green,
completion=blue).
33. Performance Optimization in SSIS
1. Avoid Asynchronous Transformation (Sort T/F) wherever possible. Sort T/F required all
the incoming rows to be arrivaed before start processing. Instead of using Sort T/F, we
get sorted rows from datasource using ORDER By clause.
2. Pulling High Volumes of Data
Drop all Non-Clustered Indexes and Clustered Index if exists, then Transfer and load the
data into Destination Table. Create Clustered Index and Non-clustered indexes.
3. Avoid SELECT *
DataFlow Task uses buffer oriented architecture for data transfer and transformation.
When data transfer from Source to Destination, the data first comes into the buffer,
required transformations are done in the buffer itself and then written to Destination.
The size of buffer depends on the estimated row size. The estimated row size is equal to
the maximum size of all columns in the row. So the more columns in a row means less
number of rows in a buffer. Hence select only those columns which are required at the
destination.
Even if we need all the columns from source, we should use the column name specifically
in the SELECT statement, otherwise it takes another round for the source to gather
meta-data about the columns when u are using SELECT *.

4. Effect of OLEDB Destination Settings


There are couple of settings with OLEDB destination which can impact the performance
of data transfer as listed below.
Data Access Mode This setting provides the 'fast load' option which internally uses a
BULK INSERT statement for uploading data into the destination table instead of a simple
INSERT statement (for each single row) as in the case for other options. So unless you
have a reason for changing it, don't change this default value of fast load. If you select
the 'fast load' option, there are also a couple of other settings which you can use as
discussed below.
Keep Identity By default this setting is unchecked which means the destination table (if
it has an identity column) will create identity values on its own. If you check this setting,
the dataflow engine will ensure that the source identity values are preserved and same
value is inserted into the destination table.
Keep Nulls Again by default this setting is unchecked which means default value will be
inserted (if the default constraint is defined on the target column) during insert into the
destination table if NULL value is coming from the source for that particular column. If
you check this option then default constraint on the destination table's column will be
ignored and preserved NULL of the source column will be inserted into the destination.
Table Lock By default this setting is checked and the recommendation is to let it be
checked unless the same table is being used by some other process at same time. It
specifies a table lock will be acquired on the destination table instead of acquiring
multiple row level locks, which could turn into lock escalation problems.
Check Constraints Again by default this setting is checked and recommendation is to
un-check it if you are sure that the incoming data is not going to violate constraints of
the destination table. This setting specifies that the dataflow pipeline engine will validate
the incoming data against the constraints of target table. If you un-check this option it
will improve the performance of the data load.
#5 - Effect of Rows Per Batch and Maximum Insert Commit Size Settings:
Rows per batch:
The default value for this setting is -1 which specifies all incoming rows will be treated as
a single batch. You can change this default behavior and break all incoming rows into
multiple batches. The allowed value is only positive integer which specifies the maximum
number of rows in a batch.
Maximum insert commit size:
The default value for this setting is '2147483647' (largest value for 4 byte integer type)
which specifies all incoming rows will be committed once on successful completion. You
can specify a positive value for this setting to indicate that commit will be done for those
number of records. Changing the default value for this setting will put overhead on the
dataflow engine to commit several times. Yes that is true, but at the same time it will
release the pressure on the transaction log and tempdb to grow specifically during high
volume data transfers.
The above two settings are very important to understand to improve the performance of
tempdb and the transaction log. For example if you leave 'Max insert commit size' to its
default, the transaction log and tempdb will keep on growing during the extraction
process and if you are transferring a high volume of data the tempdb will soon run out of

memory as a result of this your extraction will fail. So it is recommended to set these
values to an optimum value based on your environment.
#7 - DefaultBufferSize and DefaultBufferMaxRows :
The execution tree creates buffers for storing incoming rows and performing
transformations.
The number of buffer created is dependent on how many rows fit into a buffer and how
many rows fit into a buffer dependent on few other factors. The first consideration is the
estimated row size, which is the sum of the maximum sizes of all the columns from the
incoming records. The second consideration is the DefaultBufferMaxSize property of the
data flow task. This property specifies the default maximum size of a buffer. The default
value is 10 MB and its upper and lower boundaries are constrained by two internal
properties of SSIS which are MaxBufferSize (100MB) and MinBufferSize (64 KB). It
means the size of a buffer can be as small as 64 KB and as large as 100 MB. The third
factor is, DefaultBufferMaxRows which is again a property of data flow task which
specifies the default number of rows in a buffer. Its default value is 10000.
If the size exceeds the DefaultBufferMaxSize then it reduces the rows in the buffer. For
better buffer performance you can do two things.
First you can remove unwanted columns from the source and set data type in each
column appropriately, especially if your source is flat file. This will enable you to
accommodate as many rows as possible in the buffer.
Second, if your system has sufficient memory available, you can tune these properties to
have a small number of large buffers, which could improve performance. Beware if you
change the values of these properties to a point where page spooling (see Best Practices
#8) begins, it adversely impacts performance. So before you set a value for these
properties, first thoroughly testing in your environment and set the values appropriately.
#8 - How DelayValidation property can help you
SSIS uses two types of validation.
First is package validation (early validation) which validates the package and all its
components before starting the execution of the package.
Second SSIS uses component validation (late validation), which validates the
components of the package once started.
Let's consider a scenario where the first component of the package creates an object i.e.
a temporary table, which is being referenced by the second component of the package.
During package validation, the first component has not yet executed, so no object has
been created causing a package validation failure when validating the second
component. SSIS will throw a validation exception and will not start the package
execution. So how will you get this package running in this common scenario?
To help you in this scenario, every component has a DelayValidation (default=FALSE)
property. If you set it to TRUE, early validation will be skipped and the component will be
validated only at the component level (late validation) which is during package execution
9. Better performance with parallel execution
10. When to use events logging and when to avoid.

11. Monitoring the SSIS Performance with Performance Counters


Launch Performance Monitor:
1. Start -> All Programs -> Administrative Tools -> Performance
2. Load the SSIS related Counters
In the Performance Object, select SQL Server:SSIS Pipeline and SQL Server:SSIS
Service.
SSIS provide a set of performance counters. Among them, the following few are helpful
when you tune or debug your package:
Buffers in use
Flat buffers in use
Private buffers in use
Buffers spooled
Rows read
Rows written
Buffers in use, Flat buffers in use and Private buffers in use are useful to discover
leaks. During package execution time, we will see these counters fluctuating. But once
the package finishes execution, their values should return to the same value as what
they were before the execution. Otherwise, buffers are leaked.
Buffers spooled has an initial value of 0. When it goes above 0, it indicates that the
engine has started memory swapping. In a case like this, set Data Flow Task properties
BLOBTempStoragePath and BufferTempStoragePath appropriately for maximal I/O
bandwidth.
Buffers Spooled: The number of buffers currently written to the disk. If the data flow
engine runs low on physical memory, buffers not currently used are written to disk and
then reloaded when needed.
Rows read and Rows written show how many rows the entire Data Flow has
processed.
12. FastParse property
Fast Parse option in SSIS can be used for very fast loading of flat file data. It will speed
up parsing of integer, date and time types if the conversion does not have to be localesensitive. This option is set on a per-column basis using the Advanced Editor for the flat
file source.

13. Checkpoint features helps in package restarting


34. Upgrade DTS package to SSIS
1. In BIDS, from the Project Menu, select 'Migrate DTS 2000 Package'
2. In the Package Migration Wizard, choose the Source, Sql Server 2000 Server Name,
Destination folder.
3. Select the List of packages that needs to be upgraded to SSIS
4. Specifty the Log file for Package Migration.
35. Events in SSIS
OnError : Runs when a task or container reports an error.
OnExecStatusChanged : Runs for all tasks and containers when the execution status
changes to In Process, Success, or Failed.
OnInformation : Runs when SSIS outputs information messages during the validation
and execution of a task or container.
OnPostExecute : Runs after a container or task successfully completes.
OnPostValidate : Executes after a container or task has successfully been validated.
OnPreExecute : Runs just before a container or task is executed.
OnPreValidate: Runs before the component is validated by the engine.
OnProgress : Executed when a progress message is sent by the SSIS engine, indicating
tangible advancement of the task or container.
OnQueryCancel : Invoked when an Execute SQL Task is cancelled through manual
intervention, such as stopping the package.
OnTaskFailed : Similar to OnError, but runs when a task fails rather than each time an
error occurs.
OnVariableValueChanged: Runs when the value changes in a variable for which the
RaiseChangeEvent property is set to True.
OnWarning Runs: when a task returns a warning event such as a column not being used
in a data flow.

36. Difference between Control Flow and Data Flow


1.Control flow consists of one or more tasks and containers that execute when the
package runs. We use precedence constraints to connect the tasks and containers in a
package. SSIS provides three different types of control flow elements: Containers that
provide structures in packages, Tasks that provide functionality, and Precedence
Constraints that connect the executables, containers, and tasks into an ordered control
flow.
2.Control flow does not move data from task to task.
3.Tasks are run in series if connected with precedence or in parallel.
1. A data flow consists of the sources and destinations that extract and load data, the
transformations that modify and extend data, and the paths that link sources,
transformations, and destinations. The Data Flow task is the executable within the SSIS
package that creates, orders, and runs the data flow. Data Sources, Transformations,
and Data Destinations are the three important categories in the Data Flow.
2. Data flows move data, but there are also tasks in the control flow, as such, their
success or Failure effects how your control flow operates
3. Data is moved and manipulated through transformations.
4. Data is passed between each component in the data flow.
37. Different ways to execute SSIS package
1. Using the Execute Package Utility (DTEXECUI.EXE) graphical interface one can
execute an SSIS package that is stored in a File System, SQL Server or an SSIS Package
Store.
DTEXECUI provides a graphical user interface that can be used to specify the various
options to be set when executing an SSIS package. You can launch DTEXECUI by
double-clicking on an SSIS package file (.dtsx). You can also launch DTEXECUI from a
Command Prompt then specify the package to execute.
2. Using the DTEXEC.EXE command line utility one can execute an SSIS package
that is stored in a File System, SQL Server or an SSIS Package Store. The syntax to
execute a SSIS package which is stored in a File System is shown below.
DTEXEC.EXE /F "C:\BulkInsert\BulkInsertTask.dtsx"
3. Test the SSIS package execution by running the package from BIDS:
-In Solution Explorer, right click the SSIS project folder that contains the package which
you want to run and then click properties.
- In the SSIS Property Pages dialog box, select Build option under the Configuration
Properties node and in the right side panel, provide the folder location where you want
the SSIS package to be deployed within the OutputPath. Click OK to save the changes in
the property page.
-Right click the package within Solution Explorer and select Execute Package option from
the drop down menu
4. Sql Server Agent:
Drill down to the SQL Server Agent node in the Object Explorer. Right click on the Jobs
node and select New Job from the popup menu.

The first step to setting up the proxy is to create a credential (alternatively you could use
an existing credential). Navigate to Security then Credentials in SSMS Object Explorer
and right click to create a new credential
Navigate to SQL Server Agent then Proxies in SSMS Object Explorer and right click to
create a new proxy
38. How to execute a Stored Procedure from SSIS
using Execute SQL Task
39. How to deploy packages from one server to another server
1.To copy the deployment bundle
Locate the deployment bundle on the first server.
If you used the default location, the deployment bundle is the Bin\Deployment folder.
Right-click the Deployment folder and click Copy.
Locate the public share to which you want to copy the folder on the target computer and
click Paste.
2: Running the Package Installation Wizard
1. On the destination computer, locate the deployment bundle.
2. In the Deployment folder, double-click the manifest file,
Project1.SSISDeploymentManifest.
3. On the Welcome page of the Package Installation Wizard, click Next.
4. On the Deploy SSIS Packages page, select either File sytem or SQL Server
deployment option, select the "Validate packages after installation" check box, and
then click Next.
5. On the Specify Target SQL Server page, specify (local), in the Server name box.
6. If the instance of SQL Server supports Windows Authentication, select Use Windows
Authentication; otherwise, select Use SQL Server Authentication and provide a user
name and a password.
7. Verify that the "Rely on server storage for encryption" check box is cleared.
Click Next.
8. On the Select Installation Folder page, click Browse.
9. On the Confirm Installation page, click Next.
10. The wizard installs the packages. After installation is completed, the Configure
Packages page opens.

40. How to deploy a package

Right click on the Solution in Solution Explorer and choose properties in the
Menu.

When the build /rebuild is successful, navigate to the directory is referred


in DeploymentOutputPath
Deploying the Package:
Double click the Manifest File to start the deployment. The Package Installation wizard
begins and Deploy SSIS Packages step is the first screen that is presented.

This screen lets you select where shall the packages be deployed, as mentioned in the
Dialog Box, deploying in SQL Server is more secure, since SQL Server stores the
packages internally compared to File System where additional security measures needs
to taken to secure the physical

files.

41. What is the use of Percentage Sampling transformation in SSIS?

Percentage Sampling transformation is generally used for data mining. This


transformation builds a random sample of set of output rows by choosing specified
percentage of input rows. For example if the input has 1000 rows and if I specify 10
as percentage sample then the transformation returns 10% of the RANDOM records from
the input data.
42. What is the use of Term Extraction transformation in SSIS?
Term Extraction transformation is used to extract nouns or noun phrases or both noun
and noun phrases only from English text. It extracts terms from text in a transformation
input column and then writes the terms to a transformation output column. It can be
also used to find out the content of a dataset.
43. What is Data Viewer and what are the different types of Data Viewers in
SSIS?
A Data Viewer allows viewing data at a point of time at runtime.
The different types of data viewers are:
1. Grid
2. Histogram
3. Scatter Plot
4. Column Chart

44.
Difference
between
Merge
and
UnionAll
Transformations
The Union All transformation combines multiple inputs into one output. The
transformation inputs are added to the transformation output one after the other; no
reordering of rows occurs.

Merge Transformations combines two sorted data sets of same column structure into a
single output.The rows from each dataset are inserted into the output based on values in
their key columns.
The Merge transformation is similar to the Union All transformations. Use the Union All
transformation instead of the Merge transformation in the following situations:
-The Source Input rows are not need to be sorted.
-The combined output does not need to be sorted.

-when we have more than 2 source inputs.


45. Multicast, Conditional Split, Bulk Insert Tasks
Multicast Transformation is used to extract output from single source and places onto
multiple destinations.
Conditional Split transformation is used for splitting the input data based on a specific
condition. The condition is evaluated in VB Script.
Multicast Transformation generates exact copies of the source data, it means each
recipient will have same number of records as the source whereas the Conditional Split
Transformation divides the source data based on the defined conditions and if no rows
match with this defined conditions those rows are put on default output.
Bulk Insert Task is used to copy the large volumn of data from text file to sql server
destination.
46. Explain Audit Transformation ?
It allows you to add auditing information. Auditing options that you can add to
transformed
data
through
this
transformation
are
:
1. Execution of Instance GUID : ID of execution instance of the package
2.
PackageID
:
ID
of
the
package
3.
PackageName
:
Name
of
the
Package
4.
VersionID
:
GUID
version
of
the
package
5.
Execution
StartTime
6.
MachineName
7.
UserName
8.
TaskName
9. TaskID : unique identifier type of the data flow task that contains audit
transformation.
47. what are the possible locations to save SSIS package?
1.File System: We can save the package on a physical location on hard drive or any
shared folder with this option, and we should provide a full qualified path to stored
package in the FileSystem option.
2. Sql Server: SSIS packages will be stored in the MSDB database, in the
sysssispackages table.
SSIS Package Store is nothing but combination of SQL Server and File System
deployment, as you can see when you connect to SSIS through SSMS: it looks like a
store which has categorized its contents (packages) into different categories based on its
managers (which is you, as the package developer) taste. So, dont get it wrong as
something different from the 2 types of package deployment.
48. How to provide security to packages?
We can provide security to packages in 2 ways
1. Package encryption
2. Password protection

1. DonotSaveSensitive: any sensitive information is simply not written out to


the
package
XML
file
when you
save
the
package.
2. EncryptSensitiveWithUserKey: encrypts sensitive information based on the
credentials of the user who created the package. It is the default value for
the
ProtectionLevel
property.
3. EncryptSensitiveWithPassword: requires to specify a password in the
package, and this password will be used to encrypt and decrypt the sensitive
information
in
the
package.
4. EncryptAllWithPassword: allows to encrypt the entire contents of the SSIS
package
with
your
specified
password.
5. EncryptAllWithUserKey: allows to encrypt the entire contents of the SSIS
package
by
using
the
user
key.
6. Server Storage: allows the package to retain all sensitive information
when you are saving the package to SQL Server. SSIS packages are saved to
MSDB
database
of
SQL
Server.
You can change the Protection Level of deployed packages by using the
DTUTIL
utility.
49. How to track a variable in ssis?
OnVariableValueChanged: This event gets raised when value of the variable is changed.
1.Set the "EvaluateasExpression" property of the variable as True.
2.Set the "RaiseChangedEvent" property of the variable as True.
3.Create an event handler for the "OnVariableValueChanged" event for the container
in which the variable is scoped.
50. FTP Task:
The FTP task downloads and uploads data files and manages directories on servers. For
example, a package can download data files from a remote server.
use the FTP task for the following purposes:
1. Copying directories and data files from one directory to another, before or after
moving data, and applying transformations to the data.
2. Downloading files from an FTP location and applying transformations to column data
before loading the data into a database.

At run time, the FTP task connects to a server by using an FTP connection manager. The
FTP connection manager includes the server settings, the credentials for accessing the
FTP server, and options such as the time-out and the number of retries for connecting to
the server.
The FTP connection manager supports only anonymous authentication and basic
authentication. It does not support Windows Authentication.
Predefined FTP Operations:
Send Files,
Receive File,
Create Local directory,
Remove Local Directory,
Create Remote Directory, Remove Remote Directory
Delete Local Files,
Delete Remote File
Customer Log Entries available on FTP Task:
FTPConnectingToServer
FTPOperation

51. New features in SSIS 2012


1. GUI Improvements - -Sort packages by name -Package visualization -Zoom -Data flow
source/destination wizard -Grouping in data flow
2. CDC (Change Data Capture) Task and Components - -CDC is nothing but Incremental
load loads all rows that have changed since the last load -CDC needs to keep track of
which changes have already been processed. -CDC task does this by storing LSNs in a
tracking table -CDC source component reads from the CDC table function, based on the
LSN it for from the CDC task. -CDC transformation splits records into new rows, updated
rows and deleted rows.
3. Flat File Connection Manager Changes - -The Flat File connection manager now
supports parsing files with embedded qualifiers. The connection manager also by default
always checks for row delimiters to enable the correct parsing of files with rows that are
missing column fields. The Flat File Source now supports a varying number of columns,
and embedded qualifiers.
4. Offline Connection Managers: Integration Services now validates all connections
before validating all of the data flow components when a package is opened and sets any
connections that are slow or unavailable to work offline. This helps to reduce the delay in
validating the package data flow. After a package is opened, you can also turn off a
connection by right-clicking the connection manager in the Connection Managers area
and then clicking Work Offline. This can speed up operations in the SSIS Designer.
5. New Functions/Expressions in SSIS 2012:
LEFT: You now can easily return the leftmost portion of a string rather than use the
SUBSTRING function. Left syntax is the same as we know in TSQL: LEFT(character_expression,number)
REPLACENULL: You can use this function to replace NULL values in the first argument
with the expression specified in the second argument. This is equivalent to ISNULL in TSQL: REPLACENULL(expression, expression)

TOKEN: This function allows you to return a substring by using delimiters to separate a
string into tokens and then specifying which occurrence to
return: TOKEN(character_expression, delimiter_string, occurrence)
TOKENCOUNT: This function uses delimiters to separate a string into tokens and then
returns the count of tokens found within the string: TOKENCOUNT(character_expression,
delimiter_string)
6. Easy Column Remapping in Data Flow (Mapping Data Flow Columns) -When modifying
a data flow, column remapping is sometimes needed -SSIS 2012 maps columns on name
instead of id -It also has an improved remapping dialog
7. Shared Connection Managers: To create connection managers at the project level that
can shared by multiple packages in the project. The connection manager you create at
the project level is automatically visible in the Connection Managers tab of the SSIS
Designer window for all packages. -When converting shared connection managers back
to regular (package) connection managers, they disappear in all other packages.
8. Scripting Enhancements: Now Script task and Script Component support for 4.0. Breakpoints are supported in Script Component
9. ODBC Source and Destination - -ODBC was not natively supported in 2008 -SSIS
2012 has ODBC source & destination -SSIS 2008 could access ODBC via ADO.NET
10. Reduced Memory Usage by the Merge and Merge Join Transformations The old
SSIS Merge and Merge Join transformations, although helpful, used a lot of system
resources and could be a memory hog. In 2012 these tasks are much more robust and
reliable. Most importantly, they will not consume excessive memory when the multiple
inputs produce data at uneven rates.
11. Undo/Redo: One thing that annoys users in SSIS before 2012 is lack of support of
Undo and Redo. Once you performed an operation, you cant undo that. Now in SSIS
2012, we can see the support of undo/redo.
52. Difference between Script Task and Script Component in SSIS.
Script Task

Script Component

Control
Flow/Date
Flow

The Script task is configured on the Control Flow tab of


the designer and runs outside the data flow of the
package.

The Script component is configured on the


Data Flow page of the designer and
represents a source, transformation, or
destination in the Data Flow task.

Purpose

A Script task can accomplish almost any general-purpose You must specify whether you want to create
task.
a source, transformation, or destination with
the Script component.

Raising
Results

The Script task uses both the TaskResult property and


The Script component runs as a part of the
the optional ExecutionValue property of the Dts object to Data Flow task and does not report results
notify the runtime of its results.
using either of these properties.

Raising
Events

The Script task uses the Events property of the Dts


object to raise events. For example:
Dts.Events.FireError(0, "Event Snippet", ex.Message &
ControlChars.CrLf & ex.StackTrace

The Script component raises errors, warnings,


and informational messages by using the
methods of the IDTSComponentMetaData100
interface returned by the
ComponentMetaData property. For example:
Dim myMetadata as
IDTSComponentMetaData100
myMetaData = Me.ComponentMetaData
myMetaData.FireError(...)

Execution

A Script task runs custom code at some point in the


package workflow. Unless you put it in a loop container
or an event handler, it only runs once.

A Script component also runs once, but


typically it runs its main processing routine
once for each row of data in the data flow.

Editor

The Script Task Editor has three pages: General, Script,


and Expressions. Only the ReadOnlyVariables and
ReadWriteVariables, and ScriptLanguage properties
directly affect the code that you can write.

The Script Transformation Editor has up to


four pages: Input Columns, Inputs and
Outputs, Script, and Connection Managers.
The metadata and properties that you
configure on each of these pages determines
the members of the base classes that are
autogenerated for your use in coding.

Interaction
with the
Package

In the code written for a Script task, you use the Dts
property to access other features of the package. The
Dts property is a member of the ScriptMain class.

In Script component code, you use typed


accessor properties to access certain package
features such as variables and connection
managers. The PreExecute method can access
only read-only variables. The PostExecute
method can access both read-only and
read/write variables.

Using
Variables

The Script task uses the Variables property of the Dts


object to access variables that are available through the
tasks ReadOnlyVariables and ReadWriteVariables
properties. For example: string myVar;
myVar =
Dts.Variables["MyStringVariable"].Value.ToString();

The Script component uses typed accessor


properties of the autogenerated based class,
created from the components
ReadOnlyVariables and ReadWriteVariables
properties. For example:
string myVar; myVar =
this.Variables.MyStringVariable;

Using
The Script task uses the Connections property of the Dts
Connections object to access connection managers defined in the
package. For example:
string
myFlatFileConnection;
myFlatFileConnection =
(Dts.Connections["Test Flat File
Connection"].AcquireConnection(Dts.Transaction) as
String);

53.Difference between DTS and SSIS?


DTS:
Limited number of transformations.
Limited error handling.
Message box in active-x scripts.
SSIS:
More number of transformations.
Better error handling.

The Script component uses typed accessor


properties of the autogenerated base class,
created from the list of connection managers
entered by the user on the Connection
Managers page of the editor. For example:
IDTSConnectionManager100
connMgr;connMgr =
this.Connections.MyADONETConnection;

Message box in .NET scripting.


54. How to execute package from command line
1. To execute an SSIS package saved to SQL Server using Windows Authentication:
dtexec /sq pkgOne /ser productionServer
2. To execute an SSIS package that is saved in the file system:
dtexec /f "c:\pkgOne.dtsx"
3. To execute an SSIS package saved to the File System folder in the SSIS Package
Store:
dtexec /dts "\File System\MyPackage"
4. To execute an SSIS package that is saved in the file system and configured externally:
dtexec /f "c:\pkgOne.dtsx" /conf "c:\pkgOneConfig.cfg"
55. How to unzip a File in SSIS?
Use Execute Process Task in the Control Flow.
From BIDS, drag and drop an "Execute Process Task" to the control flow and configure.
In the Execute Process, perform the following configurations:
Executable: The path of the application that is being used.
Arguments: Need to supply the arguments to extract the zipped files.
Working Directory: The current directory for all process.
56. which service requires to start a job
SQL Server Agent Service
57. Difference between OLEDB Destination, SQL Destination, Bulk Insert
1. OLEDB destination loads the records in batches, where as SQL Server destination
loads all the records at one go.
2. OLEDB Destination uses the 'Fast Load' data access mode.
SQL Server destination uses shared memory for maximum loading speed, must execute
on the same server as the database engine.
Prefer the OLE-DB Destination simply because it gives better flexibility on where you
execute the package.
3. The Bulk Insert task uses the T-SQL BULK INSERT statement for speed when loading
large amounts of data.
58.which services are installed during Sql Server installation
SSIS
SSAS
SSRS
SQL Server (MSSQLSERVER)
SQL Server Agent Service
SQL Server Browser
SQL Full-Text
59. How to run dynamic T-SQL in SSIS?
Option#1: Using Script Component as Destination
Option#2: Using Object Variable and run t-sql with Execute SQL Task
60. What is the use of RecordSet Destination?
The Recordset destination does not save data to an external data source. Instead, it
saves data in memory in a recordset that is stored in variable of the Object data type.
After the Recordset destination saves the data, we use a Foreach Loop container with the
Foreach ADO enumerator to process one row of the recordset at a time. The Foreach

ADO enumerator saves the value from each column of the current row into a separate
package variable. Then, the tasks that you configure inside the Foreach Loop container
read those values from the variables and perform some action with them.
61. Delay Validation, Forced Execution
Delay Validation: Validation take place during the package execution.
Early Validation: Validation take place just before the package execution.
62. Transfer Database Task
used to move a database to another SQL Server instance or create a copy on the same
instance (with different database name). This task works in two modes: Offline, Online.

Offline: In this mode, the source database is detached from the source server after
putting it in single user mode, copies of the mdf, ndf and ldf files are moved to specified
network location. On the destination server the copies are taken from the network
location to the destination server and then finally both databases are attached on the
source and destination servers. This mode is faster, but a disadvantage with mode is that
the source database will not available during copy and move operation. Also, the person
executing the package with this mode must be sysadmin on both source and destination
instances.
Online: The task uses SMO to transfer the database objects to the destination server. In
this mode, the database is online during the copy and move operation, but it will take
longer as it has to copy each object from the database individually. Someone executing
the package with this mode must
be either sysadmin or database owner of the specified databases.
63. Transfer SQL Server Object Task
Used to transfer one or more SQL Server objects to a different database, either on the
same or another SQL Server instance. You can transfer tables, views, Indexes, stored
procedures, User defined functions, Users, Roles etc.

64. How to Generate an Auto Incremental Number in a SSIS Package?


A script component can be used for the designated task. The steps are as follows:
1. Drag and drop the Script Document to the Data flow and select the Script Component
Type as Transformation.
2. Double click the Script Component.
3. Select the column which is to pass through the script component, in the Input
Columns tab.
4. Add a column with an integer data type, in the Inputs and Outputs tab.
65. Breakpoint in SSIS?
A breakpoint allows you to pause the execution of the package in BIDS during
development or when troubleshooting an SSIS Package. You can right click on the task in
control flow, click on 'Edit Breakpoint' menu and from the Set Breakpoint window, you
specify when you want execution to be halted/paused. For example OnPreExecute,
OnPostExecute, OnError events, etc. To toggle a breakpoint, delete all breakpoints and
disable all breakpoints go to the Debug menu and click on the respective menu item. You
can even specify different conditions to hit the breakpoint as well.
66. What is the DisableEventHandlers property used for?
SSIS packages, tasks and containers have a property called DisableEventHandlers. If
you set this property to TRUE for task or container, then all event handlers will be
disabled for that task or container. If you set this property value to FALSE then the
event handlers will once again be executed.
67. How to create Temporary Table using SSIS?
1. For the connection manager, set the property RetainSameConnection =True so that

temporary table created in one Control Flow task can be retained in another task.
RetainSameConnection means that the temp table will not be deleted when the task is
completed.
2. Create a data-flow task that consumes your global temp table in an OLE DB Source
component.
3. Set DelayValidation=TRUE on the data-flow task, means that the task will not check if
the table exists upon creation.
68. How to Lock a variable in Script Task?
public void Main()
{
Variables vars = null ;
bool fireAgain = true;
Dts.VariableDispenser.LockOneForRead("varName", ref vars);
//Do something with the value...
vars.Unlock();
Dts.TaskResult = (int)ScriptResults.Success;
}
69. How to pass property value at Run time?
A property value like connection string for a Connection Manager can be passed to the
package using package configurations.
70. How to skip first 5 lines in each Input flat file?
In the Flat file connection manager editor, Set the 'Header rows to skip' property.
71. Parallel processing in SSIS
To support parallel execution of different tasks in the package, SSIS uses 2 properties:
1.MaxConcurrentExecutables: defines how many tasks can run simultaneously, by
specifying the maximum number of SSIS threads that can execute in parallel per
package. The default is -1, which equates to number of physical or logical processor + 2.
2. EngineThreads: is property of each DataFlow task. This property defines how many
threads the data flow engine can create and run in parallel. The EngineThreads property
applies equally to both the source threads that the data flow engine creates for sources
and the worker threads that the engine creates for transformations and destinations.
Therefore, setting EngineThreads to 10 means that the engine can create up to ten
source threads and up to ten worker threads.
72. How do we convert data type in SSIS?
The Data Conversion Transformation in SSIS converts the data type of an input column
to a different data type.

SSRS Interview Questions and


Answers
1. How do u implement Cascading parameter?
The list of values for one parameter depends on the value chosen in preceding
parameter.
Eg: Country --> State --> City
2. How to pass parameter from Report Viewer Control to sub report?
3. How to open another report in a new window from existing report?
Use a little javascript with a customized URL in the "Jump to URL" option of the
Navigation tab.
Non-parameterized Solution
To get started, let's pop up a simple non parameterized report. Follow these instructions:
1.
Instead of using the "Jump to Report" option on the Navigation tab, use the
"Jump to URL" option.
2.

Open the expression screen (Fx button).

3.
Enter the following:
="javascript:void(window.open('http://servername?%2freportserver%2fpathto
%2freport&rs:Command=Render'))"
4.

Click OK twice, then save and deploy the report.

Parameterized Solution
Assume you have a field called ProductCode. Normally, you might hard code that like
this:
http://servername/reportserver?%2fpathto
%2freport&rs:Command=Render&ProductCode=123
In this case, you want to pass variables dynamically, using an available value from the
source dataset. You can think of it like this:
http://servername/reportserver?%2fpathto
%2freport&rs:Command=Render&ProductCode=Fields!ProductCode.Value
The exact syntax in the "Jump to URL" (Fx) expression window will be:
="javascript:void(window.open('http://servername/reportserver?%2fpathto
%2freport&rs:Command=Render&ProductCode="+Fields!ProductCode.Value+"'))"
4. How to pass parameter from chart to Table in same report?
5. How to apply custom Colors of chart report?
STEP1:
Create your custome color palette in the report using Custom Code in your report. To do
so, click Report => Report Properties => Code and copy below code:

Private colorPalette As String() = { "Blue", "Red", "Teal", "Gold", "Green","#A59D93",


"#B8341B", "#352F26", "#F1E7D6", "#E16C56", "#CFBA9B"}
Private count As Integer = 0
Private mapping As New System.Collections.Hashtable()
Public Function GetColor(ByVal groupingValue As String) As String
If mapping.ContainsKey(groupingValue) Then
Return mapping(groupingValue)
End If
Dim c As String = colorPalette(count Mod colorPalette.Length)
count = count + 1
mapping.Add(groupingValue, c)
Return c
End Function
STEP2:
In the Pie Chart, select Series Properties and select the Fill option from left side.
Now write following expression in the Color expression:
=code.GetColor(Fields!Year.Value)
Here Fields!Year.Value is a field of dataset which is used as Chart Category fields.
6. Can we have Table within a Table in SSRS report?
Yes. We can have Nested Tables.
7. How to apply stylesheet to SSRS Reports
select Report->Report Properties from the menu and then click the Code tab.
Function StyleElement (ByVal Element As String) As String
Select Case UCase(Element)
Case "TABLE_HEADER_BG"
Return "Red"
Case "TABLE_FOOTER_BG"
Return "Green"
Case "TABLE_HEADER_TEXT"
Return "White"t;
Case e Elsese
Return "Blackk
End Select
End Function
Now apply this function to the style property of an element on the report.
=code.StyleElement("TABLE_HEADER_TEXT")
If you want apply dynamic styles to report, then create tables in sql server and insert
style information into the tables.
Create a Dataset, specify the Stored Procedure.
example: =Fields!TABLE_HEADER_TEXT.Value
where TABLE_HEADER_TEXT is a value in the table.
8. Dynamic sorting, Dynamic Grouping in SSRS

Can be done using expressions.


9. Different types of Filters
The 2 types of filters in SSRS are:
Dataset Filter: Filtering within the source query. When you implement a filter within
the data set, less data is sent from the source database server to the Report Server usually a good thing.
Report Filter: This includes filtering after the source query has come back on a data
region (like the Tablix), or a data grouping. When you implement a filter within the
report, when the report is re-executed again with different parameter choices, the Report
Server uses cached data rather than returning to the database server.
Using a Dataset Filter is the most efficient method.
10. Difference between Filter and Parameter? Which one is better?
In case of Filters, first the data will be fetched from the database, then the Filters are
applied on the fetched data. Filters are applied at run time first on the dataset, and then
on the data region, and then on the group, in top-down order for group hierarchies.
To add a filter, we must specify a filter equation (expression). The data type of filtered
data and value must match.
Parameters are applied at the database level. The Data will be fetched based on
parameters at the database level using WHERE condition in the query.
Parameters are better than Filters in performance.
11. Optimization of Report
Report can be optimized in terms of Grouping, Filters.
Report can be optimized through Caching, Snapshot and subscriptions.

1. The total time to generate a report (RDL) can be divided into 3 elements:
Time to retrieve the data (TimeDataRetrieval).
Time to process the report (TimeProcessing)
Time to render the report (TimeRendering)
Total time = (TimeDataRetrieval) + (TimeProcessing) + (TimeRendering)
These 3 performance components are logged every time for which a deployed report is
executed. This information can be found in the table ExecutionLogStorage in the
ReportServer database.
SELECT TOP 10 Itempath, parameters,
TimeDataRetrieval + TimeProcessing + TimeRendering as [total time],
TimeDataRetrieval, TimeProcessing, TimeRendering,
ByteCount, [RowCount],Source, AdditionalInfo
FROM ExecutionLogStorage
ORDER BY Timestart DESC
2. Use the SQL Profiler to see which queries are executed when the report is generated.
Sometimes you will see more queries being executed than you expected. Every dataset
in the report will be executed. A lot of times new datasets are added during building of

reports. Check if all datasets are still being used. For instance, datasets for available
parameter values. Remove all datasets which are not used anymore.
3. Sometimes a dataset contains more columns than used in the Tablix\list. Use only
required columns in the Dataset.
4. ORDER BY in the dataset differs from the ORDER BY in the Tablix\list. You need to
decide where the data will be sorted. It can be done within SQL Server with an ORDER
BY clause or in by the Reporting server engine. It is not useful to do it in both. If an
index is available use the ORDER BY in your dataset.
5. Use the SQL Profiler to measure the performance of all datasets (Reads, CPU and
Duration). Use the SQL Server Management Studio (SSMS) to analyze the execution plan
of every dataset.
6. Avoid dataset with result sets with a lot of records like more than 1000 records. A lot
of times data is GROUPED in the report without an Drill down option. In that scenario do
the group by already in your dataset. This will save a lot of data transfer to the SQL
Server and it will save the reporting server engine to group the result set.
7. Rendering of the report can take a while if the result set is very big. Look very critical
if such a big result set is necessary. If details are used in only 5 % of the situations,
create another report to display the details. This will avoid the retrieval of all details in
95 % of the situations.
12. I have 'State' column in report, display the States in bold, whose State
name starts with letter 'A' (eg: Andhra pradesh, Assam should be in bold)
13. In which scenario you used Matrix Report
Use a matrix to display aggregated data summaries, grouped in rows and columns,
similar to a PivotTable or crosstab. The number of rows and columns for groups is
determined by the number of unique values for each row and column groups.
14. Image control in SSRS
An image is a report item that contains a reference to an image that is stored on the
report server, embedded within the report, or stored in a database.
Image Source : Embedded
Local report images are embedded in the report and then referenced. When you embed
an image, Report Designer MIME-encodes the image and stores it as text in the report
definition.
When to Use:
When image is embedded locally within the report.
When you are required to store all images within the report definition.

Image Source : External


When you use an external image in a report, the image source is set to External and the
value for the image is the URL to the image.
When to Use:
When images are stored in a File System, External File Share or Web Site.
Image Source : Database
If we add images that are stored in a database to report then such image is known as a
data-bound image. Data-bound images can also be displayed from binary data (BLOB)
stored in a database.
When to use:
When image is stored in a Database.
When you specify a dataset field that is bound to a database field that contains an
image.
15. Role of Report Manager
Deploying the reports onto the web server.
Delivering the reports through E-mail or File Share using the subscriptions.
Creating the Cached and Snapshot Reports.
Providing the Security to the reports.
16. How to upload a report to report server
In the Report Manager, we have upload option to upload the reports.
17. What is a Shared Dataset
Shared datasets retrieve data from shared data sources that connect to external data
sources. A shared dataset contains a query to provide a consistent set of data for
multiple reports. The dataset query can include dataset parameters.
Shared datasets use only shared data sources, not embedded data sources.

To create a shared dataset, you must use an application that creates a shared dataset
definition file (.rsd). You can use one of the following applications to create a shared
dataset:
1. Report Builder: Use shared dataset design mode and save the shared dataset to a
report server or SharePoint site.
2. Report Designer in BIDS: Create shared datasets under the Shared Dataset folder in
Solution Explorer. To publish a shared dataset, deploy it to a report server or SharePoint
site.
Upload a shared dataset definition (.rsd) file. You can upload a file to the report server or
SharePoint site. On a SharePoint site, an uploaded file is not validated against the
schema until the shared dataset is cached or used in a report.
The shared dataset definition includes a query, dataset parameters including default
values, data options such as case sensitivity, and dataset filters.
18. How do u display the partial text in bold format in textbox in Report? (eg:
FirstName LastName, where "FirstName" should in bold fornt and "LastName"
should be in normal font.)
Use PlaceHolder
19. How to Keep Headers Visible When Scrolling Through a Report?
1. Right-click the row, column, or corner handle of a tablix data region, and then click
Tablix Properties.
2. On the General tab, under Row Headers or Column Headers, select Header should
remain visible while scrolling.

3. Click OK.

To keep a static tablix member (row or column) visible while scrolling


1. On the design surface, click the row or column handle of the tablix data region to
select it. The Grouping pane displays the row and column groups.
2. On the right side of the Grouping pane, click the down arrow, and then clickAdvanced
Mode. The Row Groups pane displays the hierarchical static and dynamic members for
the row groups hierarchy and the Column groups pane shows a similar display for the
column groups hierarchy.
3. Click the static member (row or column) that you want to remain visible while
scrolling. The Properties pane displays the Tablix Member properties.
4. In the Properties pane, set FixedData to True.

20. How to add Page Break


1. On the design surface, right-click the corner handle of the data region and then click
Tablix Properties.
2. On the General tab, under Page break options, select one of the following options:
Add a page break before:Select this option when you want to add a page break before
the table.
Add a page break after:Select this option when you want to add a page break after the
table.
Fit table on one page if possible:Select this option when you want the data to stay on
one page.

21. A main report contain subreport also. Can we export both main report and
subreport to Excel?
Yes. The exported report contains both the mail report and sub report.
22. how to convert PDF report from Portrait to Landscape format?
In Report Properties -->
Set the width of the report to the landscape size of your A4 paper: 29.7 cm
Set the height of the report to 21 cm.
To avoid extra blank pages during export, the size of the body should be less or equal to
the size of the report - margins.
Set the width of the body to 26.7 cm (29.7 -1.5 - 1.5)
Set the height of the body to 18 cm (21 - 1.5 -1.5)
23. Error handling in Report
Step 1: All the data sets of the report should contain one addition input parameter which
should pass a unique information for every request (for every click of View Report
button) made by the user.
Step 2: Need to implement TRY CATCH blocks for all the Stored procedures used in the
SSRS reports through datasets. The CATCH section of every procedure should have the
provision to save the error details into DB table, if any error occurred while execution of
that procedure.
Step 3: Add one more additional dataset with the name "ErrorInfo" which should call the
store procedure (USP_ERROR_INFO). This procedure should be accepting a unique value.
This unique value should be passed to all the data sets for every click of 'View Report'
button made by the user. This dataset will return the error information available in the
data base table by verifying records with the unique id which has passes as input
parameter.
Step 4:Enable the Use Single Transaction When Processing Queries option in data

source properties, which makes all the query executions through a single transaction.
Step 5: After successful completion of all the above mentioned steps, insert new table on
SSRS report with custom error information which will be shown to the report user if the
user gets any error during execution of the report.

24. Have u worked on any 3rd party Report Tools


There are few third party Report Tools like Nevron, izenda.
25. Different ways of Deploying reports
1. We can deploy the reports using rs.exe tool
2. In the Solution Explorer,
2.1.Right-click the report project, and then click Properties.
2.2.In the Property Pages dialog box for the project, select a configuration to edit from
the Configuration list. Common configurations are DebugLocal, Debug, and Release.
2.3.In StartItem, select a report to display in the preview window or in a browser
window when the report project is run.
2.4.In the OverwriteDataSources list, select True to overwrite the shared data source
on the server each time shared data sources are published, or select False to keep the
data source on the server.
2.5.In the TargetDataSourceFolder text box, type the folder on the report server in
which to place the published shared data sources. The default value for
TargetDataSourceFolder is Data Sources. If you leave this value blank, the data sources
will be published to the location specified in TargetReportFolder.
2.6. In the TargetReportFolder text box, type the folder on the report server in which
to place the published reports. The default value for TargetReportFolder is the name of
the report project.
2.7. In the TargetServerURL text box, type the URL of the target report server. Before
you publish a report, you must set this property to a valid report server URL.

3. There are 2 options for deploying the reports that you create with Report Builder 3.0:
1. Report Manager
2. SharePoint document library
26. Difference between Cached Report and Snapshot Report
Cached Report is a saved copy of processed report.
The first time a user clicks the link for a report configured to cache, the report execution
process is similar to the on-demand process. The intermediate format is cached and
stored in ReportServerTempDB Database until the cache expiry time.
If a user request a different set of parameter values for a cached report, then the report
processor treats the requests as a new report executing on demand, but flags it as a
second cached instance.
Report snapshot contains the Query and Layout information retrieved at specific point of
time. It executes the query and produces the intermediate format. The intermediate
format of the report has no expiration time like a cached instance, and is stored in
ReportServer Database.
27. Subscription. Different types of Subscriptions?
Subscriptions are used to deliver the reports to either File Share or Email in response to
Report Level or Server Level Schedule.
There are 2 types of subscriptions:
1. Standard Subscription: Static properties are set for Report Delivery.
2. Data Driven Subscription: Dynamic Runtime properties are set for Subscriptions
28. SSRS Architecture
29. How to deploy Reports from one server to other server
30. Different life cycles of Report
1.Report authoring:
This stage involves creation of reports that are published using the Report Definition
language. RDL is an XML based industry standard for defining reports.
Report Designer is a full-featured report authoring tool that runs in Business Intelligence
Development Studio and Report Builder.
2. Report management:
This involves managing the published reports as a part of the webservice. The reports
are cached for consistency and performance. They can be executed whenever demanded
or can be scheduled and executed.
In short Report Management includes:
- Organizing reports and data sources,
- Scheduling report execution and delivery
- Tracking reporting history.
3. Report delivery:
Reports can be delivered to the consumers either on their demand or based on an event.
Then they can view them is a web-based format.
Web based delivery via Report Manager web site
Subscriptions allow for automated report delivery
URL Access, Web Services and Report Viewer control
4.Report security:

It is important to protect reports as well as the report resources. Therefore, Reporting


Services implement a flexible, role-based security model.
31. Different type of Reports
Linked report:A linked report is derived from an existing report and retains the original's
report definition. A linked report always inherits report layout and data source properties
of the original report. All other properties and settings can be different from those of the
original report, including security, parameters, location, subscriptions, and schedules.
Snapshot reports: A report snapshot contains layout information and query results that
were retrieved at a specific point in time. Report snapshots are processed on a schedule
and then saved to a report server.
Subreport: A subreport is a report that displays another report inside the body of a main
report. The subreport can use different data sources than the main report.
Cached reports: A cached report is a saved copy of a processed report. Cached reports
are used to improve performance by reducing the number of processing requests to the
report processor and by reducing the time required to retrieve large reports. They have a
mandatory expiration period, usually in minutes.
Drill Down Report: Means navigate from the summary level to detail level in the same
report.
Drill Through Report: Navigation from one report to another report.
Ad hoc reports:Ad Hoc reporting allows the end users to design and create reports on
their own provided the data models.
3 components: Report Builder, Report Model and Model Designer
Use 'Model Designer' tool to design 'Report Models' and then use 'Report Model' tool to
generate reports.
Report Builder
- Windows Winform application for End users to build ad-hoc reports with the help of
Report models.
32. Explain the Report Model Steps.
1. Create the report model project
select "Report Model Project" in the Templates list
A report model project contains the definition of the data source (.ds file), the definition
of a data source view (.dsv file), and the report model (.smdl file).
2. Define a data source for the report model
3. Define a data source view for the report model
A data source view is a logical data model based on one or more data sources.
SQL Reporting Services generates the report model from the data source view.
4. Define a report model
5. Publish a report model to report server.
33. How to get the data for Report Model Reports
Datasource View
34. Difference between RDL and RDLC?
RDL files are created for Sql Server Reporting Services and .RDLC files are for Visual
Studio Report Viewer Component.
The <Query> element of RDL contains query or command and is used by the Report
Server to connect to the datasources of the report.
The <Query> element is optional in RDLC file. This element is ignored by Report Viewer

control because Report Viewer control does not perform any data processing in Local
processing mode, but used data that the host application supplies.
35. Difference between Sorting and Interactive Sorting?
To control the Sort order of data in report, you must set the sort expression on the data
region or group. The does not have control over sorting.
You can provide control to the user by adding Interactive Sort buttons to toggle between
ascending and descending order for rows in a table or for rows and columns in a matrix.
The most common use of interactive sort is to add a sort button to every column header.
The user can then choose which column to sort by.
36. What is Report Builder
Windows Winform application for End users to build ad-hoc reports with the help of
Report models.
37. Difference between Table report and Matrix Report
A Table Report can have fixed number of columns and dynamic rows.
A Matrix Report has dynamic rows and dynamic columns.
38. When to use Table, Matrix and List
1. Use a Table to display detail data, organize the data in row groups, or both.
2. Use a matrix to display aggregated data summaries, grouped in rows and columns,
similar to a PivotTable or crosstab. The number of rows and columns for groups is
determined by the number of unique values for each row and column groups.
3. Use a list to create a free-form layout. You are not limited to a grid layout, but can
place fields freely inside the list. You can use a list to design a form for displaying many
dataset fields or as a container to display multiple data regions side by side for grouped
data. For example, you can define a group for a list; add a table, chart, and image; and
display values in table and graphic form for each group value
39. Report Server Configuration Files
1. RSReportServer.config:
Stores configuration settings for feature areas of the Report Server service: Report
Manager, the Report Server Web service, and background processing.
2. RSSrvPolicy.config
Stores the code access security policies for the server extensions.
3. RSMgrPolicy.config
Stores the code access security policies for Report Manager.
4. Web.config for the Report Server Web service
Includes only those settings that are required for ASP.NET.
5. ReportingServicesService.exe.config
6. Registry settings
7. Web.config for Report Manager
Includes only those settings that are required for ASP.NET
8. RSReportDesigner.config
9. RSPreviewPolicy.config
40. Difference between a Report and adhoc Report
Ad Hoc reporting allows the end users to design and create reports on their own
provided the data models.
Adhoc Report is created from existing report model using Report Builder.

41. How do u secure a Report


1. Authorization is provided through a role-based security model that is specific to
Reporting Services.
Different Types of Roles provided by SSRS :
- Browsers
- Content Manager
- My Reports
- Publishers
- Report Builder
2. IIS security controls access to the report server virtual directory and Report Manager.
42.How to Combine Datasets in SSRS (1 Dataset gets data from Oracle and
other dataset from Sql Server)
Using LookUP function, we can combine 2 datasets in SSRS.
In the following example, assume that a table is bound to a dataset that includes a field
for the product identifier ProductID. A separate dataset called "Product" contains the
corresponding product identifier ID and the product name Name.
=Lookup(Fields!ProductID.Value, Fields!ID.Value, Fields!Name.Value, "Product")
In the above expression, Lookup compares the value of ProductID to ID in each row of
the dataset called "Product" and, when a match is found, returns the value of the Name
field for that row.
43. Difference between Report Server and Report Manager
Report Server handle authentication, data processing, rendering and delivery operations.
The configuration settings of Report Manager and the Report Server Web service are
stored in a single configuration file (rsreportserver.config).
Report Manager is the web-based application included with Reporting Services that
handles all aspects of managing reports (deploying datasources and reports, caching a
report, subscriptions, snapshot).
44. Steps to repeat Table Headers in SSRS 2008?
1. Select the table
2. At the bottom of the screen, select a dropdown arrow beside column groups. Enable
"Advanced Mode" by clicking on it.
3. under Row Groups,select the static row and choose properties / press F4.
4. Set the following attributes for the static row or header row.
Set RepeatOnNewPage= True for repeating headers
Set KeepWithGroup= After
Set FixedData=True for keeping the headers visible.
45. How to add assemblies in SSRS

45. Report Extensions?

46. parent grouping, child grouping in SSRS


47. How to show "No Data Found" Message to end user?
Add a Text box with expression =IIF(Count(<ID Field>, "DataSet")=0,"No Data
Returned", nothing)
and set the visibility of this Text box as =IIF(Count(<ID Field>,
"DataSet")=0,False,True)

48. What is the 'Use single transaction when processing the


queries' in the Datasource? Dataset Execution Order?
By default, datasets are executed in parallel.

This option used to reduce the amount of open connections to the


database. For example, if you have a report with 3 datasets and you
dont have this option checked, a new connection is made to the database
for every single dataset. However, if you have it checked, then only one
connection will be open to the database and all the datasets will return
the data and the connection will be closed. This can be used to reduce
network traffic and potentially increase performance.
Open the data source dialog in report designer, and select the "Use Single
Transaction when processing the queries' check box. Once selected,
datasets that use the same data source are no longer executed in parallel.
They are also executed as a transaction, i.e. if any of the queries fails to
execute, the entire transaction is rolled back.

The order of the dataset execution sequence is determined by the topdown order of the dataset appearance in the RDL file, which also
corresponds to the order shown in report designer.
49. ReportServer and ReportServerTempDB Databases
ReportServer: hosts the report catalog and metadata.
For eg: keeps the catalog items in the Catalog table, the data source information in the
Data-Source table of ReportServer Database.
ReportServerTempDB: used by RS for caching purposes.
For eg: once the report is executed, the Report Server saves a copy of the report in the

ReportServerTempDB database.

What do we mean by dataflow in SSIS?


Data flow is nothing but the flow of data from the corresponding sources to the referred
destinations. In this process, the data transformations make changes to the data to make it ready
for the data warehouse.
What is a breakpoint in SSIS? How is it setup? How do you disable
it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an
opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the
object where you want to set the breakpoint and select the Edit Breakpoints option.
Can you name 5 or more of the native SSIS connection managers?
1) OLEDB connection Used to connect to any data source requiring an OLEDB connection
(i.e.,
SQL Server 2000)

2) Flat file connection Used to make a connection to a single file in the File System. Required
for reading information from a File System flat file
3) ADO.Net connection Uses the .Net Provider to make a connection to SQL Server 2005 or
other
connection exposed through managed code (like C#) in a custom task
4) Analysis Services connection Used to make a connection to an Analysis Services database or
project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
5) File connection Used to reference a file or folder. The options are to either use or create a file
or folder
6) Excel
What is the use of Bulk Insert Task in SSIS?
Bulk Insert Task is used to upload large amount of data from flat files into Sql Server. It supports
only OLE DB connections for destination database.
What is Conditional Split transformation in SSIS?
This is just like IF condition which checks for the given condition and based on the condition
evaluation, the output will be sent to the appropriate OUTPUT path. It has ONE input and
MANY outputs. Conditional Split transformation is used to send paths to different outputs based
on some conditions. For example, we can organize the transform for the students in a class who
have marks greater than 40 to one path and the students who score less than 40 to another path.
How do you eliminate quotes from being uploaded from a flat file
to SQL Server?
This can be done using TEXT QUALIFIER property. In the SSIS package on the Flat File
Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to
ensure the quotes are not included.
Can you explain how to setup a checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName Specify the full path to the Checkpoint file that the package uses
to save the value of package variables and log completed tasks. Rather than using a hard-coded
path as shown above, its a good idea to use an expression that concatenates a path defined in a
package variable and the package name.
CheckpointUsage Determines if/how checkpoints are used. Choose from these options:
Never(default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists
is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint
file is found it is used to restore package variable values and restart at the point of failure. If a
Checkpoint file is not found the package starts execution with the first task. The Always choice
raises an error if the Checkpoint file does not exist.

SaveCheckpoints Choose from these options: True or False (default). You must select
True to implement the Checkpoint behavior.
What are the different values you can set for CheckpointUsage
property ?
There are three values, which describe how a checkpoint file is used during package execution:
1) Never: The package will not use a checkpoint file and therefore will never restart.
2) If Exists: If a checkpoint file exists in the place you specified for the CheckpointFilename
property, then it will be used, and the package will restart according to the checkpoints written.
3) Always: The package will always use a checkpoint file to restart, and if one does not exist,
the package will fail.
What is the ONLY Property you need to set on TASKS in order to
configure CHECKPOINTS to RESTART package from failure?
The one property you have to set on the task is FailPackageOnFailure. This must be set
for each task or container that you want to be the point for a checkpoint and restart. If you do not
set this property to true and the task fails, no file will be written, and the next time you invoke the
package, it will start from the beginning again.
Where can we set the CHECKPOINTS, in DataFlow or ControlFlow ?
Checkpoints only happen at the Control Flow; it is not possible to checkpoint transformations or
restart inside a Data Flow. The Data Flow Task can be a checkpoint, but it is treated as any other
task.
Can you explain different options for dynamic configurations in
SSIS?
1) XML file
2) custom variables
3) Database per environment with the variables
4) Use a centralized database with all variables
What is the use of Percentage Sampling transformation in SSIS?
Percentage Sampling transformation is generally used for data mining. This transformation builds
a random sample of set of output rows by choosing specified percentage of input rows. For
example if the input has 1000 rows and if I specify 10 as percentage sample then the
transformation returns 10% of the RANDOM records from the input data.
What is the use of Term Extraction transformation in SSIS?
Term Extraction transformation is used to extract nouns or noun phrases or both noun and noun
phrases only from English text. It extracts terms from text in a transformation input column and

then writes the terms to a transformation output column. It can be also used to find out the
content of a dataset.
What is Data Viewer and what are the different types of Data
Viewers in SSIS?
A Data Viewer allows viewing data at a point of time at runtime. If data viewer is placed before
and after the Aggregate transform, we can see data flowing to the transformation at the runtime
and how it looks like after the transformation occurred. The different types of data viewers are:
1. Grid
2. Histogram
3. Scatter Plot
4. Column Chart.
What is Ignore Failure option in SSIS?
In Ignore Failure option, the error will be ignored and the data row will be directed to continue on
the next transformation. Lets say you have some JUNK data(wrong type of data or JUNK data)
flowing from source, then using this option in SSIS we can REDIRECT the junk data records to
another transformation instead of FAILING the package. This helps to MOVE only valid data to
destination and JUNK can be captured into separate file.
Which are the different types of Control Flow components in SSIS?
The different types of Control Flow components are: Data Flow Tasks, SQL Server Tasks, Data
Preparation Tasks, Work flow Tasks, Scripting Tasks, Analysis Services Tasks, Maintenance
Tasks, Containers.
What are containers? What are the different types of containers in
SSIS?
Containers are objects that provide structures to packages and extra functionality to tasks. There
are four types of containers in SSIS, they are: Foreach Loop Container, For Loop Container,
Sequence Container and Task Host Container.
What are the different types of Data flow components in SSIS?
There are 3 data flow components in SSIS.
1. Sources
2. Transformations
3. Destinations
What are the different types of data sources available in SSIS?
There are 7 types of data sources provided by SSIS: a.) Data Reader source b.) Excel source c.)
Flat file source d.) OLEDB source e.) Raw file source f.) XML source g.) Script component

What is SSIS Designer?


It is a graphical tool for creating packages. It has 4 tabs: Control Flow, Data Flow, Event
Handlers and Package Explorer.
What is Control Flow tab?
It is the tab in SSIS designer where various Tasks can be arranged and configured. This is the tab
where we provide and control the program flow of the project.
What is Data Flow tab?
This is the tab where we do all the work related to ETL job. It is the tab in SSIS Designer where
we can extract data from sources, transform the data and then load them into destinations.
What is the function of control flow tab in SSIS?
On the control flow tab, the tasks including dataflow task, containers and precedence constraints
that connect containers and tasks can be arranged and configured.
What is the function of Event handlers tab in SSIS?
On the Event handlers tab, workflows can be configured to respond to package events.
For example, we can configure Work Flow when ANY task Failes or Stops or Starts ..
What is the function of Package explorer tab in SSIS?
This tab provides an explorer view of the package. You can see what is happening in the package.
The Package is a container at the top of the hierarchy.
What is Solution Explorer?
It is a place in SSIS Designer where all the projects, Data Sources, Data Source Views and other
miscellaneous files can be viewed and accessed for modification.
How do we convert data type in SSIS?
The Data Conversion Transformation in SSIS converts the data type of an input column to a
different data type.
How are variables useful in ssis package?
Variables can provide communication among objects in the package. Variables can provide
communication between parent and child packages. Variables can also be used in expressions and
scripts. This helps in providing dynamic values to tasks.
Explain Aggregate Transformation in SSIS?
It aggregates data, similar you do in applying TSQL functions like Group By, Min, Max, Avg, and
Count. For example you get total quantity and Total line item for each product in Aggregate
Transformation Editor. First you determine input columns, then output column name in Output
Alias table in datagrid, and also operations for each Output Alias in Operation columns of the

same datagrid. Some of operation functions listed below :


Group By
Average
Count
Count Distinct : count distinct and non null column value
Min, Max, Sum
In Advanced tab, you can do some optimization here, such as setting up Key Scale option (low,
medium, high), Count Distinct scale option (low, medium, high), Auto Extend factor and Warn
On Division By Zero. If you check Warn On Division By Zero, the component will give warning
instead of error. Key Scale option will optimize transformation cache to certain number of key
threshold. If you set it low, optimization will target to 500,000 keys written to cache, medium can
handle up to 5 million keys, and high can handle up to 25 million keys, or you can specify
particular number of keys here. Default value is unspecified. Similar to number of keys for Count
Distinct scale option. It is used to optimize number of distinct value written to memory, default
value is unspecified. Auto Extend Factor is used when you want some portion of memory is used
for this component. Default value is 25% of memory.
Explain Audit Transformation ?
It allows you to add auditing information as required in auditing world specified by HIPPA and
Sarbanes-Oxley (SOX). Auditing options that you can add to transformed data through this
transformation are :
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4. VersionID : GUID version of the package
5. Execution StartTime
6. MachineName
7. UserName
8. TaskName
9. TaskID : uniqueidentifier type of the data flow task that contains audit transformation.
Explain Character Map Transformation?
It transforms some character. It gives options whether output result will override the existing
column or add to new column. If you define it as new column, specify new column name.
Operations available here are:
1. Uppercase
2. Lowercase
3. Byte reversal : such as from 01234 to 04321
4. Full width
5. Half width

6. Hiragana/katakana/traditional Chinese/simplified Chinese


7. Linguistic casing
Explain Conditional split Transformation ?
It functions as ifthenelse construct. It enables send input data to a satisfied conditional
branch. For example you want to split product quantity between less than 500 and greater or
equal to 500. You can give the conditional a name that easily identifies its purpose. Else section
will be covered in Default Output Column name.
After you configure the component, it connect to subsequent transformation/destination, when
connected, it pops up dialog box to let you choose which conditional options will apply to the
destination transformation/destination.
Explain Copy column Transformation?
This component simply copies a column to another new column. Just like ALIAS Column in TSql.
Explain Data conversion Transformation?
This component does conversion data type, similar to TSQL function CAST or CONVERT. If
you wish to convery the data from one type to another then this is the best bet. But please make
sure that you have COMPATABLE data in the column.
Explain Data Mining query Transformation?
This component does prediction on the data or fills gap on it. Some good scenarios uses this
component is:
1. Take some input columns as number of children, domestic income, and marital income to
predict whether someone owns a house or not.
2. Take prediction what a customer would buy based analysis buying pattern on their shopping
cart.
3. Filling blank data or default values when customer doesnt fill some items in the questionnaire.
Explain Derived column Transformation?
Derived column creates new column or put manipulation of several columns into new column.
You can directly copy existing or create a new column using more than one column also.
Explain Merge Transformation?
Merge transformation merges two paths into single path. It is useful when you want to break out
data into path that handles errors after the errors are handled, the data are merge back into
downstream or you want to merge 2 data sources. It is similar with Union All transformation, but
Merge has some restrictions :
1. Data should be in sorted order
2. Data type , data length and other meta data attribute must be similar before merged.

Explain Merge Join Transformation?


Merge Join transformation will merge output from 2 inputs and doing INNER or OUTER join on
the data. But if you the data come from 1 OLEDB data source, it is better you join through SQL
query rather than using Merge Join transformation. Merge Join is intended to join 2 different data
source.
Explain Multicast Transformation?
This transformation sends output to multiple output paths with no conditional as Conditional Split
does. Takes ONE Input and makes the COPY of data and passes the same data through many
outputs. In simple Give one input and take many outputs of the same data.
Explain Percentage and row sampling Transformations?
This transformation will take data from source and randomly sampling data. It gives you 2
outputs. First is selected data and second one is unselected data. It is used in situation where you
train data mining model. These two are used to take the SAMPLE of data from the input data.
Explain Sort Transformation?
This component will sort data, similar in TSQL command ORDER BY. Some transformations
need sorted data.
Explain Union all Transformation?
It works in opposite way to Merge transformation. It can take output from more than 2 input
paths and combines into single output path.
What r the possible locations to save SSIS package?
You can save a package wherever you want.
SQL Server
Package Store
File System
What is a package?
A discrete executable unit of work composed of a collection of control flow and other objects,
including data sources, transformations, process sequence, and rules, errors and event handling,
and data destinations.
What is a workflow in SSIS?
A workflow is a set of instructions on how to execute tasks.
(It is a set of instructions on how to execute tasks such as sessions, emails and shell commands. a
workflow is created form work flow mgr.

What is the diff between control flow Items and data flow Items?
The control flow is the highest level control process. It allows you to manage the run-time
process activities of data flow and other processes within a package.
When we want to extract, transform and load data within a package. You add an SSIS dataflow
task to the package control flow.
What are the main component of SSIS(project-architecture)?
SSIS archItecture has 4 main components
1.ssis service
2.ssis runtime engine & runtime executables
3.ssis dataflow engine & dataflow components
4.ssis clients
Different components in SSIS package?
Control flow
Data flow
Event handler
Package explorer
What are Connection Managers?
It is a bridge b/w package object and physical data. It provides logical representation of a
connection at design time the properties of the connection mgr describes the physical connection
that integration services creates when the package is run.
What is environment variable in SSIS?
An environment variable configuration sets a package property equal to the value in an
environment variable.
Environmental configurations are useful for configuring properties that are dependent on the
computer that is executing the package.
How to provide securIty to packages?
We can provide security in two ways
1. Package encryption
2. Password protection.
What are Precedence constraints?
Constraints that link executable, container, and tasks wIthin the package control flow and specify
condItion that determine the sequence and condItions for determine whether executable run.
What is Design time Deployment in SSIS ?

When you run a package from with in BIDS,it is built and temporarily deployed to the folder. By
default the package will be deployed to the BIN folder in the Packages Project folder and you
can configure for custom folder for deployment. When the Packages execution is completed and
stopped in BIDS,the deployed package will be deleted and this is called as Design Time
Deployment.

1. What is a package?
a).a discrete executable unit of work composed of a collection of control
flow and other objects, including data sources, transformations, process
sequence, and rules, errors and event handling, and data destinations.
2. What is a workflow in SSIS?
a).`a workflow is a set of instructions on how to execute Tasks.
(It is a set of instructions on how to execute Tasks such as sessions, emails
and shell commands. a workflow is created form work flow mgr.)
3. What is the Difference between control flow Items and data
flow Items?
a).the control flow is the highest level control process. It allows you to
manage the run-time process the run time process activities of data flow
and other processes within a package.
When we want to extract, transform and load data within a package. You
add an SSIS dataflow Task to the package control flow.
4. What are the main components of SSIS (project-architecture)?
A).SSIS architecture has 4 main components
1.SSIS service
2.SSIS runtime engine & runtime executables
3.SSIS dataflow engine & dataflow components
4.SSIS clients
5.different components in SSIS package?
1. Control flow
2.data flow
3.event handler
4.package explorer
Containers: provide structure and scope to your package
Types of containers:
i. Task host container: the Taskhost container services a single Task.
ii. Sequence container: It can handle the flow of subset of a package
and can help you drive a package into smaller more manageable process.
Uses:-

1. Grouping Tasks so that you can disable a part of the package that no
longer needed.
2. Narrowing the scope of the variable to a container.
3. Managing the property of multiple Tasks in one step by setting the
properties of the container.
iii. For loop container: evaluates an expression and repeats Its workflow
until the expression evaluates to false.
iv. For each loop container: defines a control flow repeatedly by using
an enumerator.
For each loop container repeats the control flow for each member of a
specified enumerator.
Tasks: It provides the functionality to your package.
It is a individual unit of work.
Event handler: It responds to raised events in your package.
Precedence constraints: It provides ordinal relationship b/w various
Items in your package.
6. How to deploy the package?
To deploy the package first we need to configure some properties.
Go to project tab->package properties->we get a window, configure
deployment Utility as "True"
Specify the path as "bin/deployment"
7. Connection manager:
a).It is a bridge b/w package object and physical data. It provides logical
representation of a connection at design time the properties of the
connection mgr describes the physical connection that integration
services creates when the package is run.
8. Tell the Utility to execute (run) the package?
a) In BIDS a package that can be executed in debug mode by using the
debug menu or toolbar or from solution explorer.
In production, the package can be executed from the command line or
from Microsoft windows Utility or It can be scheduled for automated
execution by using the SQL server agent.
i). Go to->debug menu and select the start debugging button
ii).press F5 key
iii).right click the package and choose execute package.
iv).command prompts utilities

a).DTExecUI

1. To open command prompt->run->type dtexecui->press enter


2. The execute package Utility dialog box opens.
3. in that click execute to run the package.
Wait until the package has executed successfully.

b).DTExec Utility
1.open the command prompt window.
2. Command prompt window->type dtexec /followed by the DTS, SQL, or
file option and the package path, including package name.
3. If the package encryption level is encrypting sensitive with password or
encrypt all with password, use the decrypt option to provide the password.
If no password is included, dtexec will prompt you for the password.
4. Optionally, provide additional command-line options
5. Press enter.
6. Optionally, view logging and reporting information before closing the
command prompt window.
The execute package Utility dialog box opens.
7. In the execute package Utility dialog box, click execute package.
Wait until the package has executed successfully.
v).using SQL server mgmt studio to execute package
1. In SSMS right click a package, and then click run package.
Execute package Utility opens.
2. Execute the package as described previously.
9. How can u design SCD in SSIS?
a) Def:-SCD explains how to capture the changes over the period of time.
This is also known as change data capture.
type1: It keeps the most recent values in the target. It does not maintain
the history.
type2: It keeps the full history in the target database. For every update in
the source a new record is inserted in the target.
type3: It keeps current & previous information in the target.

10. How can u handle the errors through the help of logging in
SSIS?
a) To create an on error event handler to which you add the log error
execute SQL Task.
11. What is a log file and how to send log file to mgr?
a) It is especially useful when the package has been deployed to the
production environment, and you cannot use BIDS and VSA to debug the
package.
SSIS enables you to implement logging code through the Dts. Log method.
When the Dts. Log method is called in the script, the SSIS engine will route
the message to the log providers that are configured in the containing

package.
12. What is environment variable in SSIS?
a) An environment variable configuration sets a package property equal to
the value in an environment variable.
Environmental configurations are useful for configuring properties that are
dependent on the computer that is executing the package.
13. about multiple configurations?
a) It means including the xml configuration, environment variable, registry
entry, parent package variable, SQL Server table, and direct and indirect
configuration types.
14. How to provide security to packages?
a) In two ways
1. Package encryption
2. Password protection.
15. as per error handling in T/R, which one handle the better
performance? Like fail component, redirect row or ignore failure?
a) Redirect row provides better performance for error handling.
16. Staging area??
a) It is a temporary data storage location. Where various data T/R
activities take place.
Staging area is a kitchen of data warehouse.
17. Task??
a) An individual unit of work.

Types:1. Active x script Task


2. Analysis services execute DDL Task *
3. Analysis services processing Task *
4. Bulk insert Task *
5. Data flow Task *
6. Data mining query Task
7. Execute Dts 2000 package Task
8. Execute package Task *
9. Execute process Task
10. Execute SQL Task *
11. File system Task *
12. Ftp Task

13.
14.
15.
16.
17.
18.
19.

Message queue Task


Script Task *
Send mail Task *
Web service Task
Wmi data reader Task
Wmi event Task
Xml Task

18. Event handler & logging?


Even handler is the mechanism to raise a event based on specific
scenario. For example if there is any failure in data load it will notify thru
email or entry in error table etc.
Logging can be done based on event, in SSIS there are 12 events that can
be logged at Task or package level. You can enable partial logging for one
Task and enable much more detailed logging for billing Tasks.
Example:On error; On post validate; On progress; On warning
In SSIS different type of logging mechanism are there:
SQL profiler
Text files
SQL server
Window event log
Xml file
19. Import & export wizard?
a) Easiest method to move data from sources like oracle, db2, SQL server.
Right click on database name->go to Task->import and export wizard
Select the source
Select the destination
Query copy of tables
Execute
Finish
20.what solution explorer?
Once you creating project with specific project name then if you want to
add
data source/-data source views/packages/ miscellaneous; then this
window will help to organize different files under one structure.
21. Precedence constraints?
a) Constraints that link executable, container, and Tasks within the
package control flow and specify condition that determine the sequence
and conditions for determine whether executable run.

22. Data pipeline?


a) The memory based, multithreaded, buffered t/r process flow data
through an SSIS data flow Task during package execution.
23. TRANSFORMATIONS??
It is an object that generates, modifies, or passes data.
1. AGGEGATE T/R:-It applies an aggregate function to grouped records and
produces new output records from aggregated results.
2. AUDIT T/R:-the t/r adds the value of a system variable, such as machine
name or execution instance GUID to a new output column.
3. CHARACTER MAP T/R:-this t/r makes string data changes such as
changing data from lower case to upper case.
4. CONDITIONAL SPLIT:-It separate input rows into separate output data
pipelines based on the Boolean expressions configured for each output.
5. COPY COLUMN:-add a copy of column to the t/r output we can later
transform the copy keeping the original for auditing personal
6.DATA CONVERSION:-converts a columns data type to another data type.
7. DATA MINING QUERY:-perform a data mining query against analysis
services.
8. DERIVED COLUMN:-create a new derive column calculated from
expression.
9. EXPORT COLUMN:-It allows you to export a column from the data flow to
a file.
10. FUZZY GROUPING:-perform data cleansing by finding rows that are
likely duplicates.
11. FUZZY LOOKUP:-matches and standardizes data based on fuzzy logic.
Ex:-transform the name jon to john
12.IMPORT COLUMN:-reads the data from a file & adds It into a dataflow.
13. LOOKUP:-perform the lookup of data to be used later in a transform.
Ex:-t/T to lookup a city based on zip code.
1. Getting a related value from a table using a key column value
2. Update slowly changing dimension table
3.to check whether records already exist in the table.
14. MERGE:-merges two sorted data sets into a single data set into a
single data flow.
15. MERGE JOIN:-merges two data sets into a single dataset using a join
junction.
16. MULTI CAST:-sends a copy of two data to an additional path in the
workflow.
17. ROW COUNT:-stores the rows count from the data flow into a variable.
18. ROW SAMPLING:-captures the sample of data from the dataflow by
using a row count of the total rows in dataflow.
19. ROW SAMPLING:-captures the sample of the data from the data flow
by using a row count of the total rows in data flow.
20. UNION ALL:-merge multiple data sets into a single dataset.
21. PIVOT:-converts rows into columns
22.UNPIVOT:-converts columns into rows

24. Batch?
a) A batch is defined as group of sessions. Those are 2 types.
1. Parallel batch processing
2. Sequential batch processing
to execute a SSIS package we will use "execute package utility"
To deploy a SSIS package we will use "package deployment Utility
SSIS Interview Questions / FAQs
Question: Which SSIS versions You have worked on?
Comment: Differences between 2005 and 2008 are not very big so 2005, 2008 or
2008 R2 experience usually is very similar. The big difference is with 2000 which
had DTS and it very different (SSIS is created from scratch)
Question: Have you used or what do you mean by SSIS Framework?
Comment: This is common term in SSIS world which just means that you have
templates that are set up to perform routine tasks like logging, error handling
etc. Yes answer would usually indicate experienced person, no answer is still fine
if your project is not very mission critical.
Question: Share experienced working with data warehouses?
Comment: SSIS is in most cases used for data warehouses so knowledge of Data
Warehouses Designs is very useful.
Question: How have you attended any training SSIS.
Comment: The thing is that most people who read good books have usually an
advantage over those who hasn't because they know what they know and they
know what they don't know (but they know it exists and is available).
Blog/Articles very in quality so best practise articles is a big plus+, conferences
can be also a plus.
Question: SSIS certifications?
Comment: This is rather disappointing point for me. Qualifications generally are
welcome but unfortunately many people simply cheat. Companies run courses
and then give questions and answers, or people find them on the internet. I've
met people who had certification but knew very little, I've met people very
experienced and knowledgeable without certification and people who have done
certification for their self-satisfaction and are experienced and knowledgeable. In
other words be careful with certification. It is easy to get a misleading
impression so make sure you ask the best questions for the position you can.
SSIS Development Interview FAQ
Question: How many difference source and destinations have you used?
It is very common to get all kinds of sources so the more the person worked with
the better for you. Common ones are SQL Server, CSV/TXT, Flat Files, Excel,
Access, Oracle, MySQL but also Salesforce, web data scrapping.
Question: What configuration options have you used?

Comment: This is an important one. Configuration should always be dynamic and


usually is done using XML and/or Environment Variable and SQL Table with all
configurations.
Question: How do you apply business rules in SSIS (Transformations.Specific
calculations but also cleansing)?
Comment: Some people use SSIS only to extract data and then go with stored
procedures only.they are usually missing the point of the power of SSIS. Which
allows to create "a flow" and on each step applies certain rules this greatly
simplifies the ETL process and simplicity is very good.
Question: How to quickly load data into sql server table?
Comment: Fast Load option. This option is not set by default so most developers
know this answer as otherwise the load is very slow.
Question: Give example of handling data quality issues?
Comment: Data Quality is almost always a problem and SSIS handles it very well.
Examples include importing customers from different sources where customer
name can be duplicates. For instance you can have as company name: SQL
Server Business Intelligence but also SQL Server BI or SQL Server BI LTD or SQL
Server BI Limited or intelligence (with one l). There are different ways to handle
it. Robust and time consuming is to create a table with or possible scenarios and
update it after each update. You can also use fuzzy grouping which is usually
easy to implement and will make usually very good decisions but it is not 100%
accurate so this approach has to be justified. Other typical quality issues are
nulls (missing values), outliers (dates like 2999 or types like 50000 instead of
5000 especially important if someone is adjusting the value to get bigger bonus),
incorrect addresses
and these are either corrected during ETL, ignored, re-directed for further manual
updates or it fails the packages which for big processes is usually not practised.
Question: When to use Stored Procedures?
Comment: This was one of the requested question in comment (at the bottom of
the page). This one is very important but also tricky. ALL SSIS developers have
SQL Server background and that is sometime not very good if they use SQL not
SSIS approach.
Let's start with when you typically use SPs. This is for preparing tables (truncate),
audit tasks (usually part of SSIS framework), getting configuration values for
loops and a few other general tasks.
During ETL extract you usually type simple SQL because it comes from other
sources and usually over complication is not a good choice (make it dynamic)
because any changes usually affect the package which has to be updated as
well.
During Transformation phase (business rules, cleaning, core work) you should
use Transformation tasks not Stored procedures! There are loads of tasks that
make the package much easier to develop but also a very important reason is
readability which is very important for other people who need to change the
package and obviously it reduces risks of making errors. Performance is usually
very good with SSIS as it is memory/flow based approach. So when to use Stored
Procedures for transformations? If you don't have strong SSIS developers or you
have performance reasons to do it. In some cases SPs can be much faster
(usually it only applies to very large datasets). Most important is have reasons
which approach is better for the situation.

Question: What is your approach for ETL with data warehouses (how many
packages you developer during typical load etc.)?
Comment: This is rather generic question. A typical approach (for me) when
building ETL is to. Have a package to extract data per source with extract specific
transformations (lookups, business rules, cleaning) and loads data into staging
table. Then a package do a simple merge from staging to data warehouse
(Stored Procedure) or a package that takes data from staging and performs extra
work before loading to data warehouse. I prefer the first one and due to this
approach I occasionally consider having extract stage (as well as stage phase)
which gives me more flexibility with transformation (per source) and makes it
simpler to follow (not everything in one go). So to summarize you usually have
package per source and one package per data warehouse table destination.
There are might be other approach valid as well so ask for reasons.
SSIS Advanced Interview Questions
Question: What is XMLify component?
Comment: It is 3rd party free component used rather frequently to output errors
into XML field which saves development time.
Question: What command line tools do you use with SSIS ?
Comment: dtutil (deployment), dtexec (execution), dtexecui (generation of
execution code)
SSIS Interview Questions continue
Q: What is SSIS? How it related with SQL Server.
SQL Server Integration Services (SSIS) is a component of SQL Server which can be
used to perform a wide range of Data Migration and ETL operations. SSIS is a
component in MSBI process of SQL Server.
This is a platform for Integration and Workflow applications. It is known for a fast and
flexible OLTP and OLAP extensions used for data extraction, transformation, and loading
(ETL). The tool may also be used to automate maintenance of SQL Server databases
and multidimensional data sets.

Q: What are the tools associated with SSIS?


We use Business Intelligence Development Studio (BIDS) and SQL Server Management
Studio (SSMS) to work with Development of SSIS Projects.
We use SSMS to manage the SSIS Packages and Projects.
Q: What are the differences between DTS and SSIS
Data Transformation Services
SQL Server Integration Services
Limited Error Handling
Complex and powerful Error Handling
Message Boxes in ActiveX Scripts
Message Boxes in .NET Scripting
No Deployment Wizard
Interactive Deployment Wizard
Limited Set of Transformation
Good number of Transformations
NO BI functionality
Complete BI Integration
Q: What is a workflow in SSIS ?
Workflow is a set of instructions on to specify the Program Executor on how to execute
tasks and containers within SSIS Packages.
Q: What is the control flow?

A control flow consists of one or more tasks and containers that execute when the
package runs. To control order or define the conditions for running the next task or
container in the package control flow, we use precedence constraints to connect the
tasks and containers in a package. A subset of tasks and containers can also be
grouped and run repeatedly as a unit within the package control flow. SQL Server 2005
Integration Services (SSIS) provides three different types of control flow elements:
Containers that provide structures in packages, Tasks that provide functionality, and
Precedence Constraints that connect the executables, containers, and tasks into an
ordered control flow.
Q: What is a data flow?
A data flow consists of the sources and destinations that extract and load data, the
transformations that modify and extend data, and the paths that link sources,
transformations, and destinations The Data Flow task is the executable within the SSIS
package that creates, orders, and runs the data flow. A separate instance of the data
flow engine is opened for each Data Flow task in a package. Data Sources,
Transformations, and Data Destinations are the three important categories in the Data
Flow.
Q: How does Error-Handling work in SSIS
When a data flow component applies a transformation to column data, extracts data
from sources, or loads data into destinations, errors can occur. Errors frequently occur
because of unexpected data values.
Type of typical Errors in SSIS:
-Data Connection Errors, which occur incase the connection manager cannot be
initialized with the connection string. This applies to both Data Sources and Data
Destinations along with Control Flows that use the Connection Strings.
-Data Transformation Errors, which occur while data is being transformed over a Data
Pipeline from Source to Destination.
-Expression Evaluation errors, which occur if expressions that are evaluated at run time
perform invalid
Q: What is environment variable in SSIS?
An environment variable configuration sets a package property equal to the value in an
environment variable.
Environmental configurations are useful for configuring properties that are dependent on
the computer that is executing the package.
Q: What are the Transformations available in SSIS?
AGGEGATE - It applies aggregate functions to Record Sets to produce new output
records from aggregated values.
AUDIT - Adds Package and Task level Metadata - such as Machine Name, Execution
Instance, Package Name, Package ID, etc..
CHARACTER MAP - Performs SQL Server level makes string data changes such as
changing data from lower case to upper case.
CONDITIONAL SPLIT Separates available input into separate output pipelines based
on Boolean Expressions configured for each output.
COPY COLUMN - Add a copy of column to the output we can later transform the copy
keeping the original for auditing.
DATA CONVERSION - Converts columns data types from one to another type. It stands
for Explicit Column Conversion.
DATA MINING QUERY Used to perform data mining query against analysis services

and manage Predictions Graphs and Controls.


DERIVED COLUMN - Create a new (computed) column from given expressions.
EXPORT COLUMN Used to export a Image specific column from the database to a flat
file.
FUZZY GROUPING Used for data cleansing by finding rows that are likely duplicates.
FUZZY LOOKUP - Used for Pattern Matching and Ranking based on fuzzy logic.
IMPORT COLUMN - Reads image specific column from database onto a flat file.
LOOKUP - Performs the lookup (searching) of a given reference object set against a
data source. It is used for exact matches only.
MERGE - Merges two sorted data sets into a single data set into a single data flow.
MERGE JOIN - Merges two data sets into a single dataset using a join junction.
MULTI CAST - Sends a copy of supplied Data Source onto multiple Destinations.
ROW COUNT - Stores the resulting row count from the data flow / transformation into a
variable.
ROW SAMPLING - Captures sample data by using a row count of the total rows in
dataflow specified by rows or percentage.
UNION ALL - Merge multiple data sets into a single dataset.
PIVOT Used for Normalization of data sources to reduce analomolies by converting
rows into columns
UNPIVOT Used for demoralizing the data structure by converts columns into rows
incase of building Data Warehouses.
Q: How to log SSIS Executions?
SSIS includes logging features that write log entries when run-time events occur and can
also write custom messages. This is not enabled by default. Integration Services
supports a diverse set of log providers, and gives you the ability to create custom log
providers. The Integration Services log providers can write log entries to text files, SQL
Server Profiler, SQL Server, Windows Event Log, or XML files. Logs are associated with
packages and are configured at the package level. Each task or container in a package
can log information to any package log. The tasks and containers in a package can be
enabled for logging even if the package itself is not.
Q: How do you deploy SSIS packages.
BUILDing SSIS Projects provides a Deployment Manifest File. We need to run the
manifest file and decide whether to deploy this onto File System or onto SQL Server
[ msdb]. SQL Server Deployment is very faster and more secure then File System
Deployment. Alternatively, we can also import the package from SSMS from File System
or SQ Server.
Q: What are variables and what is variable scope ?
Variables store values that a SSIS package and its containers, tasks, and event handlers
can use at run time. The scripts in the Script task and the Script component can also use
variables. The precedence constraints that sequence tasks and containers into a
workflow can use variables when their constraint definitions include expressions.
Integration Services supports two types of variables: user-defined variables and system
variables. User-defined variables are defined by package developers, and system
variables are defined by Integration Services. You can create as many user-defined
variables as a package requires, but you cannot create additional system variables.
Q: Can you name five of the Perfmon counters for SSIS and the value they provide?
SQLServer:SSIS Service
SSIS Package Instances

SQLServer:SSIS Pipeline
BLOB bytes read
BLOB bytes written
BLOB files in use
Buffer memory
Buffers in use
Buffers spooled
Flat buffer memory
Flat buffers in use
Private buffer memory
Private buffers in use
Rows read
Rows written
Q SSIS Blocking and Non blocking transformation.
Data flow transformations in SSIS use memory/buffers in different ways. The way a
transformation uses memory can dramatically impact the performance of your package.
Transformation buffer usage can be classified into 3 categories: Non Blocking, Partially
Blocking, and (Full) Blocking.
If you picture a data flow as a river, and transformation buffer usage as a dam in that
river, here is the impact of your transformation on your data flow.
A Non Blocking transformation is a dam that just lets the water spill over the top. Other
than perhaps a bit of a slow down the water (your data) proceeds on its way with very
little delay
A Partially Blocking transformation is a dam that holds the water back until it reaches a
certain volume , and then releases that volume of water downstream and then
completely blocks the flow until that volume is achieved again. Your data in this case,
will stop, then start, then stop, then start over and over until all the data has moved
through the transformation. The downstream transformations end up starved for data
during certain periods, and then flooded with data during other periods. Clearly your
downstream transformations will not be able to work as efficiently when this happens,
and your entire package will slow down as a result.
A Blocking transformation is a dam that lets nothing through until the entire volume of the
river has flowed into the dam. Nothing is left to flow from upstream, and nothing has
been passed downstream. Then once the transformation is finished, it releases all the
data downstream. Clearly for a large dataset this can be extremely memory intensive.
Additionally, if all the transforms in your package are just waiting for data, your package
is going to run much more slowly.
Generally speaking if you can avoid Blocking and Partially Blocking transactions, your
package will simply perform better. If you think about it a bit, you will probably be able to
figure out which transformations fall into which category. Here is a quick list for your
reference:
Non Blocking
Audit
Character Map
Conditional Split
Copy Column
Data Conversion
Derived Column

Import Column
Lookup
Multicast
Percentage sampling
Row count
Row sampling
Script component
Partially Blocking
Data mining
Merge
Merge Join
Pivot/Unpivot
Term Extraction
Term Lookup
Union All
Blocking
Aggregate
Fuzzy Grouping
Fuzzy Lookup
Sort
Facts :
Sort is a fully blocking transformation.
A Merge transform requires a Sort, but a Union All does not, use a Union All when you
can.
Q. Diffrence between synchronous and asynchronous data transmission
To understand the difference between a synchronous and an asynchronous
transformation in Integration Services, it is easiest to start with an understanding of a
synchronous transformation. If a synchronous transformation does not meet your needs,
your design might require an asynchronous transformation.
Synchronous Transformations: A synchronous transformation processes incoming rows
and passes them on in the data flow one row at a time. Output is synchronous with input,
meaning that it occurs at the same time. Therefore, to process a given row, the
transformation does not need information about other rows in the data set. In the actual
implementation, rows are grouped into buffers as they pass from one component to the
next, but these buffers are transparent to the user, and you can assume that each row is
processed separately.

An example of a synchronous transformation is the Data Conversion transformation. For


each incoming row, it converts the value in the specified column and sends the row on
its way. Each discrete conversion operation is independent of all the other rows in the
data set.
In Integration Services scripting and programming, you specify a synchronous
transformation by looking up the ID of a component's input and assigning it to
the SynchronousInputID property of the component's outputs. This tells the data flow
engine to process each row from the input and send each row automatically to the
specified outputs. If you want every row to go to every output, you do not have to write
any additional code to output the data. If you use the ExclusionGroupproperty to specify
that rows should only go to one or another of a group of outputs, as in the Conditional
Split transformation, you must call the DirectRow method to select the appropriate
destination for each row. When you have an error output, you must

call DirectErrorRow to send rows with problems to the error output instead of the default
output.
Asynchronous Transformations: requires an asynchronous transformation when it is not
possible to process each row independently of all other rows. In other words, you cannot
pass each row along in the data flow as it is processed, but instead must output data
asynchronously, or at a different time, than the input. For example, the following
scenarios require an asynchronous transformation:
The component has to acquire multiple buffers of data before it can perform its
processing. An example is the Sort transformation, where the component has to process
the complete set of rows in a single operation.
The component has to combine rows from multiple inputs. An example is the Merge
transformation, where the component has to examine multiple rows from each input and
then merge them in sorted order.
There is no one-to-one correspondence between input rows and output rows. An
example is the Aggregate transformation, where the component has to add a row to the
output to hold the computed aggregate values.
In Integration Services scripting and programming, you specify an asynchronous
transformation by assigning a value of 0 to the SynchronousInputID property of the
component's outputs. . This tells the data flow engine not to send each row automatically
to the outputs. Then you must write code to send each row explicitly to the appropriate
output by adding it to the new output buffer that is created for the output of an
asynchronous transformation.
Note
Since a source component must also explicitly add each row that it reads from
the data source to its output buffers, a source resembles a transformation with
asynchronous outputs.

It would also be possible to create an asynchronous transformation that emulates a


synchronous transformation by explicitly copying each input row to the output. By using
this approach, you could rename columns or convert data types or formats. However this
approach degrades performance. You can achieve the same results with better
performance by using built-in Integration Services components, such as Copy Column or
Data Conversion.
Q. How to Create deployment Utility
First step in deploying packages is to create a deployment utility for an Integration
Services project. The deployment utility is a folder that contains the files you need to
deploy the packages in an Integration Services project on a different server. The
deployment utility is created on the computer on which the Integration Services project is
stored.
You create a package deployment utility for an Integration Services project by first
configuring the build process to create a deployment utility, and then building the project.
When you build the project, all packages and package configurations in the project are
automatically included. To deploy additional files such as a Readme file with the project,
place the files in the Miscellaneous folder of the Integration Services project. When the
project is built, these files are also automatically included.
You can configure each project deployment differently. Before you build the project and
create the package deployment utility, you can set the properties on the deployment

utility to customize the way the packages in the project will be deployed. For example,
you can specify whether package configurations can be updated when the project is
deployed. To access the properties of an Integration Services project, right-click the
project and click Properties.
The following table lists the deployment utility properties.
1. AllowConfigurationChange: A value that specifies whether configurations can be
updated
during deployment.
2. CreateDeploymentUtility: A value that specifies whether a package deployment is
created when the project is built. This property must be True to create a deployment
utility.
3. DeploymentOutputPath: The location, relative to the Integration Services project, of
the deployment utility.
When you build an Integration Services project, a manifest file,
.SSISDeploymentManifest.xml, is created and added, together with copies of the project
packages and package dependencies, to the bin\Deployment folder in the project, or to
the location specified in the DeploymentOutputPath property. The manifest file lists the
packages, the package configurations, and any miscellaneous files in the project.
The content of the deployment folder is refreshed every time that you build the project.
This means that any file saved to this folder that is not copied to the folder again by the
build process will be deleted. For example, package configuration files saved to the
deployment folders will be deleted.
To create a package deployment utility

In SQL Server Data Tools (SSDT), open the solution that contains the Integration
Services project for which you want to create a package deployment utility.
Right-click the project and click Properties.
In the Property Pages dialog box, click Deployment Utility.
To update package configurations when packages are deployed, set
AllowConfigurationChanges to True.
Set CreateDeploymentUtility to True.
Optionally, update the location of the deployment utility by modifying the
DeploymentOutputPath property.
Click OK.
In Solution Explorer, right-click the project, and then click Build.
View the build progress and build errors in the Output window

After youve gone through these steps the next time you build your project it will create
the file (YourProjectName).SSISDeploymentManifest. This file is located in the same
folder as your packages in the bin\Deployment folder.
If you run this file it will open the Package Installation Wizard that will allow you to deploy
all your packages that were located in the project to a desired location.
Q Setting the Protection Level of Packages

Defining Sensitive Information


In an Integration Services package, the following information is defined as sensitive:
The password part of a connection string. However, if you select an option that
encrypts everything, the whole connection string will be considered sensitive.
The task-generated XML nodes that are tagged as sensitive. The tagging of XML
nodes is controlled by Integration Services and cannot by changed by users.
Any variable that is marked as sensitive. The marking of variables is controlled
by Integration Services.
Whether Integration Services considers a property sensitive depends on whether
the developer of the Integration Services component, such as a connection
manager or task, has designated the property as sensitive. Users cannot add
properties to, nor can they remove properties from, the list of properties that are
considered sensitive.

Using Encryption
Encryption, as used by package protection levels, is performed by using the
Microsoft Data Protection API (DPAPI), which is part of the Cryptography API
(CryptoAPI).
The package protection levels that encrypt packages by using passwords require
that you provide a password also. If you change the protection level from a level
that does not use a password to one that does, you will be prompted for a
password.
Also, for the protection levels that use a password, Integration Services uses the
Triple DES cipher algorithm with a key length of 192 bits, available in the .NET
Framework Class Library (FCL).
Protection Levels
1. Do not save sensitive (DontSaveSensitive)
Suppresses the values of sensitive properties in the package when the package
is saved. This protection level does not encrypt, but instead it prevents
properties that are marked sensitive from being saved with the package and
therefore makes the sensitive data unavailable to other users. If a different user
opens the package, the sensitive information is replaced with blanks and the
user must provide the sensitive information.
When used with the dtutil utility (dtutil.exe), this protection level corresponds to
the value of 0.
2. Encrypt all with password (EncryptAllWithPassword)
Uses a password to encrypt the whole package. The package is encrypted by
using a password that the user supplies when the package is created or
exported. To open the package in SSIS Designer or run the package by using the
dtexec command prompt utility, the user must provide the package password.

Without the password the user cannot access or run the package.
When used with the dtutil utility, this protection level corresponds to the value of
3.
3. Encrypt all with user key (EncryptAllWithUserKey)
Uses a key that is based on the current user profile to encrypt the whole
package. Only the user who created or exported the package can open the
package in SSIS Designer or run the package by using the dtexec command
prompt utility.
When used with the dtutil utility, this protection level corresponds to the value of
4.
Note
For protection levels that use a user key, Integration Services uses DPAPI
standards. For more information about DPAPI, see the MSDN Library at
http://msdn.microsoft.com/library.
4. Encrypt sensitive with password (EncryptSensitiveWithPassword)
Uses a password to encrypt only the values of sensitive properties in the
package. DPAPI is used for this encryption. Sensitive data is saved as a part of
the package, but that data is encrypted by using a password that the current
user supplies when the package is created or exported. To open the package in
SSIS Designer, the user must provide the package password. If the password is
not provided, the package opens without the sensitive data and the current user
must provide new values for sensitive data. If the user tries to execute the
package without providing the password, package execution fails. For more
information about passwords and command line execution, see dtexec Utility
(SSIS Tool).
When used with the dtutil utility, this protection level corresponds to the value of
2.
5. Encrypt sensitive with user key (EncryptSensitiveWithUserKey)
Uses a key that is based on the current user profile to encrypt only the values of
sensitive properties in the package. Only the same user who uses the same
profile can load the package. If a different user opens the package, the sensitive
information is replaced with blanks and the current user must provide new values
for the sensitive data. If the user attempts to execute the package, package
execution fails. DPAPI is used for this encryption.
When used with the dtutil utility, this protection level corresponds to the value of
1.
Note For protection levels that use a user key, Integration Services uses DPAPI
standards. For more information about DPAPI, see the MSDN Library at
http://msdn.microsoft.com/library.
6. Rely on server storage for encryption (ServerStorage)
Protects the whole package using SQL Server database roles. This option is
supported only when a package is saved to the SQL Server msdb database. It is
not supported when a package is saved to the file system from Business
Intelligence Development Studio.
you change the protection level as listed in the following steps:
1. During development, leave the protection level of packages set to the
default value, EncryptSensitiveWithUserKey. This setting helps ensure that only

the developer sees sensitive values in the package. Or, you can consider using
EncryptAllWithUserKey, or DontSaveSensitive.
2. When it is time to deploy the packages, you have to change the protection
level to one that does not depend on the developer's user key. Therefore you
typically have to select EncryptSensitiveWithPassword, or
EncryptAllWithPassword. Encrypt the packages by assigning a temporary strong
password that is also known to the operations team in the production
environment.
3. After the packages have been deployed to the production environment, the
operations team can re-encrypt the deployed packages by assigning a strong
password that is known only to them. Or, they can encrypt the deployed
packages by selecting EncryptSensitiveWithUserKey or EncryptAllWithUserKey,
and using the local credentials of the account that will run the packages

You might also like