Azure Data Factory Advanced Interview Questions and Answers

1) How do you rerun a pipeline from the Data Factory Monitor?
Ans: In the Monitor tab, a pipeline run offers three rerun options:
i) Rerun: the pipeline restarts from the first activity. Activities that already succeeded are triggered again, so before choosing Rerun verify that re-running them is acceptable; if the pipeline contains copy activities, there is a chance of reloading the same data.
ii) Rerun from failed activity: if an activity fails, times out, or is canceled, you can rerun the pipeline from that failed activity by selecting Rerun from failed activity.
iii) Rerun from activity: if you wish to rerun starting at a specific point, you can do so from the activity runs view. Select the activity you wish to start from and select Rerun from activity.

2) If you have 15 activities in a pipeline and you want to debug only the first 10, how do you do that?
Ans: At the top of every activity there is a red circle. If you select it and choose "Debug Until", the pipeline is debugged up to and including that activity.
4) How do you make one activity depend on another when a pipeline has multiple activities, and what dependency conditions exist?
Activity Dependency defines how subsequent activities depend on previous activities, determining
the condition of whether to continue executing the next task. An activity can depend on one or
multiple previous activities with different dependency conditions.
The different dependency conditions are: Succeeded, Failed, Skipped, and Completed.
For example, if a pipeline has Activity A -> Activity B, the possible scenarios are: with Succeeded, B runs only when A ends successfully; with Failed, B runs only when A fails; with Completed, B runs when A either succeeds or fails; with Skipped, B runs only when A is skipped (A is skipped, for example, when its own dependency conditions are not met).
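For illustration, a minimal sketch of how such a dependency appears in the pipeline JSON is shown below; the activity names and the Wait activity are placeholders, and only the dependency-related properties matter here:

  {
    "name": "ActivityB",
    "type": "Wait",
    "dependsOn": [
      {
        "activity": "ActivityA",
        "dependencyConditions": [ "Succeeded" ]
      }
    ],
    "typeProperties": { "waitTimeInSeconds": 1 }
  }

Changing "Succeeded" to "Failed", "Completed", or "Skipped" switches between the scenarios described above.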
5) How can you share a self-hosted integration runtime with another data factory?
Grant the other data factory permission on the existing self-hosted IR, then, in the data factory to which the permissions were granted, create a new self-hosted IR (linked) and enter the resource ID of the shared IR.
6) What is a Logic App? Or how do we send an email notification in Azure Data Factory? Or what is the Web activity and when can we use it?
We can use the Web activity for multiple scenarios; a common one is pairing a Logic App (which sends the email) with ADF (which handles the error). The communication between these two Azure parts is done with a JSON message sent via an HTTP request (POST). The JSON message contains the name of the data factory and the pipeline that failed, an error message, and an email address. You could of course hardcode the email address in the Logic App, but passing it in lets you reuse the Logic App for various pipelines or data factories and notify different people.
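As an illustration, the JSON body posted to the Logic App could look roughly like the sketch below. The property names are not fixed by ADF (they only have to match the schema you define in the Logic App's HTTP request trigger), and the activity name in the error expression and the email address are placeholders:

  {
    "DataFactoryName": "@{pipeline().DataFactory}",
    "PipelineName": "@{pipeline().Pipeline}",
    "ErrorMessage": "@{activity('Copy source data').error.message}",
    "EmailTo": "team@example.com"
  }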
Next, we will add a new step to our Logic App, called "Send an email". I will use Gmail, but if you want to use another email provider, pick that one.
Is this the first time you are connecting a Gmail account on Azure? Then you need to connect your Gmail account to Azure by signing in. (Note: allow pop-ups in your browser.)
After creating and saving the Logic App, Azure generates an endpoint URL for it, which you will find in the first step (the HTTP request trigger). Copy this URL to a notepad; we will need it later in the ADF Web activity.
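A rough sketch of the corresponding Web activity in the ADF pipeline is shown below; the URL, activity names, and body values are placeholders, and the dependency with the Failed condition on the copy activity is what makes the notification fire only on failure:

  {
    "name": "Notify failure",
    "type": "WebActivity",
    "dependsOn": [
      {
        "activity": "Copy source data",
        "dependencyConditions": [ "Failed" ]
      }
    ],
    "typeProperties": {
      "url": "<logic-app-endpoint-url-from-the-first-step>",
      "method": "POST",
      "headers": { "Content-Type": "application/json" },
      "body": {
        "DataFactoryName": "@{pipeline().DataFactory}",
        "PipelineName": "@{pipeline().Pipeline}",
        "ErrorMessage": "@{activity('Copy source data').error.message}",
        "EmailTo": "team@example.com"
      }
    }
  }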
7) What is the Get Metadata activity and when do we use it?
You can use the Get Metadata activity to retrieve the metadata of any data in Azure Data Factory. You can use this activity in the following scenarios:
You can use the output from the Get Metadata activity in conditional expressions to
perform validation.
You can trigger a pipeline when a condition is satisfied via Do Until looping.
The Get Metadata activity takes a dataset as input and returns metadata information as output. A defined set of connectors is supported, each with its own list of retrievable metadata fields. The maximum size of returned metadata is around 4 MB.
childItems: the list of subfolders and files in the given folder (applicable only to folders). The returned value is a list of the name and type of each child item.
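For example, a Get Metadata activity that asks only for childItems could look roughly like the sketch below (the activity and dataset names are placeholders); a ForEach activity can then iterate over @activity('Get folder contents').output.childItems:

  {
    "name": "Get folder contents",
    "type": "GetMetadata",
    "typeProperties": {
      "dataset": {
        "referenceName": "SourceFolderDataset",
        "type": "DatasetReference"
      },
      "fieldList": [ "childItems" ]
    }
  }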
8) What is the Lookup activity and when do we use it?
The Lookup activity can retrieve a dataset from any of the Azure Data Factory-supported data sources. A typical scenario is reading a configuration value or a list of objects and passing the result to subsequent activities in the pipeline.
When firstRowOnly is set to true (the default), the output format is as shown in the sketch below: the lookup result is under a fixed firstRow key. To use the result in a subsequent activity, use the pattern @{activity('LookupActivity').output.firstRow.table}.
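As an illustration, suppose the lookup query returns a single column named table (matching the expression above); the activity output would then look roughly like this, and @{activity('LookupActivity').output.firstRow.table} would resolve to dbo.SalesOrders:

  {
    "firstRow": {
      "table": "dbo.SalesOrders"
    }
  }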
9) If you are running a large number of pipelines and they are taking a long time to execute, how do you resolve this type of issue?
We can split the pipelines into batches and create multiple integration runtimes. The load is then shared across those integration runtimes, which improves performance when many pipelines run at once.
10) What is the auto-resolve integration runtime in Azure Data Factory?
It is the default Azure integration runtime (AutoResolveIntegrationRuntime) that every data factory has. Its region is set to auto-resolve, meaning Data Factory automatically picks the most suitable region to run each activity (for a copy activity, typically the region of the sink data store) instead of you pinning it to a fixed region.
11) Data Factory supports three types of triggers. Mention these types briefly
The Schedule trigger that is used to execute the ADF pipeline on a wall-clock schedule
The Tumbling window trigger that is used to execute the ADF pipeline on a periodic
interval, and retains the pipeline state
The Event-based trigger that responds to a blob related event, such as adding or deleting
a blob from an Azure storage account
Pipelines and triggers have a many-to-many relationship (except for the tumbling window
trigger). Multiple triggers can kick off a single pipeline, or a single trigger can kick off
multiple pipelines.
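To make the first type concrete, a schedule trigger definition looks roughly like the sketch below; the trigger name, recurrence values, and pipeline reference are placeholders:

  {
    "name": "DailyTrigger",
    "properties": {
      "type": "ScheduleTrigger",
      "typeProperties": {
        "recurrence": {
          "frequency": "Day",
          "interval": 1,
          "startTime": "2024-01-01T06:00:00Z",
          "timeZone": "UTC"
        }
      },
      "pipelines": [
        {
          "pipelineReference": {
            "referenceName": "CopySalesPipeline",
            "type": "PipelineReference"
          }
        }
      ]
    }
  }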
12) Any Data Factory pipeline can be executed using three methods. Mention these methods.
Under Debug mode in the authoring canvas; via manual, on-demand execution using Trigger now; and by adding a trigger (schedule, tumbling window, or event-based) that runs the pipeline automatically.
13) How do we load data whenever we receive a file in Azure Data Factory? Or, how do we run a pipeline when a file is created or deleted?
Create an event-based (storage event) trigger on the pipeline, point it at the Azure Storage account, and configure its blob path filters:
Blob path begins with: The blob path must start with a folder path. Valid values
include 2018/ and 2018/april/shoes.csv. This field can't be selected if a container isn't
selected.
Blob path ends with: The blob path must end with a file name or extension. Valid
values include shoes.csv and .csv. Container and folder name are optional but, when
specified, they must be separated by a /blobs/ segment. For example, a container
named 'orders' can have a value of /orders/blobs/2018/april/shoes.csv. To specify a
folder in any container, omit the leading '/' character. For example, april/shoes.csv will
trigger an event on any file named shoes.csv in a folder called 'april' in any container.
Note: Blob path begins with and ends with are the only pattern matching allowed in
Event Trigger. Other types of wildcard matching aren't supported for the trigger type.
Select whether your trigger will respond to a Blob created event, Blob deleted event, or both.
In your specified storage location, each event will trigger the Data Factory pipelines associated
with the trigger.
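Putting the settings above together, a storage event trigger definition could look roughly like this sketch; the trigger name, pipeline reference, and storage account scope are placeholders:

  {
    "name": "NewOrdersFileTrigger",
    "properties": {
      "type": "BlobEventsTrigger",
      "typeProperties": {
        "scope": "<resource-id-of-the-storage-account>",
        "blobPathBeginsWith": "/orders/blobs/2018/april/",
        "blobPathEndsWith": ".csv",
        "events": [ "Microsoft.Storage.BlobCreated" ]
      },
      "pipelines": [
        {
          "pipelineReference": {
            "referenceName": "LoadOrdersPipeline",
            "type": "PipelineReference"
          }
        }
      ]
    }
  }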
The tumbling window trigger and the schedule trigger both operate on time heartbeats.
How are they different?
The tumbling window trigger run waits for the triggered pipeline run to finish. Its run state
reflects the state of the triggered pipeline run. For example, if a triggered pipeline run is
cancelled, the corresponding tumbling window trigger run is marked cancelled. This is
different from the "fire and forget" behavior of the schedule trigger, which is marked
successful as long as a pipeline run started.
Backfill scenarios
Tumbling Window: Supported. Pipeline runs can be scheduled for windows in the
past.
Scheduled Trigger: Not supported. Pipeline runs can be executed only on time
periods from the current time and the future.
Pipeline-to-trigger relationship
Tumbling Window: Supports a one-to-one relationship. Only one pipeline can be
triggered.
Scheduled Trigger: Supports many-to-many relationships. Multiple triggers can kick
off a single pipeline. A single trigger can kick off multiple pipelines.
Note:
To build a dependency chain and make sure that a trigger is executed only after the successful execution of another trigger in the data factory, use a tumbling window trigger dependency: a tumbling window trigger can depend on another tumbling window trigger, and its windows run only after the windows of the trigger it depends on have completed successfully.
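As a sketch of that setup (trigger and pipeline names are placeholders), the downstream tumbling window trigger declares the upstream trigger in its dependsOn list:

  {
    "name": "DownstreamHourlyTrigger",
    "properties": {
      "type": "TumblingWindowTrigger",
      "typeProperties": {
        "frequency": "Hour",
        "interval": 1,
        "startTime": "2024-01-01T00:00:00Z",
        "maxConcurrency": 1,
        "dependsOn": [
          {
            "type": "TumblingWindowTriggerDependencyReference",
            "referenceTrigger": {
              "referenceName": "UpstreamHourlyTrigger",
              "type": "TriggerReference"
            }
          }
        ]
      },
      "pipeline": {
        "pipelineReference": {
          "referenceName": "DownstreamPipeline",
          "type": "PipelineReference"
        }
      }
    }
  }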