0% found this document useful (0 votes)
19 views

Fabric Assignment

Uploaded by

maulikpan16
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
19 views

Fabric Assignment

Uploaded by

maulikpan16
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 4

Fabric Assignment

Module 1: Ingestion
Data Ingestion using shortcuts to ADLS
Connection details:

Connection URL: https://tridentsourcestorage.dfs.core.windows.net


Connection Name: Fabric_Workshop_Connection
ShortcutName: Same as table name
SubPath: adls-gold/StackOverflow_Database/DeltaTables/{ShortcutName}

Instruction:
Step 1: Create a new lakehouse named <<EmailAlias>>_LH_Bronze.
Step 2: Find the table names as follows:
['Badges', 'badgeschsk', 'Comments', 'Currency', 'LinkTypes', 'PostLinks', 'Posts', 'PostTypes', 'Users',
'Votes', 'VoteTypes']
Step 4: For each of the tables above, create a ADLS shortcut in the Tables section of
<<EmailAlias>>_LH_Bronze lakehouse.
Step 5: Use the Connection details as mentioned above to create a total of 11 delta tables via ADLS
shortcuts.

Output:
Submit a Word document containing the following proof of successful execution:
1. Screenshot of the lakehouse that shows all the ingested tables.

Note: Make sure that your email address is visible in all the screenshots

Module 2: Storage
Data Exploration within Lakehouse
Instruction:
Step 1: In the bronze lakehouse in the top right corner, switch to SQL analytics endpoint
Step 2: Write a SQL query to retrieve user ID, username and total score (arranged in descending
order) for each user using Comments and Users tables and save the final result as a view named
vScorePerUser
Step 3: Build visual query that performs the below operations:
a. Rename badgeschsk table to LocationDetail
b. Remove columns SpanishCountryRegionName and FrenchCountryRegionName
c. Rename column EnglishCountryRegionName to CountryRegionName
d. Save the final output as a view named vLocation
Step 3: Create a lakehouse named <<EmailAlias>>_LH_Silver
Step 4: Create a shortcut in the "Tables" section to bring Badges, Comments, Posts, Users and Votes
tables from the bronze lakehouse in the silver lakehouse
Output:
Submit a Word document containing the following proof of successful execution:
a. Screenshot of SQL query code and generated SQL code for the visual query
b. Screenshot of the data in both the views vScorePerUser and vLocation

Note: Make sure that your email address is visible in all the screenshots

Module 3: Processing
Data Cleaning and Transformation operations
Instruction:

Silver layer processing

Step 1: Create a notebook named <<EmailAlias>>_Silver_Transformation within the Silver


Lakehouse.
Step 2: Perform the transformations below:
1. Load data from Comments table into a dataframe named as comments_df.
2. Notice that the CreationDate column contains both Date as well as timestamp. Create 2
separate columns, one with the Date column and timestamp (in the hh:mm:ss format) in the
other column.
3. Convert datatype of Date column to DateType.
4. Extract year, month number, week number and day number of month as an integer from the
value in the Date column. For example, if the value in date column is 2002/01/01, the
respective values of the newly created columns should be as follows:
a. Year - 2002
b. Month - 1
c. Week Number - 1
d. Day Number - 1
5. Load data from Users table into a dataframe named as users_df.
6. Remove AboutMe column as it contains all null values.
7. Write back comments_df as CommentsNew table and users_df as UsersNew table in the
same <<EmailAlias>>_LH_Silver Lakehouse.
8. Load new tables in 2 different dataframes and output the modified schema and the number
of columns in the newly created tables.

Gold layer processing

Step 1: Create a Lakehouse named <<EmailAlias>>_LH_Gold and a notebook named


<<EmailAlias>>_Gold_Transformation in it.
Step 2: Create shortcut to bring all modified tables from the Silver Lakehouse to the Gold Lakehouse.
Step 3: Explore data in the updated Votes and Comments tables, find out the columns on which join
operation can be performed and write code to perform join to output the consolidated information
in both the tables.
Step 4: Store the output back to the Gold Lakehouse as a delta table named as VotesWithComments.

Output:
Submit a Word document containing the following proof of successful execution:
1. Screenshot of the modified schema and number of columns in the two newly created tables
during silver layer processing.
2. Screenshot that shows the data and the number of rows in VotesWithComments table
created in gold layer.

Note: Make sure that your email address is visible in all the screenshots

Module 4: Reporting
Creating a DirectLake PBI report
Instruction:
Step 1: Switch to SQL analytics endpoint in the gold lakehouse.
Step 2: On the Home tab, select ‘New report’ icon and build a DirectLake report named as
<<EmailAlias>>_PBI_Report
Step 3: Create measures for the following -
1. Total posts
2. Total tags
3. Total users
4. Total badges
Step 4 : Create the following relationships-

Relationship

voteswithcomments to Badges

voteswithcomments to Users

voteswithcomments to Comments

Comments to posts

Comments to votes

Step 5 : Create a report that displays the visuals mentioned below -


1. A doughnut chart visual that displays the number of views per location for the locations
below:
a. Amsterdam, Netherlands
b. Bangalore, India
c. Cape town, South Africa
d. Seattle, WA
e. California
f. Cambridge, United Kingdom
2. A stacked column chart that shows count of upvotes by username.

Output:
Submit a Word document containing the following proof of execution:
1. Screenshot of the model view in the SQL endpoint, make sure that all the relationships
created are clearly visible
2. Report screenshot
Note: Naming convention of the Word document should be <<EmailAlias>>_<<Module#>> and make
sure that your email address is visible in all the screenshots

You might also like