Python_Basics_Exercises
rahulsiyanwal@gmail.com
University of Hyderabad
February 16, 2024
ASSIGNMENT 01
During this practice session, we’ll apply what we have learned to practical problems. It is recommended,
but not required, that students review the two notebooks provided previously. Please follow the
instructions in each exercise to complete the tasks. Once you are done, email me your work and you can
then ask for the solutions. Wishing you the best of luck!
Disclaimer: My background is not in the life sciences, so please take the scenarios presented in this
assignment with a grain of salt. The primary goal here is to give you an opportunity to practice Python
programming hands-on. If you find an error, please email me the corrected version.
Instructions:
1. Define Your DNA Sequence: Start by defining a variable for your DNA sequence. This sequence
should be a string composed of the nucleotide characters ’A’, ’T’, ’C’, and ’G’. For instance, you might
use ’ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG’ as your sequence.
2. Prepare for RNA Transcription: Initialize an empty string to hold the RNA sequence that will
result from transcribing the DNA sequence.
3. Transcribe DNA to RNA:
(a) Write a loop to iterate over each nucleotide in your DNA sequence.
(b) Within the loop, use a conditional statement to check if the nucleotide is ’T’:
i. If it is ’T’, append ’U’ to your RNA sequence string.
ii. If it is not ’T’, append the nucleotide itself to your RNA sequence string.
4. Output the RNA Sequence: After processing all the nucleotides in the DNA sequence, print the
resulting RNA sequence. This step verifies the successful transcription of your DNA sequence to RNA.
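The four steps above can be sketched as follows. This is one possible layout rather than the reference solution, and the variable names `dna_sequence` and `rna_sequence` are illustrative choices.

```python
# Step 1: the example sequence suggested in the instructions.
dna_sequence = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# Step 2: start with an empty string for the RNA.
rna_sequence = ""

# Step 3: visit each nucleotide and swap 'T' for 'U'.
for nucleotide in dna_sequence:
    if nucleotide == "T":
        rna_sequence += "U"
    else:
        rna_sequence += nucleotide

# Step 4: show the result.
print(rna_sequence)
```

Each `+=` builds a new string rather than modifying the old one, which is how the loop works around string immutability.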
Hints:
• Consider how strings are immutable in Python, which means you cannot change them once created.
However, you can build a new string by appending characters to it.
• Pay close attention to how you iterate over the string and how you use conditional logic to replace ’T’
with ’U’.
Expected Output:
Your script should output an RNA sequence that is identical to the DNA sequence except that all ’T’s are
replaced with ’U’s. This simulates the RNA transcription process.
Challenge Yourself:
After you’ve completed the basic version of this exercise, try to write a more compact version using Python’s
string methods or a comprehension. This can introduce you to more advanced Python features and practices.
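As a hedged illustration of the compact version, the sketch below shows two equivalent approaches: the built-in `str.replace` method and a generator expression passed to `str.join`. The variable names are illustrative only.

```python
dna_sequence = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# str.replace returns a new string; the original is left unchanged.
rna_via_replace = dna_sequence.replace("T", "U")

# The same result, built from a generator expression joined into one string.
rna_via_join = "".join("U" if base == "T" else base for base in dna_sequence)

print(rna_via_replace)
print(rna_via_replace == rna_via_join)
```

Both forms produce the same string as the explicit loop; `str.replace` is usually the more readable choice for a single substitution.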
Instructions:
1. Initialize Your DNA Sequence: Start by defining a variable to hold your DNA sequence. This
sequence should be a string consisting of the characters ’A’, ’T’, ’C’, and ’G’. For example, you might
choose a sequence like ’ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG’.
2. Prepare for Complementary Strand Synthesis: Create a new variable to hold the complementary
DNA sequence. Initially, this should be an empty string since you’ll be building it character by character.
3. Process Each Nucleotide:
(a) Write a loop that goes through each character (nucleotide) in your original DNA sequence.
(b) Inside the loop, use conditional statements (if-elif-else) to check the nucleotide:
i. If the nucleotide is ’A’, its complement is ’T’.
ii. If the nucleotide is ’T’, its complement is ’A’.
iii. If the nucleotide is ’C’, its complement is ’G’.
iv. If the nucleotide is ’G’, its complement is ’C’.
(c) Add the complement of each nucleotide to the variable holding your complementary DNA sequence.
4. Display the Result: After processing all nucleotides, print the original DNA sequence and its
complementary sequence. This step helps verify that your script works correctly.
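A minimal sketch of these steps is shown below. It assumes the input contains only the four valid nucleotides, so the final `else` branch is taken to mean ’G’; the variable names are illustrative.

```python
dna_sequence = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# Build the complement one character at a time.
complement_sequence = ""

for nucleotide in dna_sequence:
    if nucleotide == "A":
        complement_sequence += "T"
    elif nucleotide == "T":
        complement_sequence += "A"
    elif nucleotide == "C":
        complement_sequence += "G"
    else:
        # Assumption: the input is valid, so the only remaining case is 'G'.
        complement_sequence += "C"

print("Original:  ", dna_sequence)
print("Complement:", complement_sequence)
```

Printing the two strings one above the other makes it easy to check the pairing column by column.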
Hints:
• Remember that strings are immutable in Python. You cannot change them in place but can concatenate
(add) characters to build a new string.
• Pay close attention to how you structure your loop and conditional statements. This is key to correctly
identifying each nucleotide’s complement.
Expected Output:
Your script should output the original DNA sequence and its complementary sequence. Each ’A’ in the
original sequence should correspond to a ’T’ in the complement, each ’T’ to an ’A’, each ’C’ to a ’G’, and
each ’G’ to a ’C’.
Challenge Yourself:
Once you’ve successfully written the basic version of the script, consider enhancing it. For example, you
could add error checking to ensure the original DNA sequence contains only valid nucleotides (’A’, ’T’, ’C’,
’G’).
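One way to approach the error check is to compare the set of characters in the sequence against the set of valid nucleotides. The helper name `find_invalid_nucleotides` and the sample input are hypothetical, chosen only for this sketch.

```python
VALID_NUCLEOTIDES = set("ATCG")


def find_invalid_nucleotides(sequence):
    """Return a sorted list of characters in sequence that are not A, T, C, or G."""
    return sorted(set(sequence) - VALID_NUCLEOTIDES)


# Sample input containing one deliberately invalid character ('X').
dna_sequence = "ATGGCCXTTGTA"

invalid = find_invalid_nucleotides(dna_sequence)
if invalid:
    print(f"Invalid characters found: {invalid}")
else:
    print("Sequence is valid.")
```

Running the check before the main loop keeps the if-elif-else logic simple, since it can then assume clean input.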
4. Output the Amino Acid Sequence: After processing the entire DNA sequence or stopping at a
stop codon, write the amino acid sequence to a file.
Hints:
• Remember to handle the DNA sequence in chunks of three nucleotides to correctly map each codon to
its amino acid.
• Learn about the .get() method (http://tinyurl.com/getMethod). Use the .get() method of dictionaries
to safely retrieve the value for a given codon, allowing for a default value if the codon is not found in
your table.
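The two hints can be combined as in the sketch below. The codon table here is deliberately partial and is only an illustration (a complete table has 64 entries); marking stop codons with ’*’ and unknown codons with ’?’ are assumptions made for this example, not requirements of the exercise.

```python
# Partial codon table for illustration only; a full table has 64 entries.
codon_table = {
    "ATG": "M", "GCC": "A", "ATT": "I", "GTA": "V",
    "GGC": "G", "CGC": "R",
    "TGA": "*", "TAA": "*", "TAG": "*",  # stop codons, marked "*" by assumption
}

dna_sequence = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
protein = ""

# Step through the sequence three nucleotides at a time.
for start in range(0, len(dna_sequence), 3):
    codon = dna_sequence[start:start + 3]
    # .get() returns "?" instead of raising KeyError for a missing codon.
    amino_acid = codon_table.get(codon, "?")
    if amino_acid == "*":
        break  # stop translation at the first stop codon
    protein += amino_acid

print(protein)  # MAIVMGR
```

With this input, translation ends at the ’TGA’ codon, so the codons after it are never looked up.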
Expected Output:
Your script should output the amino acid sequence derived from the given DNA sequence, stopping translation
if a stop codon is encountered.
Challenge Yourself:
Consider enhancing your script to handle cases where the DNA sequence length is not a multiple of three.
How might you ensure that partial codons at the end of the sequence do not disrupt your translation?
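One possible answer, sketched under the assumption that trailing nucleotides should simply be ignored, is to compute the largest multiple of three that fits in the sequence and slice codons only up to that point. The names `usable_length`, `codons`, and `leftover` are illustrative.

```python
dna_sequence = "ATGGCCATTGT"  # 11 nucleotides, not a multiple of three

# Largest multiple of three that fits in the sequence.
usable_length = len(dna_sequence) - len(dna_sequence) % 3

# Complete codons only; the partial codon at the end is excluded.
codons = [dna_sequence[i:i + 3] for i in range(0, usable_length, 3)]
leftover = dna_sequence[usable_length:]

print(codons)    # ['ATG', 'GCC', 'ATT']
print(leftover)  # GT
```

Keeping the leftover nucleotides in a separate variable lets the script warn about them instead of silently discarding them.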
Instructions:
1. Define Two DNA Sequences: Choose two DNA sequences for alignment. These sequences should be
strings composed of ’A’, ’T’, ’C’, and ’G’. For example, you might use ’GATTACA’ and ’GCATGCG’.
Hints:
• Dynamic programming is used to break down this problem into simpler, smaller problems. Each cell
in the scoring matrix represents a subproblem.
• The traceback step is crucial for reconstructing the alignment from the scoring matrix. Pay attention
to the conditions that determine whether to move diagonally, up, or left at each step.
Expected Output:
The script should output two aligned sequences that represent the best alignment according to the given
scoring criteria, along with the alignment score.
Challenge Yourself:
After implementing the basic alignment algorithm, consider extending your solution to handle different scoring
schemes or to implement a different alignment algorithm like Smith-Waterman for local alignment.
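For orientation, here is a minimal sketch of a global alignment using the Needleman-Wunsch algorithm, the usual dynamic-programming counterpart to Smith-Waterman. The scoring values (match +1, mismatch -1, gap -1), the function name `needleman_wunsch`, and the example inputs are assumptions made for illustration; substitute the scoring criteria given in your exercise.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_a, aligned_b, score)."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1

    # Scoring matrix: the first row and column hold cumulative gap penalties.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill each cell from its diagonal, upper, and left neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two characters
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )

    # Traceback from the bottom-right corner to the top-left corner.
    aligned_a, aligned_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                aligned_a.append(seq_a[i - 1])
                aligned_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])  # move up
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")           # move left
            aligned_b.append(seq_b[j - 1])
            j -= 1

    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b)), score[-1][-1]


aligned_a, aligned_b, alignment_score = needleman_wunsch("GATTACA", "GCATGCG")
print(aligned_a)
print(aligned_b)
print("Score:", alignment_score)
```

Several alignments can share the best score; this sketch prefers the diagonal move when there is a tie, so a different tie-breaking order may print a different but equally good alignment. Changing the recurrence to clamp scores at zero and starting the traceback from the highest-scoring cell is the main step toward Smith-Waterman.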