BioPython: Understanding Transcription with BioPython

Shubham Thorat


Transcription is a fundamental process in molecular biology where a DNA sequence is converted into messenger RNA (mRNA). In this article, we'll explore how transcription works and how to perform it using BioPython, a powerful tool for bioinformatics analyses.


Image by: National Human Genome Research Institute


Understanding DNA Strands

DNA is composed of two complementary strands: the coding strand (also known as the Crick strand or strand +1) and the template strand (also known as the Watson strand or strand -1).

The coding strand is the one whose sequence corresponds directly to the mRNA, except that T (thymine) is replaced with U (uracil) in the mRNA.

Let's consider a hypothetical DNA sequence encoding a short peptide:


    DNA coding strand (aka Crick strand, strand +1)

5’  ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG  3’
    |||||||||||||||||||||||||||||||||||||||
3’  TACCGGTAACATTACCCGGCGACTTTCCCACGGGCTATC  5’

    DNA template strand (aka Watson strand, strand -1)

                       |
                 Transcription
                       |

5’  AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG  3’

    Single-stranded messenger RNA



In the cell, mRNA is synthesized using the DNA template strand as the pattern. In BioPython, and in bioinformatics generally, we usually work with the coding strand instead, because the mRNA sequence can then be obtained simply by replacing each thymine (T) with uracil (U).
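The equivalence of the two routes can be checked in a few lines of plain Python. This is only an illustrative sketch and does not use BioPython; the helper name synthesize_mrna and the RNA_PAIRING table are assumptions made for this example. It pairs each base of the template strand (written 3’ to 5’, the orientation used in the diagram) with its RNA partner and compares the result with the coding strand after replacing T with U.

```python
# Plain-Python illustration (not BioPython): two routes to the same mRNA.

# Template strand of the example sequence, written 3' -> 5'.
template_3to5 = "TACCGGTAACATTACCCGGCGACTTTCCCACGGGCTATC"
# Coding strand of the example sequence, written 5' -> 3'.
coding_5to3 = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# DNA base on the template -> RNA base placed opposite it.
RNA_PAIRING = {"A": "U", "T": "A", "C": "G", "G": "C"}


def synthesize_mrna(template: str) -> str:
    """Hypothetical helper: pair each template base with its RNA partner."""
    return "".join(RNA_PAIRING[base] for base in template)


# Route 1: build the mRNA base by base from the template strand.
mrna_from_template = synthesize_mrna(template_3to5)
# Route 2: the shortcut, replacing T with U in the coding strand.
mrna_from_coding = coding_5to3.replace("T", "U")

print(mrna_from_template)
print(mrna_from_template == mrna_from_coding)
```

Both routes give the same string, which is why working from the coding strand is the usual shortcut.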


🚀 Running Python in Google Colab? If you're unsure how to run this code, check out our Google Colab getting started guide here for easy steps! 📝



Performing Transcription in BioPython

Now, let's perform transcription in BioPython. 
First, we'll create Seq objects for the coding and template DNA strands:

from Bio.Seq import Seq

# Coding strand, written 5' to 3'
coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
# Template strand, also written 5' to 3'
template_dna = coding_dna.reverse_complement()


Here, coding_dna represents the coding strand, and template_dna represents the template strand, which is obtained by taking the reverse complement of the coding DNA sequence.
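To see what taking the reverse complement involves, here is a minimal plain-Python sketch. It is not BioPython's implementation, and the function names complement and reverse_complement below are ordinary helpers written for this illustration. It assumes an uppercase sequence containing only A, C, G and T.

```python
# Plain-Python illustration (not BioPython's implementation).
# Assumes uppercase input containing only A, C, G and T.
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def complement(dna: str) -> str:
    """Swap each base for its pairing partner, keeping the original order."""
    return dna.translate(COMPLEMENT_TABLE)


def reverse_complement(dna: str) -> str:
    """Complement the sequence, then reverse it so it reads 5' -> 3'."""
    return complement(dna)[::-1]


coding = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# Partner strand in the order it pairs with the coding strand (3' -> 5').
print(complement(coding))
# The same partner strand written in the conventional 5' -> 3' direction.
print(reverse_complement(coding))
```

Applying reverse_complement twice returns the original sequence, which is what makes the two-step route from the template strand back to the coding strand possible.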

Transcribing the coding strand into mRNA is as simple as using the transcribe() method:

messenger_rna = coding_dna.transcribe()
print("Transcribed mRNA:", messenger_rna)

Output:

Transcribed mRNA: AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG


You can see that the method replaced each occurrence of T with U, consistent with the conversion from DNA to RNA.

If you want to start from the template strand instead, it's a two-step process. First, take the reverse complement of the template strand to recover the coding strand, then transcribe:

# Step 1: recover the coding strand from the template strand
template_complement = template_dna.reverse_complement()
# Step 2: replace T with U
messenger_rna = template_complement.transcribe()
print("Transcribed mRNA from template strand:", messenger_rna)

Output:

Transcribed mRNA from template strand: AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG



Back-Transcribing mRNA to DNA

BioPython also lets you convert mRNA back to the coding DNA strand using the back_transcribe() method, which replaces each U with T:

messenger_rna = Seq("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG")
coding_dna = messenger_rna.back_transcribe()
print("Back-transcribed DNA from mRNA:", coding_dna)

Output:

Back-transcribed DNA from mRNA: ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG
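Because transcription and back-transcription only swap T and U, the two operations undo each other. The plain-Python sketch below illustrates that round trip with simple string replacement; the helper functions are written for this example and only mimic the behaviour described above, they are not BioPython's own code.

```python
# Plain-Python illustration of the T <-> U round trip (not BioPython's code).

def transcribe(dna: str) -> str:
    """Replace thymine with uracil (upper and lower case)."""
    return dna.replace("T", "U").replace("t", "u")


def back_transcribe(rna: str) -> str:
    """Replace uracil with thymine (upper and lower case)."""
    return rna.replace("U", "T").replace("u", "t")


coding = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
mrna = transcribe(coding)
recovered = back_transcribe(mrna)

print(mrna)
print(recovered == coding)
```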



Conclusion

Transcription is a vital process in molecular biology, and BioPython simplifies it with its intuitive methods. By understanding the concept of DNA strands and utilizing BioPython's functionality, you can easily transcribe DNA sequences into mRNA and vice versa, facilitating various bioinformatics analyses and research endeavors. Whether you're studying gene expression or manipulating genetic sequences, BioPython is a powerful tool to have in your bioinformatics toolkit.


1 Comment

  1. Some additional info:

    We use the reverse complement rather than the plain complement when creating the template DNA because the two strands of DNA run antiparallel, while sequences are conventionally written in the 5' to 3' direction.

    Here's why we need the reverse complement:

    Directionality:

    In transcription, mRNA is synthesized in the 5' to 3' direction while the template DNA strand is read in the 3' to 5' direction. The complement of the coding strand, taken base by base, is therefore the template strand written 3' to 5'. Reversing it gives the template strand in the conventional 5' to 3' orientation.


    Complementary Base Pairing:

    In DNA, adenine (A) pairs with thymine (T), and cytosine (C) pairs with guanine (G). During transcription the same pairing rules apply, except that uracil (U) takes the place of thymine (T) in the mRNA. Complementing the template strand therefore recovers the bases of the coding strand, and reversing the result restores their 5' to 3' order, which is why reverse_complement() followed by transcribe() yields the mRNA.
