New AI tool can decode DNA sequences

7 Aug 2024

Recently, findings on a new tool, “GROVER”, which can extract important information from DNA sequences, were published in the journal Nature Machine Intelligence.

About DNA

  • DNA, or deoxyribonucleic acid, is the central information storage system of most animals and plants, and even some viruses.
    • DNA is organised structurally into chromosomes and then wound around nucleosomes as part of those chromosomes. 
  • Structure: The name comes from its structure, a sugar (deoxyribose) and phosphate backbone with bases sticking out from it.
    • It is a polymer built from four bases: Adenine (A), Cytosine (C), Guanine (G), and Thymine (T).
  • Double Helix model: In 1953 James Watson and Francis Crick, based on the X-ray diffraction data produced by Maurice Wilkins and Rosalind Franklin, proposed a very simple but famous Double Helix model for the structure of DNA. 
    • A DNA molecule consists of two strands wound around each other, with the two strands held together by hydrogen bonds between the bases. Adenine pairs with thymine, and cytosine pairs with guanine.
    • Gene: The sequence of bases in a portion of a DNA molecule, called a gene, carries the instructions needed to assemble a protein.

  • Hallmarks: Base pairing between the two strands of polynucleotide chains.
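
The pairing rule above (A with T, C with G) means that one strand fully determines its partner. A minimal Python sketch of that idea follows; the function names `complement` and `reverse_complement` are illustrative choices, not something taken from the article or from GROVER. The second function relies on the well-known fact that the two strands run in opposite directions, so the partner strand is conventionally read in reverse.

```python
# Base-pairing rules: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction (reversed)."""
    return complement(strand)[::-1]


print(complement("ATGCGT"))          # TACGCA
print(reverse_complement("ATGCGT"))  # ACGCAT
```

Any base other than A, T, G, or C raises a `KeyError` in this sketch, which is a deliberate simplification.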

About GROVER

  • GROVER is a new large language model, trained on human DNA, that can extract important information from DNA sequences, such as identifying gene promoters or protein binding sites.
  • Significance: The researchers believe tools like GROVER could help transform genomics and personalized medicine. 
  • To train GROVER, the team at the Biotechnology Center (BIOTEC) of Dresden University of Technology in Germany first created a ‘DNA dictionary’.
  • The DNA Dictionary: DNA resembles language. It has four letters that build sequences, and the sequences carry a meaning.
    • DNA has four letters (A, T, G, and C) and genes, but unlike a language it has no predefined words: there are no predefined sequences of different lengths that combine to build genes or other meaningful sequences.
    • Information hidden in the DNA is multilayered. Only 1-2% of the genome consists of genes, the sequences that code for proteins.
  • GROVER Role: Grover learns the grammar of DNA.
    • In terms of the DNA code, this means learning the rules of the sequences, i.e. the order of the nucleotides and their meaning.
    • For example: Similar to how GPT models learn human languages, Grover has basically learned to speak DNA.
  • GROVER Functioning: Grover can not only predict the DNA sequence that follows a given stretch, but also derive biologically relevant information from the context, such as the start of genes or protein binding sites on the DNA.
    • Grover also learns processes that are considered “epigenetic”.
      • Epigenetics: It is the study of how cells control gene activity without changing the DNA sequence. 

GROVER Training

  • DNA dictionary using byte pair encoding (BPE): To train Grover, the team first created a DNA dictionary using byte pair encoding (BPE), a tokenization strategy used by transformer models such as GPT-3. They examined the entire genome for the most common letter combinations.
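
The idea of building a dictionary from the most common letter combinations can be illustrated with a toy version of byte pair encoding. The Python sketch below is only an assumption-laden illustration, not GROVER's actual training code: the function names, the tiny example sequence, and the number of merges are all hypothetical. It starts from single letters, repeatedly finds the most frequent adjacent pair of tokens, and merges that pair into a new dictionary entry.

```python
from collections import Counter


def most_common_pair(tokens):
    """Return the most frequent adjacent token pair, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(tokens, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def build_dna_dictionary(sequence, num_merges):
    """Toy BPE: grow a vocabulary by merging the most common adjacent pair."""
    tokens = list(sequence)      # start with single letters
    vocab = sorted(set(tokens))  # initial dictionary: the letters present
    for _ in range(num_merges):
        pair = most_common_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
        vocab.append(pair[0] + pair[1])
    return vocab, tokens


vocab, tokens = build_dna_dictionary("ATATATGC", num_merges=2)
print(vocab)   # ['A', 'C', 'G', 'T', 'AT', 'ATAT']
print(tokens)  # ['ATAT', 'AT', 'G', 'C']
```

In this toy run, "AT" is the most frequent pair and is merged first, after which "ATAT" becomes a dictionary entry of its own. A real DNA dictionary would be built over the whole genome with many more merges.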

 
