Hello Genome Detectives. Thank you again for your incredible efforts - we have posted an update on the project results tab. We are taking a break while we develop the next iteration of Genome Detectives, but have left you a data set here for practice only – PLEASE NOTE THESE DATA ARE ALREADY COMPLETE. If you are ready for more of a challenge then please check out our new 'Training Academy' website. To browse other active projects that still need your classifications, check out zooniverse projects.

Education

Information for this project

An overview


Most bacteria carry their DNA as a circular chromosome, which contains the genetic material needed to keep the bacterium alive and functioning. Genes are located along the chromosome; most encode a specific protein, although some have other roles. In bacteria, the DNA of a gene is normally read continuously in one direction, beginning at a START signal (the ‘start codon’, green) and ending at a STOP signal (the ‘stop codon’, red). The DNA sequence specifies a sequence of amino acids, which the bacterium makes and assembles into a protein. This protein does a specific job for the bacterium, such as energy production, growth, reproduction, or protection from threats. A DNA sequence can be read in 3 frames, frame 1, 2, or 3 (F1, F2, F3), depending on which base the reading starts at.

DNA, genes and proteins in detail

What is DNA?

  • DeoxyriboNucleic Acid, or DNA, is a long string-like molecule that contains the instructions for making all living creatures. It is a bit like written text, except that it uses an alphabet of only four letters (A, C, G, and T).
  • DNA is built from 4 nucleotides, each carrying one base: adenine (A), thymine (T), guanine (G), or cytosine (C). The bases are attached to a sugar-phosphate backbone.
  • The bases A and T always pair together, as do G and C.
  • The backbone and paired bases together form a helical (or twisted) ladder formation.
  • And for a little added chemistry, the bases and the backbone are made up of hydrogen, oxygen, nitrogen, carbon, and phosphorus atoms.
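The pairing rule above (A with T, G with C) can be sketched in a few lines of Python. This is only an illustration: the names `PAIRS` and `complement` are made up for this example and are not part of any project tool.

```python
# Pairing rules from the list above: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return, position by position, the base that pairs with each base of `strand`."""
    return "".join(PAIRS[base] for base in strand.upper())

print(complement("ATGC"))  # TACG
```

Because the pairing is fixed, knowing one side of the ladder is enough to work out the other.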

What is a gene?

  • A gene is a length of DNA that encodes a protein. Because proteins carry out most of the functions of living creatures, it is important to know the exact sequence of code needed to make them.
  • A gene sequence is a string of bases (A, T, G, C) arranged into codons of 3 bases. It includes a beginning (the start codon, shown below in green), an end (the stop codon, shown below in red), and multiple codons in between.
  • Each codon (3 bases) of the DNA sequence encodes an amino acid, the building block of proteins (as shown above: M, R, V, Q, P, S and so on until the stop codon, shown as *).
  • The DNA sequence before the start codon (in green) is referred to as the ‘upstream sequence’ and the DNA sequence after the stop codon (in red) is called the ‘downstream sequence’.
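As a rough sketch of the layout described above, the following Python splits a DNA string into upstream sequence, gene, and downstream sequence. The function name `split_gene` and the example sequence are invented for illustration, and the approach (take the first start codon that has a stop codon in the same frame) is a simplifying assumption, not how the project or real gene-finding software works.

```python
# Start and stop codons as listed on this page.
START_CODONS = {"ATG", "GTG", "TTG", "CTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

def split_gene(dna: str):
    """Return (upstream, gene, downstream) for the first start codon that is
    followed by a stop codon in the same frame, or None if there is none."""
    dna = dna.upper()
    for start in range(len(dna) - 2):
        if dna[start:start + 3] not in START_CODONS:
            continue
        # Step through the sequence one codon (3 bases) at a time.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                end = pos + 3  # the gene includes its stop codon
                return dna[:start], dna[start:end], dna[end:]
    return None

print(split_gene("CCATGCGTGTTTAAGG"))  # ('CC', 'ATGCGTGTTTAA', 'GG')
```

In the example, `CC` is the upstream sequence, `ATG CGT GTT TAA` is the gene (start codon, two codons, stop codon), and `GG` is the downstream sequence.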

What is a codon?

  • DNA bases (A, T, G, C) can be thought of as an alphabet that makes up the three-letter words found in DNA. The three-letter words are called ‘codons’.
  • Each codon specifies a single chemical compound called an amino acid.
  • For example, the codon ‘TTT’ makes the amino acid phenylalanine, represented by the letter F.
  • The DNA sequence of a gene always begins with an initiation (or start) codon. The most common start codon is ATG, which encodes methionine (M). There are 3 alternative start codons: GTG, which encodes valine (V), and TTG and CTG, which both encode leucine (L). The leucine start codons are less common.
  • The DNA sequence of a gene always ends with a termination (or stop) codon. There are 3 stop codons: TAA, TAG, and TGA. Stop codons do not encode an amino acid, so the chain of amino acids ends there.
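The codon rules above can be tried out with a short Python sketch. It builds the standard codon table (the same one that gives TTT = F) and reads a sequence three bases at a time until it reaches a stop codon. The names `CODON_TABLE` and `translate` are made up for this example, and the sequence is chosen to give the M, R, V, Q, P, S amino acids mentioned earlier.

```python
from itertools import product

# Standard codon table, written compactly: codons are generated in the order
# TTT, TTC, TTA, TTG, TCT, ... and matched to one amino acid letter each.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Read `dna` one codon at a time, stopping after the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        protein.append(amino_acid)
        if amino_acid == "*":
            break
    return "".join(protein)

print(CODON_TABLE["TTT"])                  # F
print(translate("ATGCGTGTTCAGCCTTCTTAA"))  # MRVQPS*
```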

What is a protein?

  • Proteins are the fundamental molecules that make all living creatures grow and develop. Proteins can maintain structure, like collagen or keratin, or act as enzymes or channels.
  • Proteins are made from a string of amino acids.
  • Amino acids are encoded by the codons shown above.
  • One gene codes for one protein.
  • Interesting fact: 3 different amino acid sequences can be read from just one DNA sequence. These three ways of reading are called ‘reading frames’ and are labelled F1, F2, and F3 depending on where you start reading the sequence.
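To make the idea of reading frames concrete, here is a minimal Python sketch that reads the same DNA sequence starting at its first, second, and third base. The function name `reading_frames` and the example sequence are invented for illustration; the codon table is the standard one.

```python
from itertools import product

# Standard codon table; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def reading_frames(dna: str) -> dict:
    """Translate `dna` in frames F1, F2, and F3 (starting at base 1, 2, or 3).
    Leftover bases at the end that do not fill a codon are ignored."""
    dna = dna.upper()
    frames = {}
    for offset in range(3):
        codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
        frames[f"F{offset + 1}"] = "".join(CODON_TABLE[c] for c in codons)
    return frames

print(reading_frames("ATGCGTGTTTAA"))
# {'F1': 'MRV*', 'F2': 'CVF', 'F3': 'ACL'}
```

Only F1 in this example begins with a start codon and ends with a stop codon; shifting the starting point by one or two bases gives completely different amino acids.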