Algorithm practice: The RNA transcriber

23 Sep 2019 - John


I’ve been trying a set of exercises from exercism.io and came across one that took me a bit longer than I expected. We are given a strand of DNA, and for each of its letters, we need to return its corresponding RNA sequence. DNA is made out of four nucleotides, adenine (A), cytosine (C), guanine (G) and thymine (T). For each one of them, there is a corresponding RNA nucleotide: adenine (A), cytosine (C), guanine (G) and uracil (U).

Image credit: Unknown

They are represented like this:

DNA -> RNA
 G  ->  C
 C  ->  G
 T  ->  A
 A  ->  U

So if we’re given the sequence 'ACGTGGTCTTAA', we should return 'UGCACCAGAAUU'.

The naive approach

We’re going to build a function with an empty array that we’re going to fill with a specific character based on the current iteration of a loop we’re going to build to go over the given strand:

const toRna = (dna) => {
  let rna = [];
  dna.split('').forEach(element => {
    if (element === 'G') {
      rna.push('C');
    } else if (element === 'C') {
      rna.push('G');
    } else if (element === 'T') {
      rna.push('A');
    } else if (element === 'A') {
      rna.push('U');
    }
  });
  return rna.join('');
};

By all means, this works, but our space complexity is too big, not to mention the if/else blocks are less than elegant. We could find a better way.

A better approach:

This time we’re going to build an object with key/value pairs, then we’re going to loop over the given strand to look for the key in our object and fill an empty array with the corresponding value. Then we join the array and return it:

const toRna = (dna) => {
  let rna = {
    G: 'C',
    C: 'G',
    T: 'A',
    A: 'U'
  }
  let result = [];
  dna.split('').forEach(element => {
    result.push(rna[element]);
  });
  return result.join('');
};

Much better! this time we’re splitting our string and going through each one of its elements. For every element we look at, we have a corresponding value that we’re going to push to our empty array. However, this approach can still be further refined.
An even better approach:
We use a forEach loop when we have map?

const rna = {
  G: 'C',
  C: 'G',
  T: 'A',
  A: 'U'
}
const toRna = (dna) => {
  return dna.split('').map((letter) => {
    return rna[letter];
  }).join('');
};

By using map, not only we get rid of the side variable result, but we’re also able to rebuild our string by applying .join() right off the bat.

Can we make it even better?

Yes we can. We can further refine our function with regex, which we’ll explore next week. Stay tuned!