Compare values in two strings and then yield a result that can be placed in an array
I have two strings in Python that I have converted to lists:
    Seq1 = [x1,x2,x3,x4]
    Seq2 = [y1,y2,y3,y4]
The strings are the same length and are composed of only the letters 'a', 'c', 'g', and 'u'.
Then I created a zero-filled matrix of size len(Seq1) by len(Seq2):
    a = numpy.zeros(shape=(len(Seq1), len(Seq2)))
Next, I want to compare the list values and place a 1 if the values match and 0 if they don't. The value should be placed in the relevant array element i.e.
    if Seq1[0] == Seq2[0]:
        a[0,0] = 1
    else:
        a[0,0] = 0
    # repeat for all the values.
    print a
I had a loop that was working but it only filled in the first row and column. I can see that it's a problem with a range function like Seq1[i] == Seq2[j] but I can't figure it out.
A compact way to write the loop is:
    import itertools
    for i1, i2 in itertools.product(xrange(len(Seq1)), xrange(len(Seq2))):
        a[i1, i2] = Seq1[i1] == Seq2[i2]
Iterate over both lists and compare:
    for x in range(len(Seq1)):
        for y in range(len(Seq2)):
            a[x, y] = (Seq1[x] == Seq2[y])
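For reference, the nested loop can be sketched as a complete script on Python 3. The input strings "acgu" and "aagg" and the integer dtype are assumptions chosen for illustration; any two sequences over 'a', 'c', 'g', 'u' should work the same way.

```python
import numpy as np

# Illustrative inputs (assumed); the question only says the strings use a/c/g/u.
seq1 = list("acgu")
seq2 = list("aagg")

# One row per element of seq1, one column per element of seq2.
a = np.zeros((len(seq1), len(seq2)), dtype=int)

for i in range(len(seq1)):
    for j in range(len(seq2)):
        # The boolean result is stored as 1 (match) or 0 (mismatch).
        a[i, j] = seq1[i] == seq2[j]

print(a)
```

Both indices have to run over their full ranges inside the loops; a matrix with only the first row and column filled usually means one of the indices was left fixed at 0.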
I assume that this is a bioinformatics question, although the purpose is unclear to me. Here is a generic matching loop that you can use:
    >>> for s1 in xrange(len(seq1)):
    ...     for s2 in xrange(len(seq2)):
    ...         if seq1[s1]==seq2[s2]:
    ...             a[s1,s2]=1
    ...         else:
    ...             a[s1,s2]=0
I wouldn't use nested loops at all; the outer method of NumPy's ufuncs can do it for you:
    Python 2.7.1 (r271:86882M, Nov 30 2010, 10:35:34)
    [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy
    >>> seq1 = "acgu"
    >>> seq2 = "aagg"
    >>> numpy.equal.outer(map(ord, seq1), map(ord, seq2))
    array([[ True,  True, False, False],
           [False, False, False, False],
           [False, False,  True,  True],
           [False, False, False, False]], dtype=bool)
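On Python 3, map returns an iterator rather than a list, so the call above would need its arguments materialized first. A minimal sketch of the same idea, assuming Python 3 and a reasonably recent NumPy, shows two equivalent routes: broadcasting character arrays against each other, and equal.outer on lists of character codes. The variable names match and match_outer are illustrative, not from the original answer.

```python
import numpy as np

seq1 = "acgu"
seq2 = "aagg"

# Turn each string into an array of single characters.
s1 = np.array(list(seq1))
s2 = np.array(list(seq2))

# Broadcasting a column of seq1 against a row of seq2 compares every pair.
match = (s1[:, None] == s2[None, :]).astype(int)

# Same result via the ufunc's outer method; list comprehensions replace
# map() so the arguments are real sequences on Python 3.
match_outer = np.equal.outer(
    [ord(c) for c in seq1],
    [ord(c) for c in seq2],
).astype(int)

print(match)
```

The astype(int) call converts the boolean result to the 1/0 matrix asked for in the question; drop it if a boolean array is acceptable.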