How to make a random but partial shuffle in Python?
Instead of a complete shuffle, I am looking for a partial shuffle function in python.
Example : "string" must give rise to "stnrig", but not "nrsgit"
It would be better if I can define a specific "percentage" of characters that have to be rearranged.
Purpose is to test string comparison algorithms. I want to determine the "percentage of shuffle" beyond which an(my) algorithm will mark two (shuffled) strings as completely different.
Here is my code. Improvements are welcome !
import random percent_to_shuffle = int(raw_input("Give the percent value to shuffle : ")) to_shuffle = list(raw_input("Give the string to be shuffled : ")) num_of_chars_to_shuffle = int((len(to_shuffle)*percent_to_shuffle)/100) for i in range(0,num_of_chars_to_shuffle): x=random.randint(0,(len(to_shuffle)-1)) y=random.randint(0,(len(to_shuffle)-1)) z=to_shuffle[x] to_shuffle[x]=to_shuffle[y] to_shuffle[y]=z print ''.join(to_shuffle)
Your problem is tricky, because there are some edge cases to think about:
- Strings with repeated characters (i.e. how would you shuffle "aaaab"?)
- How do you measure chained character swaps or re arranging blocks?
In any case, the metric defined to shuffle strings up to a certain percentage is likely to be the same you are using in your algorithm to see how close they are.
My code to shuffle n characters:
import random def shuffle_n(s, n): idx = range(len(s)) random.shuffle(idx) idx = idx[:n] mapping = dict((idx[i], idx[i-1]) for i in range(n)) return ''.join(s[mapping.get(x,x)] for x in range(len(s)))
Basically chooses n positions to swap at random, and then exchanges each of them with the next in the list... This way it ensures that no inverse swaps are generated and exactly n characters are swapped (if there are characters repeated, bad luck).
Explained run with 'string', 3 as input:
idx is [0, 1, 2, 3, 4, 5] we shuffle it, now it is [5, 3, 1, 4, 0, 2] we take just the first 3 elements, now it is [5, 3, 1] those are the characters that we are going to swap s t r i n g ^ ^ ^ t (1) will be i (3) i (3) will be g (5) g (5) will be t (1) the rest will remain unchanged so we get 'sirgnt'
The bad thing about this method is that it does not generate all the possible variations, for example, it could not make 'gnrits' from 'string'. This could be fixed by making partitions of the indices to be shuffled, like this:
import random def randparts(l): n = len(l) s = random.randint(0, n-1) + 1 if s >= 2 and n - s >= 2: # the split makes two valid parts yield l[:s] for p in randparts(l[s:]): yield p else: # the split would make a single cycle yield l def shuffle_n(s, n): idx = range(len(s)) random.shuffle(idx) mapping = dict((x[i], x[i-1]) for i in range(len(x)) for x in randparts(idx[:n])) return ''.join(s[mapping.get(x,x)] for x in range(len(s)))
This is a problem simpler than it looks. And the language has the right tools not to stay between you and the idea,as usual:
import random def pashuffle(string, perc=10): data = list(string) for index, letter in enumerate(data): if random.randrange(0, 100) < perc/2: new_index = random.randrange(0, len(data)) data[index], data[new_index] = data[new_index], data[index] return "".join(data)