# Vectorizing strcmp

3 messages
Open this post in threaded view
|

## Vectorizing strcmp

 On the one hand, it's nice that strcmp is a user-defined function---it's easy to look at the code to see what it does. On the other hand, the reason I'm looking at the code is that it is slow. I'm using strcmp(s1,s2) where s1 is a cell array of strings (around 1000X1) and s2 is a string. Currently, in 2.1.58, strcmp does n = length(t1); for i = 1:n    retval(i,:) = strcmp (t1{i}, s2); endfor which recursively calls strcmp for two strings and where t1 is s1(:). The for loop is slow. Short of making strcmp a built-in function, one way to speed it up is to replace the for loop with a repmat like this c1 = char(t1); [rc1, cc1] = size(c1); ns2 = size(s2,2); blank = ' '; ms2 = repmat([s2 repmat(blank,1,max(cc1 - ns2, 0))], rc1, 1); mc1 = [c1 repmat(blank,1,max(ns2 - cc1, 0))]; retval2 = sum(mc1==ms2, 2)==size(mc1, 2); The blank is to pad the strings if char(s1) and s2 are different widths. The second code fragment is about 10 times faster than the first for length(s1)==1000. For length(s1)==10 the speed is about the same. And for length(s1) < 10 the first approach is faster. For me the added complexity in the code is more than offset by the gain in speed.