The run-time complexity of the set functions should be as follows:

create_set = O (n log(n)) since you have to look up

each new element in the existing set before

inserting it, and lookup is O(log n). So the

sort algorithm is a fine way to handle this.

union = O(n+m) since in the worst case, you need to

copy both sets.

intersection, complement = O(max(n,m)) if the set

sizes are similar, or O(m log(n)) if m

is much smaller than n, since in the worst

case you need to copy the whole set (m~n)

or just look up a few elements in the

larger set (m<<n).

Octave 2.1.31 uses an O(n log n) algorithm for create_set and union, and

an O(n) interpreted loop for intersection, complement. The CVS archive

has an O(n log n) algorithm for intersection and complement as well,

which is faster than the O(n) algorithm only because for loops are so

slow. If you are concerned about speed for very large sets, it is worth

recoding union, intersect, and complement in C++ using the O(n) algorithm,

otherwise get the most recent scripts from the archives.

Paul Kienzle

[hidden email]
On Wed, Nov 15, 2000 at 09:52:44AM -0500, Lippert, Ross A. wrote:

>

>

>

> I have made a modification to create_set because I found myself

> often wanting not just the set of unique elements but also the

> counts of those elements in the original input. Perhaps that is

> just my own eccentricity.

>

> However, if one modifies create_set in this way (returning a

> second output of counts), then I believe that the runtime complexity

> of intersection and complement can be reduced to n log n instead of

> n^2 (which they appear to be now, but I am no expert on complexity).

>

> Anyhow, as a quick example I tried

> x = fix(20000*rand(1000,1)); y = fix(20000*rand(1000,1));

>

> and found a big difference between the timings of

> intersection(x,y), so I think I am on the right track.

>

> I'd be sending this out in the form of a patch, but for the moment,

> I'd really like feedback to tell me if I have done something stupid

> so I don't waste time updating my CVS, installing the new scripts

> and then diffing all to find out I missed something.

>

>

>

> -r

>

>