Vector approach to row margin frequencies

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

Vector approach to row margin frequencies

Webb S.
In an attempt to learn more about octave, I am trying to figure out how to
calculate the row frequencies for the following matrix (call it X):

3,560 5,421 15,983
4,768 9,777 8,033
4,960 5,251 8,390
5,518 5,205 3,402
7,847 21,538 83,351
3,376 6,026 9,606
11,704 19,316 50,241
4,644 4,967 6,607
12,171 26,274 134,419
2,689 6,069 15,849
18,010 29,712 100,240
4,121 5,287 11,214
5,968 12,433 149,215
1,870 2,773 18,760
5,562 12,213 109,476
1,109 2,977 10,452

I am able to get the row totals with S=sum(X,2):

    24964
    22578
    18601
    14125
   112736
    19008
    81261
    16218
   172864
    24607
   147962
    20622
   167616
    23403
   127251
    14538

I would love to now get the row frequencies avoiding a while loop, something
like F=X./S, but that doesn't work (though it seems somehow consistent -- I
realize that we don't recycle vectors in Octave/ Mat* ) ....

Perhaps there is something like an "apply" command?

I can do this in a while loop if I have to, but I don't want to...

Tx!



_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Webb S.

> I would love to now get the row frequencies avoiding a while loop, something
> like F=X./S, but that doesn't work (though it seems somehow consistent -- I
> realize that we don't recycle vectors in Octave/ Mat* ) ....

Feeling a little silly, here is an answer to my question:

rowMargins =  bar ./ repmat(sum(bar,2),[1 columns(bar)])
colMargins =  bar ./ repmat(sum(bar,1),[rows(bar) 1])

Duh, sorry about the noise, hopefully some poor fool will be helped by this....

_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Judd Storrs
In reply to this post by Webb S.
On Tue, Jun 23, 2009 at 6:39 PM, ws <[hidden email]> wrote:
I would love to now get the row frequencies avoiding a while loop, something
like F=X./S, but that doesn't work (though it seems somehow consistent -- I
realize that we don't recycle vectors in Octave/ Mat* ) ....

Are trying to divide each row by its sum? I think the bsxfun will do this:

bsxfun(@times,X,1./S)

I'm not sure if there is a direct function name for "./". If you can figure that out you can also avoid the separate division.

--judd



_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Judd Storrs


On Tue, Jun 23, 2009 at 7:40 PM, Judd Storrs <[hidden email]> wrote:
On Tue, Jun 23, 2009 at 6:39 PM, ws <[hidden email]> wrote:
I would love to now get the row frequencies avoiding a while loop, something
like F=X./S, but that doesn't work (though it seems somehow consistent -- I
realize that we don't recycle vectors in Octave/ Mat* ) ....

Are trying to divide each row by its sum? I think the bsxfun will do this:

bsxfun(@times,X,1./S)

I'm not sure if there is a direct function name for "./". If you can figure that out you can also avoid the separate division.

--judd



I found it. It's @ldivide

tic
Y1 = X ./ repmat(sum(X,2),[1 columns(X)]);
toc
Elapsed time is 0.02987 seconds.

tic
Y2 = bsxfun(@times,X,1./sum(X,2));
toc
Elapsed time is 0.0077 seconds.

tic
Y3 = bsxfun(@rdivide,X,sum(X,2));
toc
Elapsed time is 0.0002439 seconds.

--judd


_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Judd Storrs
Sorry, I messed that up. It should have been @rdivide in the previous mail. You can actually use either @ldivide or @rdivide but I had the order incorrect for @ldivide. Sorry. The correct versions are:

Y1 = X ./ repmat(sum(X,2),[1 columns(X)]);
Y2 = bsxfun(@times,X,1./sum(X,2));
Y3 = bsxfun(@rdivide,X,sum(X,2))
Y4 = bsxfun(@ldivide,sum(X,2),X)

For some reason the bxsfun runs are faster from inside a m-script than directly on the command line (3.0.1 maybe different in 3.2).

--judd


On Tue, Jun 23, 2009 at 8:02 PM, Judd Storrs <[hidden email]> wrote:


On Tue, Jun 23, 2009 at 7:40 PM, Judd Storrs <[hidden email]> wrote:
On Tue, Jun 23, 2009 at 6:39 PM, ws <[hidden email]> wrote:
I would love to now get the row frequencies avoiding a while loop, something
like F=X./S, but that doesn't work (though it seems somehow consistent -- I
realize that we don't recycle vectors in Octave/ Mat* ) ....

Are trying to divide each row by its sum? I think the bsxfun will do this:

bsxfun(@times,X,1./S)

I'm not sure if there is a direct function name for "./". If you can figure that out you can also avoid the separate division.

--judd



I found it. It's @ldivide

tic
Y1 = X ./ repmat(sum(X,2),[1 columns(X)]);
toc
Elapsed time is 0.02987 seconds.

tic
Y2 = bsxfun(@times,X,1./sum(X,2));
toc
Elapsed time is 0.0077 seconds.

tic
Y3 = bsxfun(@rdivide,X,sum(X,2));
toc
Elapsed time is 0.0002439 seconds.

--judd



_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Jaroslav Hajek-2
In reply to this post by Webb S.
On Wed, Jun 24, 2009 at 12:39 AM, ws<[hidden email]> wrote:

> In an attempt to learn more about octave, I am trying to figure out how to
> calculate the row frequencies for the following matrix (call it X):
>
> 3,560   5,421   15,983
> 4,768   9,777   8,033
> 4,960   5,251   8,390
> 5,518   5,205   3,402
> 7,847   21,538  83,351
> 3,376   6,026   9,606
> 11,704  19,316  50,241
> 4,644   4,967   6,607
> 12,171  26,274  134,419
> 2,689   6,069   15,849
> 18,010  29,712  100,240
> 4,121   5,287   11,214
> 5,968   12,433  149,215
> 1,870   2,773   18,760
> 5,562   12,213  109,476
> 1,109   2,977   10,452
>
> I am able to get the row totals with S=sum(X,2):
>
>    24964
>    22578
>    18601
>    14125
>   112736
>    19008
>    81261
>    16218
>   172864
>    24607
>   147962
>    20622
>   167616
>    23403
>   127251
>    14538
>
> I would love to now get the row frequencies avoiding a while loop, something
> like F=X./S, but that doesn't work (though it seems somehow consistent -- I
> realize that we don't recycle vectors in Octave/ Mat* ) ....
>
> Perhaps there is something like an "apply" command?
>
> I can do this in a while loop if I have to, but I don't want to...
>
> Tx!
>
>

In 3.2.0, the fastest way is also the most obvious (from math point of view):

F = X / diag (S)

In 3.0.x, take any of the bsxfun or repmat workarounds (or just use
the above and upgrade to 3.2 soon)

cheers

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Webb S.
> In 3.2.0, the fastest way is also the most obvious (from math point of view):
>
> F = X / diag (S)
>
> In 3.0.x, take any of the bsxfun or repmat workarounds (or just use
> the above and upgrade to 3.2 soon)

Well ....

octave:7> version
ans = 3.2.0
octave:8> X/diag(S)
error: operator /: nonconformant arguments (op1 is 16x3, op2 is 16x16)

I will play with the bsxfun thing, though, just to understand it...

Thanks!

_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Jaroslav Hajek-2
On Wed, Jun 24, 2009 at 4:48 PM, ws<[hidden email]> wrote:

>> In 3.2.0, the fastest way is also the most obvious (from math point of view):
>>
>> F = X / diag (S)
>>
>> In 3.0.x, take any of the bsxfun or repmat workarounds (or just use
>> the above and upgrade to 3.2 soon)
>
> Well ....
>
> octave:7> version
> ans = 3.2.0
> octave:8> X/diag(S)
> error: operator /: nonconformant arguments (op1 is 16x3, op2 is 16x16)
>
> I will play with the bsxfun thing, though, just to understand it...
>
> Thanks!

Oh yes, sorry, I confused rows and columns. diag(S) \ X




--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Judd Storrs
On Wed, Jun 24, 2009 at 10:56 AM, Jaroslav Hajek <[hidden email]> wrote:
On Wed, Jun 24, 2009 at 4:48 PM, ws<[hidden email]> wrote:
>> In 3.2.0, the fastest way is also the most obvious (from math point of view):
>>
>> F = X / diag (S)
>>
>> In 3.0.x, take any of the bsxfun or repmat workarounds (or just use
>> the above and upgrade to 3.2 soon)
>
> Well ....
>
> octave:7> version
> ans = 3.2.0
> octave:8> X/diag(S)
> error: operator /: nonconformant arguments (op1 is 16x3, op2 is 16x16)
>
> I will play with the bsxfun thing, though, just to understand it...
>
> Thanks!

Oh yes, sorry, I confused rows and columns. diag(S) \ X

Nah, you just didn't specifiy what S was ;) Depending on whether S is sum(X,1) or sum(X,2) you need one of the following:

X / diag(sum(S,1))
diag(sum(S,2)) \ X

The diag() stuff is really cool. Now I'll have to look into whether reshape+diag+reshape is faster than bsxfun for 3D arrays. Thanks Jaroslav!

--judd
 

_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Webb S.
> diag(sum(S,2)) \ XThe diag() stuff is really cool. Now I'll have to look into
whether reshape+diag+reshape is faster than bsxfun for 3D arrays. Thanks
Jaroslav!--judd 

Cool indeed!  bsxfun vs diag vs repmat makes me remember why matrix math is so
much fun!

Thanks to all!
>
>
> _______________________________________________
> Help-octave mailing list
> Help-octave <at> octave.org
> https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
>




_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Jaroslav Hajek-2
In reply to this post by Judd Storrs
On Wed, Jun 24, 2009 at 7:17 PM, Judd Storrs<[hidden email]> wrote:

> On Wed, Jun 24, 2009 at 10:56 AM, Jaroslav Hajek <[hidden email]> wrote:
>>
>> On Wed, Jun 24, 2009 at 4:48 PM, ws<[hidden email]> wrote:
>> >> In 3.2.0, the fastest way is also the most obvious (from math point of
>> >> view):
>> >>
>> >> F = X / diag (S)
>> >>
>> >> In 3.0.x, take any of the bsxfun or repmat workarounds (or just use
>> >> the above and upgrade to 3.2 soon)
>> >
>> > Well ....
>> >
>> > octave:7> version
>> > ans = 3.2.0
>> > octave:8> X/diag(S)
>> > error: operator /: nonconformant arguments (op1 is 16x3, op2 is 16x16)
>> >
>> > I will play with the bsxfun thing, though, just to understand it...
>> >
>> > Thanks!
>>
>> Oh yes, sorry, I confused rows and columns. diag(S) \ X
>
> Nah, you just didn't specifiy what S was ;) Depending on whether S is
> sum(X,1) or sum(X,2) you need one of the following:
>
> X / diag(sum(S,1))
> diag(sum(S,2)) \ X
>
> The diag() stuff is really cool. Now I'll have to look into whether
> reshape+diag+reshape is faster than bsxfun for 3D arrays. Thanks Jaroslav!
>

Surely is, because reshape is an O(1) operation (shares data, only
changes their interpretation). However, reshape/diag/reshape is only
sufficient if you want to scale the leading/trailing dimension of a
N-d array. If you need to scale a dimension in the middle (which is
uncommon, but can happen), you can use permute it to move it to the
end (or beginning, but end is better), or just turn again to repmat or
bsxfun.


--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Judd Storrs
On Thu, Jun 25, 2009 at 12:25 AM, Jaroslav Hajek <[hidden email]> wrote:
Surely is, because reshape is an O(1) operation (shares data, only
changes their interpretation). However, reshape/diag/reshape is only
sufficient if you want to scale the leading/trailing dimension of a
N-d array. If you need to scale a dimension in the middle (which is
uncommon, but can happen), you can use permute it to move it to the
end (or beginning, but end is better), or just turn again to repmat or
bsxfun.

Don't underestimate diag() for the middle dimension ;-)

X = ones(256,256,256) ;
f = hamming(256) ;

# merge dimensions 2 and 3, use diag
tic
g = diag(repmat(f,256,1)) ;
reshape(reshape(X,256,65536)*g,256,256,256) ;
toc
Elapsed time is 0.150348 seconds.

# merge dimensions 1 and 2, use diag
tic
g = diag(reshape(repmat(f',1,256),65536,1)) ;
reshape(g*reshape(X,65536,256),256,256,256) ;
toc
Elapsed time is 0.1631 seconds.

# repmat and perform 3D element-by-element multiplication
# most of the performance loss is in creating the full 3D factor
tic
g = repmat(f',[256,1,256]) ;
X.*g ;
toc
Elapsed time is 0.318903 seconds.

# bsxfun
# Performance loss is presumably due to repeated function evaluations
tic
g = reshape(f,1,256,1) ;
bsxfun(@times,X,g) ;
toc
Elapsed time is 2.264 seconds.

Is there any reason why bsxfun style singleton dimension expansion shouldn't be the *default* behavior for the builtin element-wise operators (.*, .+, .-) in octave? bsxfun also seems like the logical extension of the scalar with matrix cases.

Matlab compatibility could be a reason, but I don't think it would break any working Matlab code (unless perhaps some strange code is relying on the error handling).  I think the gain in expressivity and simplicity for octave code could very well be worth it?

Getting back to the original problem about scaling the matrix by row or by column sums if "./" behaved like bsxfun but avoided the repeated function evaluations then the blatently obvious expression of the algorithm would both work and be at least as fast as using diag, while also scaling to higher dimensions sanely:

X ./ sum(X,1)
X ./ sum(X,2)
X ./ sum(X,3)
etc


--judd

_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Jaroslav Hajek-2
On Thu, Jun 25, 2009 at 9:46 AM, Judd Storrs<[hidden email]> wrote:

> On Thu, Jun 25, 2009 at 12:25 AM, Jaroslav Hajek <[hidden email]> wrote:
>>
>> Surely is, because reshape is an O(1) operation (shares data, only
>> changes their interpretation). However, reshape/diag/reshape is only
>> sufficient if you want to scale the leading/trailing dimension of a
>> N-d array. If you need to scale a dimension in the middle (which is
>> uncommon, but can happen), you can use permute it to move it to the
>> end (or beginning, but end is better), or just turn again to repmat or
>> bsxfun.
>
> Don't underestimate diag() for the middle dimension ;-)
>
> X = ones(256,256,256) ;
> f = hamming(256) ;
>
> # merge dimensions 2 and 3, use diag
> tic
> g = diag(repmat(f,256,1)) ;
> reshape(reshape(X,256,65536)*g,256,256,256) ;
> toc
> Elapsed time is 0.150348 seconds.
>
> # merge dimensions 1 and 2, use diag
> tic
> g = diag(reshape(repmat(f',1,256),65536,1)) ;
> reshape(g*reshape(X,65536,256),256,256,256) ;
> toc
> Elapsed time is 0.1631 seconds.

Aww, cute. Manipulating both g and X to avoid permuting dimensions of
X didn't occur to me. This is of course much better than my permute()
solution. FYI, the permute() solution it seems 3x slower than the
above.

> # repmat and perform 3D element-by-element multiplication
> # most of the performance loss is in creating the full 3D factor
> tic
> g = repmat(f',[256,1,256]) ;
> X.*g ;
> toc
> Elapsed time is 0.318903 seconds.
>
> # bsxfun
> # Performance loss is presumably due to repeated function evaluations
> tic
> g = reshape(f,1,256,1) ;
> bsxfun(@times,X,g) ;
> toc
> Elapsed time is 2.264 seconds.

I'd thought bsxfun would be the loser. I intend to put in some
optimizations in the future.

> Is there any reason why bsxfun style singleton dimension expansion shouldn't
> be the *default* behavior for the builtin element-wise operators (.*, .+,
> .-) in octave? bsxfun also seems like the logical extension of the scalar
> with matrix cases.
>
> Matlab compatibility could be a reason, but I don't think it would break any
> working Matlab code (unless perhaps some strange code is relying on the
> error handling).  I think the gain in expressivity and simplicity for octave
> code could very well be worth it?
>
> Getting back to the original problem about scaling the matrix by row or by
> column sums if "./" behaved like bsxfun but avoided the repeated function
> evaluations then the blatently obvious expression of the algorithm would
> both work and be at least as fast as using diag,

Probably almost exactly as fast as diag, at least without repmat.

> while also scaling to
> higher dimensions sanely:
>
> X ./ sum(X,1)
> X ./ sum(X,2)
> X ./ sum(X,3)
> etc
>

I think the idea has been brought up multiple times in the past. A
major problem is implementation: somebody has to do it. Other minor
problems may include incorrect code that suddenly does some "magic"
rather than fail, and incompatibility with Matlab - some Octave users
limit themselves to Matlab-compatible subset. Scaling using diag, for
instance, has the advantage that it does work in Matlab.
But on general, I have no objections to that idea. However, in my
working, scaling/unscaling matrices is by far the most common
operation of this kind, and that is now solved elegantly using
diagonal matrices (which also work with sparse matrices efficiently),
so I have little motivation to actually do it.
I thought about optimizing bsxfun several times, but except the matrix
scaling case, It seems to me the result of bsxfun is usually a very
temporary expression that needs to be further reduced. A typical
instance is calculating the pairwise distances of two sets vectors. A
bsxfun + norm implementation would be better (with optimized bsxfun)
than a repmat + repmat + norm one, but still significantly slower than
a compiled function using loops.

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

John W. Eaton
Administrator
On 25-Jun-2009, Jaroslav Hajek wrote:

| On Thu, Jun 25, 2009 at 9:46 AM, Judd Storrs<[hidden email]> wrote:
|
| > Is there any reason why bsxfun style singleton dimension expansion shouldn't
| > be the *default* behavior for the builtin element-wise operators (.*, .+,
| > .-) in octave? bsxfun also seems like the logical extension of the scalar
| > with matrix cases.
| >
| > Matlab compatibility could be a reason, but I don't think it would break any
| > working Matlab code (unless perhaps some strange code is relying on the
| > error handling).  I think the gain in expressivity and simplicity for octave
| > code could very well be worth it?
| >
| > Getting back to the original problem about scaling the matrix by row or by
| > column sums if "./" behaved like bsxfun but avoided the repeated function
| > evaluations then the blatently obvious expression of the algorithm would
| > both work and be at least as fast as using diag,
|
| Probably almost exactly as fast as diag, at least without repmat.
|
| > while also scaling to
| > higher dimensions sanely:
| >
| > X ./ sum(X,1)
| > X ./ sum(X,2)
| > X ./ sum(X,3)
| > etc
| >
|
| I think the idea has been brought up multiple times in the past. A
| major problem is implementation: somebody has to do it. Other minor
| problems may include incorrect code that suddenly does some "magic"
| rather than fail, and incompatibility with Matlab - some Octave users
| limit themselves to Matlab-compatible subset. Scaling using diag, for
| instance, has the advantage that it does work in Matlab.
| But on general, I have no objections to that idea. However, in my
| working, scaling/unscaling matrices is by far the most common
| operation of this kind, and that is now solved elegantly using
| diagonal matrices (which also work with sparse matrices efficiently),
| so I have little motivation to actually do it.

I'd be careful about making the .-ops do row/column scaling.  I think
it would be far too easy to shoot yourself in the foot if they behaved
this way, so I'd be opposed to making this change.

jwe

_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Judd Storrs


On Thu, Jun 25, 2009 at 5:20 AM, John W. Eaton <[hidden email]> wrote:
On 25-Jun-2009, Jaroslav Hajek wrote:

| On Thu, Jun 25, 2009 at 9:46 AM, Judd Storrs<[hidden email]> wrote:
|
| > Is there any reason why bsxfun style singleton dimension expansion shouldn't
| > be the *default* behavior for the builtin element-wise operators (.*, .+,
| > .-) in octave? bsxfun also seems like the logical extension of the scalar
| > with matrix cases.
| >
| > Matlab compatibility could be a reason, but I don't think it would break any
| > working Matlab code (unless perhaps some strange code is relying on the
| > error handling).  I think the gain in expressivity and simplicity for octave
| > code could very well be worth it?
| >
| > Getting back to the original problem about scaling the matrix by row or by
| > column sums if "./" behaved like bsxfun but avoided the repeated function
| > evaluations then the blatently obvious expression of the algorithm would
| > both work and be at least as fast as using diag,
|
| Probably almost exactly as fast as diag, at least without repmat.
|
| > while also scaling to
| > higher dimensions sanely:
| >
| > X ./ sum(X,1)
| > X ./ sum(X,2)
| > X ./ sum(X,3)
| > etc
| >
|
| I think the idea has been brought up multiple times in the past. A
| major problem is implementation: somebody has to do it. Other minor
| problems may include incorrect code that suddenly does some "magic"
| rather than fail, and incompatibility with Matlab - some Octave users
| limit themselves to Matlab-compatible subset. Scaling using diag, for
| instance, has the advantage that it does work in Matlab.
| But on general, I have no objections to that idea. However, in my
| working, scaling/unscaling matrices is by far the most common
| operation of this kind, and that is now solved elegantly using
| diagonal matrices (which also work with sparse matrices efficiently),
| so I have little motivation to actually do it.

I'd be careful about making the .-ops do row/column scaling.  I think
it would be far too easy to shoot yourself in the foot if they behaved
this way, so I'd be opposed to making this change.

What about a new type instead? In this form it seems like a small project I can work on by following diag as an example (I have limited time for this, but Robert Short's example is inspiring). I'm having trouble thinking of *the* awesome name for this and the name isn't really important for now, but what about something like adapt(), expand(), grow(), match(), merge() or join()? The synyax would look like this:

X / adapt(sum(X,1))
adapt(sum(X,2)) \ X

Would something like this be acceptable or useful to anyone except me? 

--judd


_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

John W. Eaton
Administrator
On 25-Jun-2009, Judd Storrs wrote:

| What about a new type instead? In this form it seems like a small project I
| can work on by following diag as an example (I have limited time for this,
| but Robert Short's example is inspiring). I'm having trouble thinking of
| *the* awesome name for this and the name isn't really important for now, but
| what about something like adapt(), expand(), grow(), match(), merge() or
| join()? The synyax would look like this:
|
| X / adapt(sum(X,1))
| adapt(sum(X,2)) \ X
|
| Would something like this be acceptable or useful to anyone except me?

Instead of a new type, what is wrong with just defining a function to
do the special operation?

jwe
_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Judd Storrs


On Thu, Jun 25, 2009 at 10:55 AM, John W. Eaton <[hidden email]> wrote:
On 25-Jun-2009, Judd Storrs wrote:

| What about a new type instead? In this form it seems like a small project I
| can work on by following diag as an example (I have limited time for this,
| but Robert Short's example is inspiring). I'm having trouble thinking of
| *the* awesome name for this and the name isn't really important for now, but
| what about something like adapt(), expand(), grow(), match(), merge() or
| join()? The synyax would look like this:
|
| X / adapt(sum(X,1))
| adapt(sum(X,2)) \ X
|
| Would something like this be acceptable or useful to anyone except me?

Instead of a new type, what is wrong with just defining a function to
do the special operation?

Yes, you could write a bazillion specialty functions.

Maybe this is a difference in perspective based on the type of data I process. If you're thinking about things in terms of linear algebra and 2D matrix operations, diag() is the correct abstraction for scaling rows and columns. For matrix operations the advantages of diag() beyond performance are that it is mathematically expressive in the sense that a matrix expression maps pretty much verbatim into octave syntax. When you're working with 2D matrices you're generally thinking about matrix multiplication and matrix inversion and diag() fits right in. In these cases you end up with an octave expression that *looks* like the math.

Things are different when you approach working with 3D and 4D volumes, IMHO. Most of the stuff I do is thinking working on slices and planes. i.e. translating tensor type expressions:

X_ijk = (A_ij * B_k + C_ik) / (D_j + sum_all_j E_jk)

Of course the natural solution is to wrap this expression into for loops. As you know, this looks like the math but performs terribly:

X(i,j,k) = ( A(i,j) .* B(k) + C(i,k) ) .* ( D(j) / sum(E(:,k)) )

You can eat up memory and expand all of A,B,C, and D to full size. Then

sumE = sum(E,1)
X = (A.*B + C).*(D ./ sumE(k))

But this can eat memory very fast for large arrays. You can instead set up your dimensions correctly so that you can use bsxfun:

# A is Nx by Ny by 1
# B is 1 by 1 by Nz
# C is Nx by 1 by Nz
# D is 1 by Ny by 1
# E is 1 by Ny by Nz
X = bsxfun(@rdivide,bsxfun(@times,bsxfun(@plus,bsxfun(@times,A,B),C),D),sum(E,2))

But this is beginning to be unreadable and often there are faster ways. To get something that actually performs very well in octave, you end up doing all sorts of reshape, transpose, reshape, bsxfun, permute gymnastics and the resulting code doesn't really look like the math at all. You could put any solution in a function or write an .oct, and then use

X = compute_equation_42(A,B,C,D,E)

But that doesn't really preserve the mathematical expression either.

With the proposed type and the matrices dimensioned as for bxsfun the following could be expressive and efficient

X = ( adapt(A) .* adapt(B) + adapt(C) ) .* adapt(D) ./ adapt(sum(E,2))

Or (more likely) if A, B, C, and D were created as adapt type from the start then

X = (A .* B + C) .* ( D ./ sum(E,2) )

Also, I'm not a convinced as you are that modifying the element-wise builtins to have bsxfun behavior would be a problem. The only cases where the enhanced versions would succeed is when the dimensions of the arguements meshed in a bsxfun compatible way. It turns out that many octave/matlab functions already generate bsxfun compatible results (i.e. sum, mean, std, etc.).


--judd


_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|

Re: Vector approach to row margin frequencies

Jaroslav Hajek-2
On Thu, Jun 25, 2009 at 6:28 PM, Judd Storrs<[hidden email]> wrote:

>
>
> On Thu, Jun 25, 2009 at 10:55 AM, John W. Eaton <[hidden email]> wrote:
>>
>> On 25-Jun-2009, Judd Storrs wrote:
>>
>> | What about a new type instead? In this form it seems like a small
>> project I
>> | can work on by following diag as an example (I have limited time for
>> this,
>> | but Robert Short's example is inspiring). I'm having trouble thinking of
>> | *the* awesome name for this and the name isn't really important for now,
>> but
>> | what about something like adapt(), expand(), grow(), match(), merge() or
>> | join()? The synyax would look like this:
>> |
>> | X / adapt(sum(X,1))
>> | adapt(sum(X,2)) \ X
>> |
>> | Would something like this be acceptable or useful to anyone except me?
>>
>> Instead of a new type, what is wrong with just defining a function to
>> do the special operation?
>
> Yes, you could write a bazillion specialty functions.
>
> Maybe this is a difference in perspective based on the type of data I
> process. If you're thinking about things in terms of linear algebra and 2D
> matrix operations, diag() is the correct abstraction for scaling rows and
> columns. For matrix operations the advantages of diag() beyond performance
> are that it is mathematically expressive in the sense that a matrix
> expression maps pretty much verbatim into octave syntax. When you're working
> with 2D matrices you're generally thinking about matrix multiplication and
> matrix inversion and diag() fits right in. In these cases you end up with an
> octave expression that *looks* like the math.
>
> Things are different when you approach working with 3D and 4D volumes, IMHO.
> Most of the stuff I do is thinking working on slices and planes. i.e.
> translating tensor type expressions:
>
> X_ijk = (A_ij * B_k + C_ik) / (D_j + sum_all_j E_jk)
>
> Of course the natural solution is to wrap this expression into for loops. As
> you know, this looks like the math but performs terribly:
>
> X(i,j,k) = ( A(i,j) .* B(k) + C(i,k) ) .* ( D(j) / sum(E(:,k)) )
>
> You can eat up memory and expand all of A,B,C, and D to full size. Then
>
> sumE = sum(E,1)
> X = (A.*B + C).*(D ./ sumE(k))
>
> But this can eat memory very fast for large arrays. You can instead set up
> your dimensions correctly so that you can use bsxfun:
>
> # A is Nx by Ny by 1
> # B is 1 by 1 by Nz
> # C is Nx by 1 by Nz
> # D is 1 by Ny by 1
> # E is 1 by Ny by Nz
> X =
> bsxfun(@rdivide,bsxfun(@times,bsxfun(@plus,bsxfun(@times,A,B),C),D),sum(E,2))
>
> But this is beginning to be unreadable and often there are faster ways. To
> get something that actually performs very well in octave, you end up doing
> all sorts of reshape, transpose, reshape, bsxfun, permute gymnastics and the
> resulting code doesn't really look like the math at all. You could put any
> solution in a function or write an .oct, and then use
>
> X = compute_equation_42(A,B,C,D,E)
>
> But that doesn't really preserve the mathematical expression either.
>
> With the proposed type and the matrices dimensioned as for bxsfun the
> following could be expressive and efficient
>
> X = ( adapt(A) .* adapt(B) + adapt(C) ) .* adapt(D) ./ adapt(sum(E,2))
>
> Or (more likely) if A, B, C, and D were created as adapt type from the start
> then
>
> X = (A .* B + C) .* ( D ./ sum(E,2) )
>

This is an excellent example that illustrates well the downside of
your proposal. Even if .*, + and ./ had the bsxfun behaviour you (and
others) proposed, you'd still not get anything equivalent to the
equivalent loop code: you'd still get this:
X1 = A .* B;
X2 = X1 + C;
X3 = D ./ sum (E, 2);
X = X2 .* X3;

I.e., it will push you only a small bit in the direction you want to
go. I don't think it really pays off.



--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
_______________________________________________
Help-octave mailing list
[hidden email]
https://www-old.cae.wisc.edu/mailman/listinfo/help-octave