Working patch for FFTW 3.0.x and Nd FFT's

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Working patch for FFTW 3.0.x and Nd FFT's

David Bateman-3
Dear All,

Please find attached a working patch for FFTW 3.0.x support and Nd
FFTs.  This patch also shows a significant speed up from the previous
code, with the proviso that the canonical FFTW wisdom has been
generated and installed in a system dependent file (/etc/fftw/wisdom
on my machine).

The testing of the code was done in a hierarchical fashion. Firstly I
showed that the "fft", "fft2", "ifft" and "ifft2" commands returned the
same results as previously. This was done by running the supplied script
"testfft.m" on an unpatched octave or even matlab, and then running the
script "testfft2.m" on a patched version of octave. This also gave the
relative speed of the version of the code, which I supply in the file "log".

The final level of testing was to test the Nd array support. Firstly
"fft(x,[],n)" are tested by taking 2-D slices through the data and
using the already tested fft over 2d data sets to prove that these
work. "fft2(x)" was tested in the same manner. To test "fftn(x)", the
formula

b = a;
for i=1:length(size(x))
  b = fft(b,[],i);
endfor

is equivalent and as the Nd fft's has been tested to be correct, the test
of fftn is then also completed. A similar series of tests was performed
for the inverse transforms. The test script is "testfft3.m" and is also
attached. Tests were performed of a series of 3d data as well as a 4d
data set.

Some provisos for the patches use

* You must upgrade to FFTW 3.0.x due to an API change between version 2
  and 3. If you don't upgrade the slower FFTPACK version of the code
  (that have also been tested) will be used.
* Although not necessary, running
    mkdir /etc/fftw
    fftw-wisdom -v -c -o /etc/fftw/wisdom
  is highly recommended. This will take a long time (12 hours + on my machine)
* The new function "fft_wisdom" can be used to manipulate and create the
  wisdom necessary for Octave. This is also incredible slow due to its use
  of "fftw-wisdom" internally.

Given all of the testing I've given the code, I now consider this
patch ready for inclusion into 2.1.54....

Cheers
David

--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary

testfft.m (4K) Download Attachment
log (6K) Download Attachment
testfft2.m (6K) Download Attachment
testfft3.m (13K) Download Attachment
patch (85K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Working patch for FFTW 3.0.x and Nd FFT's

John W. Eaton-6
On 16-Feb-2004, David Bateman <[hidden email]> wrote:

| Given all of the testing I've given the code, I now consider this
| patch ready for inclusion into 2.1.54....

OK, I would like to apply the patch, but after a quick look at the
code, I have a couple of comments and questions.

In several places, you are using const_cast to cast away const.  Why
is this necessary?  Does the FFTW code really modify its argument, or
is it simply missing const qualifiers.  If the former, then I think we
need to find a way to avoid requiring the cast, otherwise this could
do the wrong thing when mixed with Octave's reference counting.  If it
is the latter, then I think we should ask the FFTW maintainers to add
the const qualifiers.

Thanks for being careful to include OCTAVE_QUIT in most loops so that
Octave can be interrupted when these calculations are happening.  I
did notice that the function convert_packcomplex_1d and
convert_packcomplex_Nd contain loops and do not use OCTAVE_QUIT.  Is
that intentional?  Is that code so simple that it can never run for a
long time, even if a user accidentally passes in a very large array?

jwe


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

David Bateman-3
According to John W. Eaton <[hidden email]> (on 02/16/04):

> On 16-Feb-2004, David Bateman <[hidden email]> wrote:
>
> | Given all of the testing I've given the code, I now consider this
> | patch ready for inclusion into 2.1.54....
>
> OK, I would like to apply the patch, but after a quick look at the
> code, I have a couple of comments and questions.
>
> In several places, you are using const_cast to cast away const.  Why
> is this necessary?  Does the FFTW code really modify its argument, or
> is it simply missing const qualifiers.  If the former, then I think we
> need to find a way to avoid requiring the cast, otherwise this could
> do the wrong thing when mixed with Octave's reference counting.  If it
> is the latter, then I think we should ask the FFTW maintainers to add
> the const qualifiers.

When FFTW3 creates plans with anything but the FFTW_ESTIMATE flag, then it
using the *in and *out arrays as testing space for the plans, effectively
destorying the data in them. However with FFTW_ESTIMATE doesn't touch them
except to check data alignment.

So const_cast is necessary to force FFTW to accept the argument. This will
only cause a problem we change from FFTW_ESTIMATE. I see no reason for this
as I've also supplied a function fft_wisdom to create wisdom seperately from
the fft itself.

> Thanks for being careful to include OCTAVE_QUIT in most loops so that
> Octave can be interrupted when these calculations are happening.  I
> did notice that the function convert_packcomplex_1d and
> convert_packcomplex_Nd contain loops and do not use OCTAVE_QUIT.  Is
> that intentional?  Is that code so simple that it can never run for a
> long time, even if a user accidentally passes in a very large array?

That is more of an oversight. Feel free to add the OCTAVE_QUITs there

Cheers
David

--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

Daniel Sebald
In reply to this post by David Bateman-3


David Bateman wrote:

>According to Daniel J Sebald <[hidden email]> (on 02/16/04):
>  
>
>>Any ramifications of fft_wisdom not being correct on server environments?
>>
>>Do the fft_wisdom functions remove the existing wisdom files at the very
>>end of the trial and error process?  Otherwise mistakingly running the
>>command could lead to anguish.
>>    
>>
>
>This is a difficult issue. The way I treated it was
>
>1) The system wisdom file (usually in /etc/fftw/wisdom) is imported at
>   startup. Hopefully, the user won't install an incorrect wisdom file
>   here.
>2) The fft_wisdom command can import or export wisdom files. These can
>   be used to override the system wisdom if needed
>3) There is a flag to 'fft_wisdom' that forces it to overwrite a wisdom
>   file even if it exists.
>4) Furthemore, if 'fft_wisdom([257 257])' for example is called, the process
>   is to create an octave temporary file in $(TMPDIR)/oct-*. Invoke with
>   'system' the command 'fftw-wisdom -n -o $(TMPDIR)/oct-*'. The '-n' forces
>   the process to not use the system wisdom, so that bad wisdom can also
>   be overridden in this manner. The temporrary file is deleted when octave
>   exits and so won't be reused.
>
>There are still ways the user can import bad wisdom, for instance calling
>fft_wisdom('file') thinking that 'file' doesn't exist, when it does and
>contains wisdom for anotehr platform. Do you think it is worth splitting
>fft_wisdom into three commands? fft_save_wisdom, fft_load_wisdom and
>fft_create_wisdom?
>

No, probably not.  It is common for the meaning of an Octave function to
be dependent upon the type of variable passed into it, or not passed in.

fft_wisdom('mywisdom.wis');
would set up Octave to use a wisdom file other than the default, fine.

fft_wisdom();
fft_wisdom()  -- prints the contents of the wisdom file?
would run wisdom and overwrite the default file.  Maybe this is a bit dodgy.

save 'mywisdom.wis' fft_wisdom();
I'm not sure about the saving portion of it.

Dan



Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

John W. Eaton-6
In reply to this post by David Bateman-3
On 17-Feb-2004, David Bateman <[hidden email]> wrote:

| According to John W. Eaton <[hidden email]> (on 02/16/04):
|
| > Thanks for being careful to include OCTAVE_QUIT in most loops so that
| > Octave can be interrupted when these calculations are happening.  I
| > did notice that the function convert_packcomplex_1d and
| > convert_packcomplex_Nd contain loops and do not use OCTAVE_QUIT.  Is
| > that intentional?  Is that code so simple that it can never run for a
| > long time, even if a user accidentally passes in a very large array?
|
| That is more of an oversight. Feel free to add the OCTAVE_QUITs there

I'm not sure of the best locations for them since I don't know the
typical sizes of the data.  Should they always go inside the innermost
loops?

jwe


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

David Bateman-3
According to John W. Eaton <[hidden email]> (on 02/17/04):
> I'm not sure of the best locations for them since I don't know the
> typical sizes of the data.  Should they always go inside the innermost
> loops?

I'd rather they not go in the inner most loop. But the problem is what
happens in a case like "a = rand(2^20,1); fft(a)" which will call
convert_packcomplex_1d and will only call the outer loop once, but
the inner one "2^20 / 2 - 1" times. So, now that I think about it, I
really don't know either. It depends really on what data is handed to
them.

The fact is that these two routines occupy a minor proportion of the
time relative to the FFTW code itself. So a compromise might be to
just have a single call to OCTAVE_QUIT at the top of
convert_packcomplex_1d and convert_packcomplex_Nd.

Regards
David

--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

John W. Eaton-6
On 17-Feb-2004, David Bateman <[hidden email]> wrote:

| According to John W. Eaton <[hidden email]> (on 02/17/04):
| > I'm not sure of the best locations for them since I don't know the
| > typical sizes of the data.  Should they always go inside the innermost
| > loops?
|
| I'd rather they not go in the inner most loop. But the problem is what
| happens in a case like "a = rand(2^20,1); fft(a)" which will call
| convert_packcomplex_1d and will only call the outer loop once, but
| the inner one "2^20 / 2 - 1" times. So, now that I think about it, I
| really don't know either. It depends really on what data is handed to
| them.
|
| The fact is that these two routines occupy a minor proportion of the
| time relative to the FFTW code itself. So a compromise might be to
| just have a single call to OCTAVE_QUIT at the top of
| convert_packcomplex_1d and convert_packcomplex_Nd.

OK, I added calls to OCTAVE_QUIT at the beginning and end of each of
these functions and in the Nd code, I also added calls to OCTAVE_QUIT
in between each set of loops.  Perhaps that will be enough to keep
Octave from being too sluggish to respond if an interrupt happens to
occur while one of these functions is running.

Thanks,

jwe


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

Paul Kienzle

On Feb 17, 2004, at 5:00 PM, John W. Eaton wrote:

> On 17-Feb-2004, David Bateman <[hidden email]> wrote:
>
> | According to John W. Eaton <[hidden email]> (on 02/17/04):
> | > I'm not sure of the best locations for them since I don't know the
> | > typical sizes of the data.  Should they always go inside the
> innermost
> | > loops?
> |
> | I'd rather they not go in the inner most loop. But the problem is
> what
> | happens in a case like "a = rand(2^20,1); fft(a)" which will call
> | convert_packcomplex_1d and will only call the outer loop once, but
> | the inner one "2^20 / 2 - 1" times. So, now that I think about it, I
> | really don't know either. It depends really on what data is handed to
> | them.
> |
> | The fact is that these two routines occupy a minor proportion of the
> | time relative to the FFTW code itself. So a compromise might be to
> | just have a single call to OCTAVE_QUIT at the top of
> | convert_packcomplex_1d and convert_packcomplex_Nd.
>
> OK, I added calls to OCTAVE_QUIT at the beginning and end of each of
> these functions and in the Nd code, I also added calls to OCTAVE_QUIT
> in between each set of loops.  Perhaps that will be enough to keep
> Octave from being too sluggish to respond if an interrupt happens to
> occur while one of these functions is running.

Is it possible to fall back to a long jump for the impatient?
That is, if you press Ctrl-C twice,  can you trigger a long
jump directly from the signal?

The main reason I see why not to do it this way is
because it may leak memory, or otherwise fail to
free resources.

This would be a problem in the last case I encountered
it, which was full(sparse(5000,5000,1)), which failed
to return immediately when Ctrl-C was pressed.

It would be nice to convert Ctrl-C into a C++ exception
directly.  IIRC, the ada runtime is supposed to be able
to convert signals into exceptions, but I was never able
to find it in the gnu ada code.  Maybe they don't.

The alternative approach is radical: garbage collection.
This is not something any of us have time to consider
in a pre-2.2 release.

Paul Kienzle
[hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

David Bateman-3
In reply to this post by John W. Eaton-6
According to John W. Eaton <[hidden email]> (on 02/17/04):
> OK, I added calls to OCTAVE_QUIT at the beginning and end of each of
> these functions and in the Nd code, I also added calls to OCTAVE_QUIT
> in between each set of loops.  Perhaps that will be enough to keep
> Octave from being too sluggish to respond if an interrupt happens to
> occur while one of these functions is running.

While the code is in FFTW its not going to respond in any case...

D.

--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

David Bateman-3
In reply to this post by Daniel Sebald
According to Daniel J Sebald <[hidden email]> (on 02/17/04):
> It is common for the meaning of an Octave function to
> be dependent upon the type of variable passed into it, or not passed in.
>
> fft_wisdom('mywisdom.wis');
> would set up Octave to use a wisdom file other than the default, fine.


At the moment this call loads/saves the wisdom dependent on the existence
of the file 'mywisdom.wis', which as you stated, potentially allows the
user to fall into the trap of using an incorrect wisdom file.

> fft_wisdom();
> fft_wisdom()  -- prints the contents of the wisdom file?
> would run wisdom and overwrite the default file.  Maybe this is a bit dodgy.

I don't see the point of printing out the wisdom. Stuff like,

(fftw-3.0.1 fftw_wisdom
  (fftw_codelet_t2_8 1 #xc050 #x76a287c1 #x6467f8c0 #xbc5b4077 #x6b73469d)
  (fftw_dft_vrank_geq1_register 0 #xc050 #x27f03af7 #xbb4692da #x6548866d #xcee8
b671)
  (fftw_codelet_n1_8 0 #xc050 #x2b3c6956 #xa6db414f #xd90f414e #x335c0c22)
  (fftw_codelet_hc2r_8 1 #xc050 #x52f0d2c4 #xc3a97c59 #xf5c2f3ed #xa4aa2a33)
  (fftw_codelet_q1_4 0 #xc050 #xc1cdcb4f #x1970fc77 #x4b55d74c #x3ac37e7b)
  (fftw_codelet_q1_8 0 #xc050 #x228110fa #x9f6c00de #x83523c60 #xcfa30b3f)
  (fftw_codelet_m1_32 0 #xc050 #x6f893bf5 #xd45f2cbd #xa050da62 #xf302784b)
)

doesn't really tell the user much. Maybe it could print out the FFTs for
which wisdom already exists, but except for writing a parser for the wisdom
files themselves without Octave, I don't quite see how to access this
information.

As for making 'fftw_wisdom()' call the canonical wisdom command to generate
the system wisdom, this is a really really bad idea. It took 12 hours to
do this on my machine, and in that time you wouldn't be able to interrupt
the process.

> save 'mywisdom.wis' fft_wisdom();
> I'm not sure about the saving portion of it.

This might be possible, if we created a class to hold the wisdom info and
then did "wisdom = fftw_wisdom(); save -ascii 'mywisdom.wis' wisdom",
where fftw_wisdom would be of the octave_wisdom class. This is a lot of
extra work for no gain in functionality, since we can easily have a
seperate function to load/save the FFTW wisdom. Also the wisdom would be
stored twice in this case; once in the octave class octave_wisdom, and
again within FFTW itself.

Cheers
David


--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

Daniel Sebald


David Bateman wrote:

>As for making 'fftw_wisdom()' call the canonical wisdom command to generate
>the system wisdom, this is a really really bad idea. It took 12 hours to
>do this on my machine, and in that time you wouldn't be able to interrupt
>the process.
>

That was sort of my original concern, about the user getting stuck in
some situation he or she doesn't know what is going on.  (I'm wondering
what could possibly be taking 12 hours, but that's a different matter.)

>>save 'mywisdom.wis' fft_wisdom();
>>I'm not sure about the saving portion of it.
>>    
>>
>
>This might be possible, if we created a class to hold the wisdom info and
>then did "wisdom = fftw_wisdom(); save -ascii 'mywisdom.wis' wisdom",
>where fftw_wisdom would be of the octave_wisdom class. This is a lot of
>extra work for no gain in functionality, since we can easily have a
>seperate function to load/save the FFTW wisdom. Also the wisdom would be
>stored twice in this case; once in the octave class octave_wisdom, and
>again within FFTW itself.
>

Why not the name fft_wisdom as opposed to fftw_wisdom?  I realize that
FFTW is being used, but if later some other FFT package is used then
fft_wisdom could still apply.  (I may have come in late on this discussion.)

Dan



Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

David Bateman-3
According to Daniel J Sebald <[hidden email]> (on 02/18/04):
>
> That was sort of my original concern, about the user getting stuck in
> some situation he or she doesn't know what is going on.  (I'm wondering
> what could possibly be taking 12 hours, but that's a different matter.)

FFTW tests all possible ways it knows of to do a particular FFT and this
stores the information on the best one for the current architecture.
With the command "fftw-wisdom -c -v -o /etc/fftw/wisdom" it tests about
8 different ways FFTW might be called to do a particular FFT for the
following FFT sizes


     "1", "2", "4", "8", "16", "32", "64", "128", "256", "512", "1024",
     "2048", "4096", "8192", "16384", "32768", "65536", "131072",
     "262144", "524288", "1048576",

     "10", "100", "1000", "10000", "100000", "1000000",

     "2x2", "4x4", "8x8", "10x10", "16x16", "32x32", "64x64", "100x100",
     "128x128", "256x256", "512x512", "1000x1000", "1024x1024",

     "2x2x2", "4x4x4", "8x8x8", "10x10x10", "16x16x16", "32x32x32",
     "64x64x64", "100x100x100"

So its a lot of work.

> Why not the name fft_wisdom as opposed to fftw_wisdom?  I realize that
> FFTW is being used, but if later some other FFT package is used then
> fft_wisdom could still apply.  (I may have come in late on this discussion.)

My original patch had fft_wisdom, John preferred fftw_wisdom. He has CVS
write access, so he has the last say :-)

D.

--
David Bateman                                [hidden email]
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

Dmitri A. Sergatskov-2
David Bateman wrote:

>
>>Why not the name fft_wisdom as opposed to fftw_wisdom?  I realize that
>>FFTW is being used, but if later some other FFT package is used then
>>fft_wisdom could still apply.  (I may have come in late on this discussion.)
>
>
> My original patch had fft_wisdom, John preferred fftw_wisdom. He has CVS
> write access, so he has the last say :-)

I do not know what was John reasonings, but since 'wisdom' is the FFTW3.x peculiarity
it make sense (to me) to keep the reference in the name.

>
> D.
>

Dmitri.


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

John W. Eaton-6
On 18-Feb-2004, Dmitri A. Sergatskov <[hidden email]> wrote:

| I do not know what was John reasonings, but since 'wisdom' is the
| FFTW3.x peculiarity it make sense (to me) to keep the reference in
| the name.

That was my thinking, that it was specific to fftw.

jwe


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

Daniel Sebald
In reply to this post by David Bateman-3


David Bateman wrote:

>According to Daniel J Sebald <[hidden email]> (on 02/18/04):
>  
>
>>That was sort of my original concern, about the user getting stuck in
>>some situation he or she doesn't know what is going on.  (I'm wondering
>>what could possibly be taking 12 hours, but that's a different matter.)
>>    
>>
>
>FFTW tests all possible ways it knows of to do a particular FFT and this
>stores the information on the best one for the current architecture.
>With the command "fftw-wisdom -c -v -o /etc/fftw/wisdom" it tests about
>8 different ways FFTW might be called to do a particular FFT for the
>following FFT sizes
>
>
>     "1", "2", "4", "8", "16", "32", "64", "128", "256", "512", "1024",
>     "2048", "4096", "8192", "16384", "32768", "65536", "131072",
>     "262144", "524288", "1048576",
>
>     "10", "100", "1000", "10000", "100000", "1000000",
>
>     "2x2", "4x4", "8x8", "10x10", "16x16", "32x32", "64x64", "100x100",
>     "128x128", "256x256", "512x512", "1000x1000", "1024x1024",
>
>     "2x2x2", "4x4x4", "8x8x8", "10x10x10", "16x16x16", "32x32x32",
>     "64x64x64", "100x100x100"
>
>So its a lot of work.
>

That is what I suspected.  The following points aren't critically
important:  I'm guessing that with the typical nonlinear dependence of
computations based on FFT length, it is those very long FFT tests that
are taking the longest time.  (I also question the utility of a million
point FFT, but let's say someone has a valid scenario for requiring such
resolution.)  One would think that some logic could be applied to the
trends of algorithm performance for the shorter FFTs to deduce what
would be the most efficient algorithm for the very long FFTs.  Anyway,
one often gets a feeling of the kind of information that is trying to be
accumulated versus the amount of computer power that should be required,
and somehow 12 hours seems like a long time to build a "wisdom table".
 This isn't Octave's concern though, I guess.

>>Why not the name fft_wisdom as opposed to fftw_wisdom?  I realize that
>>FFTW is being used, but if later some other FFT package is used then
>>fft_wisdom could still apply.  (I may have come in late on this discussion.)
>>    
>>
>
>My original patch had fft_wisdom, John preferred fftw_wisdom. He has CVS
>write access, so he has the last say :-)
>
>D.
>

:-)

D.



Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

John W. Eaton-6
In reply to this post by Paul Kienzle
On 17-Feb-2004, Paul Kienzle <[hidden email]> wrote:

| Is it possible to fall back to a long jump for the impatient?
| That is, if you press Ctrl-C twice,  can you trigger a long
| jump directly from the signal?
|
| The main reason I see why not to do it this way is
| because it may leak memory, or otherwise fail to
| free resources.
|
| This would be a problem in the last case I encountered
| it, which was full(sparse(5000,5000,1)), which failed
| to return immediately when Ctrl-C was pressed.

I think the only safe thing to do in this case is to offer to abort,
optionally attempting to save the top-level workspace.  What about
something like the following patch?  With it, I see this behavior:

  octave:1> x = rand (10000000, 1);
  octave:2> fft (x);                  ## C-c four times while this runs
  abort [y/N]? y
  save top-level workspace [y/N]? y
  panic: Interrupt -- stopping myself...
  attempting to save variables to `octave-core'...
  save to `octave-core' complete

Unfortunately, there is no warning about the size of the workspace.
In this case, it turned into a 189MB file (text format -- maybe that
should also be changed?), which took 10-15 seconds to save to a
relatviely fast local disk.

| It would be nice to convert Ctrl-C into a C++ exception
| directly.  IIRC, the ada runtime is supposed to be able
| to convert signals into exceptions, but I was never able
| to find it in the gnu ada code.  Maybe they don't.

I'm not sure, but I think the only safe way to do that is to throw the
C++ exception outside the signal handler.

| The alternative approach is radical: garbage collection.
| This is not something any of us have time to consider
| in a pre-2.2 release.

Garbage collection could solve the memory allocation problem, but I
don't think it solves the general resource problem.  What happens to
a local fstream object created in some function if you just jump back
to the top-level from the a signal handler?  Do you have to register
each of those with something like an unwind-protect handler so that it
will be properly closed, or can you somehow make garbage collection
also clean up those kinds of temporary C++ objects?  The nice thing
about the way that we are doing things now is that as long as
resources are acquired in constructors and released in destructors,
they are cleaned up when an exception is thrown.

jwe


src/ChangeLog:

2004-02-18  John W. Eaton  <[hidden email]>

        * sighandlers.cc (my_friendly_exit): New optional arg, save_vars.
        Only call save_user_variables if save_vars is true.
        (sigint_handler): If interactive, offer to abort and save
        workspace after three consecutive interrupts.
        (sigint_handler, sigpipe_handler, sigfpe_handler):
        Increment octave_interrupt_handler instead of setting it to 1.

 
Index: src/sighandlers.cc
===================================================================
RCS file: /usr/local/cvsroot/octave/src/sighandlers.cc,v
retrieving revision 1.78
diff -u -r1.78 sighandlers.cc
--- src/sighandlers.cc 16 Dec 2003 05:11:26 -0000 1.78
+++ src/sighandlers.cc 18 Feb 2004 20:44:25 -0000
@@ -46,6 +46,7 @@
 #include "pager.h"
 #include "pt-bp.h"
 #include "sighandlers.h"
+#include "sysdep.h"
 #include "syswait.h"
 #include "toplev.h"
 #include "utils.h"
@@ -93,7 +94,8 @@
 #endif
 
 static void
-my_friendly_exit (const char *sig_name, int sig_number)
+my_friendly_exit (const char *sig_name, int sig_number,
+  bool save_vars = true)
 {
   static bool been_there_done_that = false;
 
@@ -112,7 +114,8 @@
 
       std::cerr << "panic: " << sig_name << " -- stopping myself...\n";
 
-      save_user_variables ();
+      if (save_vars)
+ save_user_variables ();
 
       if (sig_number < 0)
  exit (1);
@@ -235,7 +238,7 @@
   // here?
 
   if (can_interrupt)
-    octave_interrupt_state = 1;
+    octave_interrupt_state++;
 
   SIGHANDLER_RETURN (0);
 }
@@ -303,11 +306,7 @@
 // for SIGINT only.
 
 static RETSIGTYPE
-#if defined (ACK_USES_SIG) || defined (REINSTALL_USES_SIG)
 sigint_handler (int sig)
-#else
-sigint_handler (int)
-#endif
 {
   MAYBE_ACK_SIGNAL (sig);
 
@@ -334,7 +333,47 @@
       if (octave_interrupt_immediately)
  octave_jump_to_enclosing_context ();
       else
- octave_interrupt_state = 1;
+ {
+  octave_interrupt_state++;
+
+  if (interactive)
+    {
+      if (octave_interrupt_state > 3)
+ {
+  // XXX FIXME XXX -- might want to attempt to flush
+  // any pending input first...
+
+  std::cerr << "abort [y/N]? ";
+
+  int c = octave_kbhit ();
+
+  std::cerr << static_cast<char> (c) << std::endl;
+
+  if (c == 'y' || c == 'Y')
+    {
+      std::cerr << "save top-level workspace [y/N]? ";
+
+      c = octave_kbhit ();
+
+      std::cerr << static_cast<char> (c) << std::endl;
+
+      my_friendly_exit (sys_siglist[sig], sig,
+ (c == 'y' || c == 'Y'));
+    }
+  else
+    {
+      // We will still eventually interrupt and jump to
+      // the top level even if no additional interrupts
+      // happen, but we will have to wait until it is
+      // safe to do so.  It will take 3 more
+      // consecutive interrupts before we offer to
+      // abort again.
+
+      octave_interrupt_state = 1;
+    }
+ }
+    }
+ }
     }
 
   SIGHANDLER_RETURN (0);
@@ -357,7 +396,7 @@
   // here?
 
   if (pipe_handler_error_count  > 100)
-    octave_interrupt_state = 1;
+    octave_interrupt_state++;
 
   SIGHANDLER_RETURN (0);
 }


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

Dmitri A. Sergatskov-2
John W. Eaton wrote:
...
>
> Unfortunately, there is no warning about the size of the workspace.
> In this case, it turned into a 189MB file (text format -- maybe that
> should also be changed?), which took 10-15 seconds to save to a
> relatviely fast local disk.
>
...

Being bitten by ~1Gb octave-core few times myself, could we have some
equivalent of "ulimit -c" which limits size of octave_core?
May be just extend crash_dumps_octave_core to indicate
size in MB (this way current default "1" will not be completely
meaningless)?

Changing default format to octave-binary make sense IMHO.
Problem with binary formats is that they are not as robust
as text files, but I do not think those core files that
valuable to worry that much about them.

Sincerely,

Dmitri.


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

John W. Eaton-6
On 18-Feb-2004, Dmitri A. Sergatskov <[hidden email]> wrote:

| Being bitten by ~1Gb octave-core few times myself, could we have some
| equivalent of "ulimit -c" which limits size of octave_core?

Saving an incomplete (truncated) file would not be very useful.
Should Octave try to save as much data as possible by sorting the
variables by total size and then skipping the largest (if any) until
the total save size is less than the limit?

Currently we don't have a way to get the size (in bytes) of a
variable, but there were some changes posted recently for modifying
the output of the whos command that included a patch for that, I
think.

| May be just extend crash_dumps_octave_core to indicate
| size in MB (this way current default "1" will not be completely
| meaningless)?

If we decide to go the ulimit route, then I think this might be a
reasonable place to specify the limit.

| Changing default format to octave-binary make sense IMHO.
| Problem with binary formats is that they are not as robust
| as text files, but I do not think those core files that
| valuable to worry that much about them.

OK, with the following patch, saving the 10 million element vector
results in an 80MB file that can be saved to my local disk in a couple
of seconds.

Note that I introduced a new variable octave_core_format that allows
users to select their favorite format for this purpose.

jwe

src/ChangeLog:

2004-02-18  John W. Eaton  <[hidden email]>

        * load-save.cc (Voctave_core_format): New static_variable.
        (octave_core_format): New function.
        (symbols_of_load_save): Add DEFVAR for octave_core_format.
        (get_save_format): Rename from get_default_savae_format.
        Pass name of format as arg.  Change all uses.
        (save_user_variables): Use pass Voctave_core_format to
        get_save_format here.


Index: src/load-save.cc
===================================================================
RCS file: /usr/local/cvsroot/octave/src/load-save.cc,v
retrieving revision 1.186
diff -u -r1.186 load-save.cc
--- src/load-save.cc 23 Jan 2004 20:04:36 -0000 1.186
+++ src/load-save.cc 18 Feb 2004 22:26:43 -0000
@@ -87,6 +87,10 @@
 // "mat-binary", or "hdf5".
 static std::string Vdefault_save_format;
 
+// The output format for octave-core files.  May be one of "binary",
+// "text", "mat-binary", or "hdf5".
+static std::string Voctave_core_format;
+
 // The format string for the comment line at the top of text-format
 // save files.  Passed to strftime.  Should begin with `#' and contain
 // no newline characters.
@@ -910,12 +914,10 @@
 }
 
 static load_save_format
-get_default_save_format (void)
+get_save_format (const std::string& fmt)
 {
   load_save_format retval = LS_ASCII;
 
-  std::string fmt = Vdefault_save_format;
-
   if (fmt == "binary")
     retval = LS_BINARY;
   else if (fmt == "mat-binary" || fmt =="mat_binary")
@@ -1046,7 +1048,7 @@
 
       message (0, "attempting to save variables to `%s'...", fname);
 
-      load_save_format format = get_default_save_format ();
+      load_save_format format = get_save_format (Voctave_core_format);
 
       std::ios::openmode mode = std::ios::out|std::ios::trunc;
       if (format == LS_BINARY ||
@@ -1186,7 +1188,7 @@
 
   bool save_as_floats = false;
 
-  load_save_format format = get_default_save_format ();
+  load_save_format format = get_save_format (Vdefault_save_format);
 
   bool append = false;
 
@@ -1369,6 +1371,24 @@
   return status;
 }
 
+static int
+octave_core_format (void)
+{
+  int status = 0;
+
+  std::string s = builtin_string_variable ("octave_core_format");
+
+  if (s.empty ())
+    {
+      gripe_invalid_value_specified ("octave_core_format");
+      status = -1;
+    }
+  else
+    Voctave_core_format = s;
+
+  return status;
+}
+
 static std::string
 default_save_header_format (void)
 {
@@ -1417,6 +1437,18 @@
 It should have one of the following values: @code{\"ascii\"},\n\
 @code{\"binary\"}, @code{float-binary}, or @code{\"mat-binary\"}.  The\n\
 initial default save format is Octave's text format.\n\
+@seealso{octave_core_format}\n\
+@end defvr");
+
+  DEFVAR (octave_core_format, "binary", octave_core_format,
+    "-*- texinfo -*-\n\
+@defvr {Built-in Variable} octave_core_format\n\
+If Octave aborts, it attempts to save the contents of the top-level\n\
+workspace in a file using this format.  The value of\n\
+@code{octave_core_format} should have one of the following values:\n\
+@code{\"ascii\"}, @code{\"binary\"}, @code{float-binary}, or\n\
+@code{\"mat-binary\"}.  The default value is Octave's binary format.\n\
+@seealso{default_save_format}\n\
 @end defvr");
 
   DEFVAR (save_header_format_string, default_save_header_format (),


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

Dmitri A. Sergatskov-2
John W. Eaton wrote:

> On 18-Feb-2004, Dmitri A. Sergatskov <[hidden email]> wrote:
>
> | Being bitten by ~1Gb octave-core few times myself, could we have some
> | equivalent of "ulimit -c" which limits size of octave_core?
>
> Saving an incomplete (truncated) file would not be very useful.
> Should Octave try to save as much data as possible by sorting the
> variables by total size and then skipping the largest (if any) until
> the total save size is less than the limit?
>

...

I guess that would be ideal. In case it has a significant
overhead another option perhaps is to
save variables in chronological order -- so the newest variables
will be most likely lost. But that is what we would want
anyway -- we want to save older work and since the newest
vars either cause the crash or occur during the crash so
they have low fidelity anyway.

But even a truncated core would be useful. E.g., currently if I mistyped
rand(1000000,1) as rand(1000000)) octave will exhaust all memory and
die dumping 4Gig into core-file some 20 minutes later.
Since I do not have patience to wait those 20 minutes, I go and
kill -9 octave process ending up with truncated core.
Having ulimit would accomplish that with higher efficiency :).

I did few simple tests and it seems that octave can partially recover variables
from trancated either ascii or octave-binary files, and variables seems to
appear in chronological order automatically...

Sincerely,

Dmitri.


Reply | Threaded
Open this post in threaded view
|

Re: Working patch for FFTW 3.0.x and Nd FFT's

Paul Kienzle
In reply to this post by John W. Eaton-6
When I was reading up on it before, I understood that
you should not be doing blocking operations in a
signal handler.   At the very least, you should be
minimizing them.

How about:

        if (octave_interrupt_state == 2 && interactive) {
                std::cerr << "Press Ctrl-C again to abort" << std::endl;
        }
        if (octave_interrupt_state >= 3) {
                my_friendly_exit(sys_siglist[sig],sig,true);
        }

With ulimit, I don't see any reason not to save the workspace.

I would still be tempted to do something less drastic,
such as put a set jump before each binary function and
warn the user that the system may have leaked resources
and may be unstable.

Paul Kienzle
[hidden email]

On Feb 18, 2004, at 3:50 PM, John W. Eaton wrote:

>
>  static RETSIGTYPE
> -#if defined (ACK_USES_SIG) || defined (REINSTALL_USES_SIG)
>  sigint_handler (int sig)
> -#else
> -sigint_handler (int)
> -#endif
>  {
>    MAYBE_ACK_SIGNAL (sig);
>
> @@ -334,7 +333,47 @@
>        if (octave_interrupt_immediately)
>   octave_jump_to_enclosing_context ();
>        else
> - octave_interrupt_state = 1;
> + {
> +  octave_interrupt_state++;
> +
> +  if (interactive)
> +    {
> +      if (octave_interrupt_state > 3)
> + {
> +  // XXX FIXME XXX -- might want to attempt to flush
> +  // any pending input first...
> +
> +  std::cerr << "abort [y/N]? ";
> +
> +  int c = octave_kbhit ();
> +
> +  std::cerr << static_cast<char> (c) << std::endl;
> +
> +  if (c == 'y' || c == 'Y')
> +    {
> +      std::cerr << "save top-level workspace [y/N]? ";
> +
> +      c = octave_kbhit ();
> +
> +      std::cerr << static_cast<char> (c) << std::endl;
> +
> +      my_friendly_exit (sys_siglist[sig], sig,
> + (c == 'y' || c == 'Y'));
> +    }
> +  else
> +    {
> +      // We will still eventually interrupt and jump to
> +      // the top level even if no additional interrupts
> +      // happen, but we will have to wait until it is
> +      // safe to do so.  It will take 3 more
> +      // consecutive interrupts before we offer to
> +      // abort again.
> +
> +      octave_interrupt_state = 1;
> +    }
> + }
> +    }
> + }
>      }


12