__run_test_suite__ results in segfault on Windows systems

John W. Eaton
Administrator
Can anyone else reproduce this problem?

Build and install a copy of Octave for Windows using recent stable
sources and then execute __run_test_suite__ at the command prompt.  Does
it crash Octave?  It does for me, and this is, I think, relatively new
behavior.

First, I noticed that the GUI disappeared while running the tests.

Next, I modified octave.vbs to run Octave with gdb.  That worked but
didn't reveal much useful info because everything in the installer is
stripped.

Then I replaced the liboctave, libinterp, and libgui DLLs with ones that
are not stripped and tried again.  After running the tests again, I end
up at the gdb prompt.  A stack trace shows the following:

Thread 21 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 680.0xad0]
0x00000000010490aa in graphics_toolkit::initialize (go=..., this=0x4fb36858)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/corefcn/graphics-toolkit.h:210
210     /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/corefcn/graphics-toolkit.h: No such file or directory.
(gdb) where
#0  0x00000000010490aa in graphics_toolkit::initialize (go=...,
     this=0x4fb36858)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/corefcn/graphics-toolkit.h:210
#1  base_graphics_object::initialize (this=0x58a161f0, go=...)
     at libinterp/corefcn/graphics.h:2866
#2  0x0000000000ce4401 in graphics_object::initialize (this=0x4fb368d0)
     at libinterp/corefcn/graphics.h:3093
#3  xinitialize (h=...)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/corefcn/graphics.cc:3055
#4  0x0000000000d7bd0f in F__go_figure__ (args=...)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/corefcn/graphics.cc:12656
#5  0x0000000000a3856f in octave_builtin::call (this=0x50a7a2b0, tw=...,
     nargout=1, args=...)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/octave-value/ov-builtin.cc:65
#6  0x0000000000b47d0a in octave::tree_evaluator::visit_index_expression (
     this=0x4e59f4c8, idx_expr=...)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/parse-tree/pt-eval.cc:2008
#7  0x0000000001093146 in octave::tree_evaluator::evaluate (this=0x4e59f4c8,
     expr=0x4da888d0, nargout=<optimized out>)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/parse-tree/pt-eval.h:312
#8  0x0000000000b49332 in octave::tree_evaluator::visit_simple_assignment (
     this=0x4e59f4c8, expr=...)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/parse-tree/pt-eval.cc:2681
#9  0x0000000001093146 in octave::tree_evaluator::evaluate (this=0x4e59f4c8,
     expr=0x522f01a0, nargout=<optimized out>)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/parse-tree/pt-eval.h:312
#10 0x0000000000b44960 in octave::tree_evaluator::visit_statement (
     this=0x4e59f4c8, stmt=...)
     at /scratch/build/mxe-octave-w64-32-stable/tmp-stable-octave/octave-5.0.1/libinterp/parse-tree/pt-eval.cc:2776

At the point of the crash, I see the following for the graphics_toolkit
object:

(gdb) p *this
$1 = {rep = 0x4deac8e0}
(gdb) p *this.rep
$2 = {_vptr.base_graphics_toolkit = 0x6f9900d0, name = {
     static npos = 18446744073709551615,
     _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x4deac8f8 "qt"},
     _M_string_length = 2, {_M_local_buf = "qt\000º\rd-º\rd-º\rd-º",
       _M_allocated_capacity = 13451671603771372657}}, count = {count = 3}}

So I'm not sure why the crash is happening.  The toolkit object and rep
pointer seem to be valid.  Maybe this is a threading problem?  In any
case, I'm not sure how to debug it.

This failure happens when running the tests in the bug-55308.tst file.
Unfortunately, it does not happen if I start Octave and run only this
test.  It seems to be dependent on some other state that I'm currently
only able to reproduce by running all the previous tests.  Needless to
say, this is a slow process.

Since __run_test_suite__ worked without crashing for Octave 4.4.1, I
suppose it will be possible to bisect, but that's likely to be quite a
slow process.

Clearly, we need to have buildbot running the tests for Windows systems.
In the past, I've thought about doing that by having buildbot start a
VM running Windows, then copy the installer to the VM, execute it to
install Octave, and then execute the tests and return the log to the
master buildbot.  But I haven't actually done this yet.  Does anyone
have experience with doing something similar?

Comments?  Suggestions?

Thanks,

jwe
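
The VM workflow proposed above (copy the installer to the VM, install Octave, run the tests, return the log) could be sketched roughly as follows. This is only an illustration, not an existing buildbot step: the host name, the installer file name, the `/S` silent-install switch, the `octave-cli` command being on the remote PATH, and the `fntests.log` log name are all assumptions that would need checking against the real VM and installer.

```python
import subprocess
from pathlib import PurePosixPath


def plan_windows_test_run(host, installer, remote_log="fntests.log"):
    """Return the ordered argv lists for one test run on a Windows VM.

    All remote command strings are placeholders (assumptions); they
    depend on the shell the VM's SSH server provides.
    """
    installer_name = PurePosixPath(installer).name
    return [
        # 1. Copy the installer to the VM (into the login directory).
        ["scp", installer, f"{host}:{installer_name}"],
        # 2. Run the installer; the silent switch "/S" is an assumption.
        ["ssh", host, f"{installer_name} /S"],
        # 3. Run the test suite non-interactively.
        ["ssh", host, 'octave-cli --eval "__run_test_suite__"'],
        # 4. Copy the test log back (log name and location are assumptions).
        ["scp", f"{host}:{remote_log}", PurePosixPath(remote_log).name],
    ]


def run_plan(plan, runner=subprocess.run):
    """Run every step and return (argv, returncode) pairs.

    Later steps still run after a failure, so the log is fetched even
    when the test step crashes with a non-zero exit status.
    """
    return [(argv, runner(argv).returncode) for argv in plan]
```

Calling `run_plan(plan_windows_test_run("tester@winvm", "octave-installer.exe"))` (both arguments hypothetical) would execute the four steps in order; a non-zero return code for the third step would be the signal that Octave crashed while running the tests.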



Re: __run_test_suite__ results in segfault on Windows systems

mmuetzel
On 15 January 2019 at 17:11, "John W. Eaton" wrote:
> Can anyone else reproduce this problem?

You're not the only one with this problem. This is bug #55047:
https://savannah.gnu.org/bugs/index.php?55047

> Since __run_test_suite__ worked without crashing for Octave 4.4.1, I
> suppose it will be possible to bisect, but that's likely to be quite a
> slow process.

I tried to bisect but couldn't pinpoint a specific changeset because the source didn't cross-compile for a while.

> Clearly, we need to have buildbot running the tests for Windows systems.
>   In the past, I've thought about doing that by having buildbot start a
> VM running Windows, then copy the installer to the VM, execute it to
> install Octave, and then execute the tests and return the log to the
> master buildbot.  But I haven't actually done this yet.  Does anyone
> have experience with doing something similar?

A while back, I set up a Windows VM and started an SSH server on it, but nothing more. I'll try to find the relevant emails if that would be helpful.

Markus


Re: __run_test_suite__ results in segfault on Windows systems

John W. Eaton
Administrator
On 1/15/19 12:36 PM, "Markus Mützel" wrote:
> On 15 January 2019 at 17:11, "John W. Eaton" wrote:
>> Can anyone else reproduce this problem?
>
> You're not the only one with this problem. This is bug #55047:
> https://savannah.gnu.org/bugs/index.php?55047

Ah, I should have just searched the bug tracker for "segfault".

> I tried to bisect but couldn't pinpoint a specific changeset because the source didn't cross-compile for a while.

Even knowing a range of good/bad might be helpful.

> A while back, I set up a Windows VM and started an SSH server on it. Nothing more. I'll try to find the respective mails if that would be helpful.

Yeah, anything about how to make Windows start an ssh server that
accepts sftp transfers and remote commands, so we could run the
installer and tests, would help.

Thanks for the bug number and info.

jwe



Re: __run_test_suite__ results in segfault on Windows systems

Mike Miller-4
In reply to this post by John W. Eaton
On Tue, Jan 15, 2019 at 11:11:25 -0500, John W. Eaton wrote:
> Can anyone else reproduce this problem?

Yeah, it's been reported as https://savannah.gnu.org/bugs/?55047,
including a partial bisection.

--
mike


Re: __run_test_suite__ results in segfault on Windows systems

mmuetzel
In reply to this post by John W. Eaton
On 15 January 2019 at 18:40, "John W. Eaton" wrote:
> Yeah, anything about how to make Windows start the ssh server that will
> allow it to accept sftp and accept commands so we could run the
> installer and tests.

I hope this is helpful:
http://octave.1599824.n4.nabble.com/Behold-Buildbot-s-waterfall-display-has-never-been-so-green-td4685571.html#a4685624

Markus


Re: __run_test_suite__ results in segfault on Windows systems

PhilipNienhuis
In reply to this post by John W. Eaton
John W. Eaton wrote

> On 1/15/19 12:36 PM, "Markus Mützel" wrote:
>> On 15 January 2019 at 17:11, "John W. Eaton" wrote:
>>> Can anyone else reproduce this problem?
>>
>> You're not the only one with this problem. This is bug #55047:
>> https://savannah.gnu.org/bugs/index.php?55047
>
> Ah, I should have just searched the bug tracker for "segfault".
>
>> I tried to bisect but couldn't pinpoint a specific changeset because the
>> source didn't cross-compile for a while.
>
> Even knowing a range of good/bad might be helpful.

I have several installers archived from before Nov 18; see attached picture.
I can install/uninstall them one by one to do some rough "bisection". Would
that help?
@Markus: if yes, any suggestion which one to try first?

(Note that I usually make cross-builds with octave "tips" but I do not
regularly update the mxe-octave build tree.)

Philip



--
Sent from: http://octave.1599824.n4.nabble.com/Octave-Maintainers-f1638794.html


Re: __run_test_suite__ results in segfault on Windows systems

PhilipNienhuis
In reply to this post by John W. Eaton
John W. Eaton wrote

> On 1/15/19 12:36 PM, "Markus Mützel" wrote:
>> On 15 January 2019 at 17:11, "John W. Eaton" wrote:
>>> Can anyone else reproduce this problem?
>>
>> You're not the only one with this problem. This is bug #55047:
>> https://savannah.gnu.org/bugs/index.php?55047
>
> Ah, I should have just searched the bug tracker for "segfault".
>
>> I tried to bisect but couldn't pinpoint a specific changeset because the
>> source didn't cross-compile for a while.
>
> Even knowing a range of good/bad might be helpful.

I have these binaries archived; I forgot to attach the picture to my other message.
Philip
<http://octave.1599824.n4.nabble.com/file/t248596/Octave-Installers.png>




Re: __run_test_suite__ results in segfault on Windows systems

mmuetzel
In reply to this post by PhilipNienhuis
On 15 January 2019 at 19:38, PhilipNienhuis wrote:
> I have several installers archived from before Nov 18; see attached picture.
> I can install/uninstall them one by one to do some rough "bisection". Would
> that help?
> @Markus: if yes, any suggestion which one to try first?

Thanks for your help.

Anything between 28/09/2018 (hg id 332be8be16eb) and 23/10/2018 (hg id 14e844f1459a) might be interesting.
Could you please report (preferably on Savannah) which hg id was the last one to complete the test suite successfully (with graphics_toolkit qt), and which was the first one that crashes?

Markus
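
The rough bisection over archived installers discussed in this thread amounts to a binary search over builds ordered by date. A minimal sketch, assuming the oldest build in the list completes the test suite, the newest one crashes, and the crash does not come and go in between; `crashes` is a hypothetical callback that stands for "install this build, run __run_test_suite__, and report whether Octave crashed":

```python
def bisect_builds(builds, crashes):
    """Return (last_good, first_bad) from a date-ordered list of builds.

    Assumes builds[0] is good, builds[-1] is bad, and that every build
    after the first bad one is also bad.
    """
    if len(builds) < 2:
        raise ValueError("need at least one good and one bad build")
    lo, hi = 0, len(builds) - 1  # lo: known good, hi: known bad
    if crashes(builds[lo]) or not crashes(builds[hi]):
        raise ValueError("endpoints must be good (first) and bad (last)")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(builds[mid]):
            hi = mid  # crash already present: look earlier
        else:
            lo = mid  # still fine: look later
    return builds[lo], builds[hi]


if __name__ == "__main__":
    # Hypothetical labels standing in for archived installers.
    archived = ["build-1", "build-2", "build-3", "build-4", "build-5"]
    simulated_bad = {"build-4", "build-5"}
    print(bisect_builds(archived, lambda b: b in simulated_bad))
```

Because each check means a full install plus a complete test-suite run, the point of the search is to keep the number of installers tried close to the base-2 logarithm of the number archived, rather than trying them one by one.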