segfaults in graphics code (likely threading issues?)

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

segfaults in graphics code (likely threading issues?)

John W. Eaton
Administrator
We have two bug reports about random crashes during the build or when
running the tests:

   https://savannah.gnu.org/bugs/?57591
   https://savannah.gnu.org/bugs/?56952

The crashes usually happen when executing some graphics code and my
guess is that they are all related to threading issues with the Qt
graphics toolkit.

We manage the graphics handle data in libinterp/corefcn/graphics.{h,cc}
and the Qt code is in the libgui/graphics directory.  The Qt graphics
display runs in the GUI thread, separate from the thread where the
interpreter runs.  To safely access data from one thread in the other,
we need some kind of thread safe queue or locking.  Currently, I think
all we are doing is using some fairly coarse locking on the gh_manager
object.  Is that sufficient?  Are there cases where we lock to obtain a
graphics_object in the GUI thread but then unlock while still using the
object?

There are also some questionable calls to unlock the mutex.  For
example, in qt_graphics_toolkit::initialize and
qt_graphics_toolkit::finalize, I see

         // FIXME: We need to unlock the mutex here but we have no way
to know if
         // if it was previously locked by this thread, and thus if we
should
         // re-lock it.

         gh_manager& gh_mgr = m_interpreter.get_gh_manager ();

         gh_mgr.unlock ();

Even if that is not the cause of the trouble, it seems worth
eliminating.  It seems to me that locks should be managed automatically
based on scope, not explicitly unlocked in functions separate from where
they are initially locked.

Is there a better way to pass data between the separate interpreter and
graphics toolkit threads?  Can we simply lock a mutex in graphics_object
methods so that we can safely access them in multiple threads?  Or do we
need a whole new way of passing data between the interpreter and the
graphics toolkit?

jwe

Reply | Threaded
Open this post in threaded view
|

Re: segfaults in graphics code (likely threading issues?)

Pantxo
Le 11/06/2020 à 16:26, John W. Eaton a écrit :

> We have two bug reports about random crashes during the build or when
> running the tests:
>
>   https://savannah.gnu.org/bugs/?57591
>   https://savannah.gnu.org/bugs/?56952
>
> The crashes usually happen when executing some graphics code and my
> guess is that they are all related to threading issues with the Qt
> graphics toolkit.
>
> We manage the graphics handle data in
> libinterp/corefcn/graphics.{h,cc} and the Qt code is in the
> libgui/graphics directory.  The Qt graphics display runs in the GUI
> thread, separate from the thread where the interpreter runs. To safely
> access data from one thread in the other, we need some kind of thread
> safe queue or locking.  Currently, I think all we are doing is using
> some fairly coarse locking on the gh_manager object.  Is that
> sufficient?  Are there cases where we lock to obtain a graphics_object
> in the GUI thread but then unlock while still using the object?
>
> There are also some questionable calls to unlock the mutex.  For
> example, in qt_graphics_toolkit::initialize and
> qt_graphics_toolkit::finalize, I see
>
>         // FIXME: We need to unlock the mutex here but we have no way
> to know if
>         // if it was previously locked by this thread, and thus if we
> should
>         // re-lock it.
>
>         gh_manager& gh_mgr = m_interpreter.get_gh_manager ();
>
>         gh_mgr.unlock ();
>
> Even if that is not the cause of the trouble, it seems worth
> eliminating.  It seems to me that locks should be managed
> automatically based on scope, not explicitly unlocked in functions
> separate from where they are initially locked.

Yeah, I am responsible for this. The reason I used this construct is
that for some reason (see [1]) we critically need to make sure the
interpreter thread doesn't hold the graphics mutex when we initialize
and finalize objects in the gui thread.

When you say "locks should be managed automatically based on scope", are
you referring to octave::autolock? In fact autolock objects do have a
similar problem.

I think octave::autolock guard is useful to make sure the graphics mutex
is unlocked when the guard gets out of scope.


     ~autolock (void)
          {
            if (m_lock_result)
              m_mutex.unlock ();
          }


Unfortunately this also means recursive constructs won't be reliable:


     void delete_objects (void)

     {

         ...

         // Critical section
         octave::autolock guard (gh_mgr.graphics_lock ());

         prepare_for_deletion ();

         // Do other critical stuff and hope prepare_for_deletion didn't
unlock the mutex

         ...

     }

     void prepare_for_deletion (void)

     {

         ...

         // Critical section
         octave::autolock guard (gh_mgr.graphics_lock ());

         ...

     }


Here prepare_for_deletion unlocks the mutex when it returns, so autolock
must be used with care. I think autolock would be more convenient if it
was able in some way to restore the previous state of the mutex rather
than blindly unlocking it in the destructor.

IIUC std::unique_lock has such capability ([2]) but this would require
using std::mutex rather than our octave::mutex.


>
> Is there a better way to pass data between the separate interpreter
> and graphics toolkit threads?  Can we simply lock a mutex in
> graphics_object methods so that we can safely access them in multiple
> threads?  Or do we need a whole new way of passing data between the
> interpreter and the graphics toolkit?
>
> jwe
>
>
Having access methods such as set and get protected against concurrent
access would not hurt, but it won't probably be enough. For example,
this does not allow creating critical sections in the code, where the
other thread cannot access the graphics_object at all until you have
done all you needed.

Pantxo

[1] https://savannah.gnu.org/bugs/?55225

[2] https://en.cppreference.com/w/cpp/thread/unique_lock/owns_lock