Quantcast

Deprecating --enable-bounds-check?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Deprecating --enable-bounds-check?

Rik-4
5/10/17

jwe,

What do you think about deprecating and eventually removing
--enable-bounds-check?

The current situation is that all of liboctave has Octave's own internal
bounds checking mechanism.  It is implemented around a #define selected by
--enable-bounds-check and defaults to being disabled.  Example code from
dim-vector.h shows how it works in practice.

  // Fast access with absolutely no checking

  octave_idx_type xelem (int i) const { return rep[i]; }

  // Safe access to to elements

  octave_idx_type elem (int i) const
  {
#if defined (OCTAVE_ENABLE_BOUNDS_CHECK)
    assert (i >= 0 && i < ndims ());
#endif
    return xelem (i);
  }

There are two philosophical issues for me.  First, I would prefer to use
standard, pre-existing blocks of code wherever possible rather than
handrolling our own.  In this case, if memory violations are a potential
issue then maybe the particular user should just use
--enable-address-sanitizer-flags and use an already proven, lightweight
bounds checker.  There was also talk of expanding the configure script to
easily allow the sanitizer to check other options besides just addresses.

The second issue is that it complicates the API to have both elem() and
xelem() functions.  Because we really do want to favor efficiency in the
core there are many times where a programmer has to carefully consider
which function to use.  That's tedious, and what if the programmer gets it
wrong?  It seems preferable to just have a single elem() function for the
API.  Also, the index operator () is currently aliased to elem().  This is
nice because it produces readable C++ code that looks like m-file code.
For example,

for (octave_idx_type i = 0; i < x.numel (); i++)
  x(i) = 1.0;

But right now, this is coded for performance as

for (octave_idx_type i = 0; i < x.numel (); i++)
  x.xelem (i) = 1.0;


Thoughts?

--Rik


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Deprecating --enable-bounds-check?

John W. Eaton
Administrator
On 05/11/2017 11:10 AM, Rik wrote:
> 5/10/17
>
> jwe,
>
> What do you think about deprecating and eventually removing
> --enable-bounds-check?

It seems reasonable to me.  We could make xelem, elem, and () all do the
same thing, and deprecate xelem (and possibly elem, or is there an
advantage to having both operator and member function forms?).

Way back when, there were no tools like valgrind or the address
sanitizer options for GCC (and no Clang at all).  So I thought it made
sense to add bounds checking as an option for the classes, even if it
couldn't catch everything.  You are right that now those tools are much
better than the bounds checking we have, so we might as well just use
them instead of trying to do a halfway job of it in the code.

Do we really need to deprecate the option?  I guess we could keep the
option but have it do nothing except print a message for a few releases.
  But I don't see any point in keeping the functionality at this point.

I'd also say that we should just standardize on using operator ()
instead of the elem method as well.

Do you want to work on these changes or should I?

jwe


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Deprecating --enable-bounds-check?

Rik-4
On 05/11/2017 10:07 AM, John W. Eaton wrote:

> On 05/11/2017 11:10 AM, Rik wrote:
>> 5/10/17
>>
>> jwe,
>>
>> What do you think about deprecating and eventually removing
>> --enable-bounds-check?
>
> It seems reasonable to me.  We could make xelem, elem, and () all do the
> same thing, and deprecate xelem (and possibly elem, or is there an
> advantage to having both operator and member function forms?).
>
> Way back when, there were no tools like valgrind or the address sanitizer
> options for GCC (and no Clang at all).  So I thought it made sense to add
> bounds checking as an option for the classes, even if it couldn't catch
> everything.  You are right that now those tools are much better than the
> bounds checking we have, so we might as well just use them instead of
> trying to do a halfway job of it in the code.

Definitely no criticism.  It was a different time, and now things have
changed for the better so we can clean up the code.

>
> Do we really need to deprecate the option?  I guess we could keep the
> option but have it do nothing except print a message for a few releases.
>  But I don't see any point in keeping the functionality at this point.

Just in case someone is using it, lets change configure.ac to print a
warning message that says they should configure with the address-sanitizer
option instead for detecting out-of-bound memory accesses.

>
> I'd also say that we should just standardize on using operator () instead
> of the elem method as well.

Without xelem, that is now possible and makes sense to me.

>
> Do you want to work on these changes or should I?

Why don't you go ahead if you have the time.  I will continue, slowly, to
try and rationalize some of the #includes.

--Rik


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Deprecating --enable-bounds-check?

John W. Eaton
Administrator
On 05/11/2017 01:16 PM, Rik wrote:

> Definitely no criticism.  It was a different time, and now things have
> changed for the better so we can clean up the code.

Sorry if my reply seemed defensive.  I certainly didn't intend that.  I
just thought it would be worth stating for the record how that feature
came about.

> Just in case someone is using it, lets change configure.ac to print a
> warning message that says they should configure with the address-sanitizer
> option instead for detecting out-of-bound memory accesses.
>
>> I'd also say that we should just standardize on using operator () instead
>> of the elem method as well.
>
> Without xelem, that is now possible and makes sense to me.
>
>> Do you want to work on these changes or should I?
>
> Why don't you go ahead if you have the time.  I will continue, slowly, to
> try and rationalize some of the #includes.

OK.

jwe



Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Deprecating --enable-bounds-check?

John W. Eaton
Administrator
In reply to this post by Rik-4
On 05/11/2017 01:16 PM, Rik wrote:

>> It seems reasonable to me.  We could make xelem, elem, and () all do the
>> same thing, and deprecate xelem (and possibly elem, or is there an
>> advantage to having both operator and member function forms?).

Oops, I started looking at this and realized that I completely forgot
about the other difference between xelem and elem:  elem checks the
reference count.

We could still eliminate bounds checking by doing the following:

     deprecating checkelem
     removing the conditional call to it in ()
     replacing the elem method with () everywhere

Then, separately, we could decide to eliminate xelem.  If we eliminate
it, then all uses can be replaced by using the data method to get a
pointer to the underlying data vector then and indexing that directly
with "[]".  Doing that is probably better than using xelem anyway,
because it forces you to think about whether the object is const and/or
whether you want to make the representation unique instead of just
ignoring the reference count completely.

What do you think?

Either way, I can go ahead and do the first part now.

jwe


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Deprecating --enable-bounds-check?

Rik-4
On 05/11/2017 11:49 AM, John W. Eaton wrote:
> On 05/11/2017 01:16 PM, Rik wrote: > >>> It seems reasonable to me.  We could make xelem, elem, and () all do the >>> same thing, and deprecate xelem (and possibly elem, or is there an >>> advantage to having both operator and member function forms?). > > Oops, I started looking at this and realized that I completely forgot about the other difference between xelem and elem:  elem checks the reference count. > > We could still eliminate bounds checking by doing the following: > >     deprecating checkelem >     removing the conditional call to it in () >     replacing the elem method with () everywhere > > Then, separately, we could decide to eliminate xelem.  If we eliminate it, then all uses can be replaced by using the data method to get a pointer to the underlying data vector then and indexing that directly with "[]".  Doing that is probably better than using xelem anyway, because it forces you to think about whether the object is const and/or whether you want to make the representation unique instead of just ignoring the reference count completely.

A long time back I did some benchmarking on using fortran_vec and then indexing directly on the data.  The performance boost was about 10% which is enough to be noticeable and worth it for things like mapper functions.  For files outside the core, say a user's .oct file, I think the reduction in the clarity of the code probably isn't worth the 10%.

What is the 2x2 decision diagram for this?

          const    non-const
        |        |           |
 unique |        |           |
        |--------|-----------|
 same   |        |           |
        |        |           |


I can see that predicate test functions might be const, same.  And I could see mapper functions as being non-const, same.  When and how would the programmer decide on unique/same?


> > What do you think? > > Either way, I can go ahead and do the first part now.

That's better for the philosophy of one change / one changeset as well.

--Rik


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Deprecating --enable-bounds-check?

John W. Eaton
Administrator
On 05/11/2017 04:17 PM, Rik wrote:

> What is the 2x2 decision diagram for this?
>
>           const    non-const
>         |        |           |
>  unique |        |           |
>         |--------|-----------|
>  same   |        |           |
>         |        |           |

For a const object there is never a difference.  It can't be modified so
the indexing operator doesn't need to check the reference count or make
the representation unique.

For non-const, there is always a check of the reference count.  If the
reference count is greater than 1, then there is a copy on the first
call to the indexing operator.

jwe


Loading...