Question

Question

Chuck Robey-3
I just got around to actually running a new octave script on my newly
ported 2.0.12, and I saw it was core-dumping on startup.  The specific
point it core-dumps (with a bus error) on is in the GNU stdc++ header
file bastring.h, line 69:

  charT* data () { return reinterpret_cast<charT *>(this + 1); }
  charT& operator[] (size_t s) { return data () [s]; }
  charT* grab () { if (selfish) return clone (); ++ref; return data (); }
  void release () { if (--ref == 0) delete this; }  <== HERE

  inline static void * operator new (size_t, size_t);
  inline static Rep* create (size_t);

This seems to me to be kicked off from octave.cc line 414, where:

  initialize_error_handlers ();

  initialize_globals (argv[0]);  <== HERE
 
  initialize_pathsearch ();

initialize_globals is supposed to be called with a string class
argument, not just any string, isn't it?  You're casting argv[0] as a
class string, but it's a simple C string, right?  I'm not totally at
home with c++, so I can't be trusted on this, but it seems you want a
constructor here ... I did check, and argv[0] is a pointer to a good
full octave path.

I can and will perform any tests you can imagine, and would in fact be
pleased to work off any guesses you can come up with.  The platform is
FreeBSD-current, using gcc 2.7.2.1.


----------------------------+-----------------------------------------------
Chuck Robey                 | Interests include any kind of voice or data
[hidden email]         | communications topic, C programming, and Unix.
213 Lakeside Drive Apt T-1  |
Greenbelt, MD 20770         | I run Journey2 and picnic (FreeBSD-current)
(301) 220-2114              | and jaunt (NetBSD).
----------------------------+-----------------------------------------------






Question

John W. Eaton-6
On 15-May-1998, Chuck Robey <[hidden email]> wrote:

| I just got around to actually running a new octave script on my newly
| ported 2.0.12, and I saw it was core-dumping on startup.  The specific
| point it core-dumps (with a bus error) on is in the GNU stdc++ header
| file bastring.h, line 69:
|
|   charT* data () { return reinterpret_cast<charT *>(this + 1); }
|   charT& operator[] (size_t s) { return data () [s]; }
|   charT* grab () { if (selfish) return clone (); ++ref; return data (); }
|   void release () { if (--ref == 0) delete this; }  <== HERE
|
|   inline static void * operator new (size_t, size_t);
|   inline static Rep* create (size_t);
|
| This seems to me to be kicked off from octave.cc line 414, where:
|
|   initialize_error_handlers ();
|
|   initialize_globals (argv[0]);  <== HERE
|  
|   initialize_pathsearch ();
|
| initialize_globals is supposed to be called with a string class
| argument, not just any string, isn't it?  You're casting argv[0] as a
| class string, but it's a simple C string, right?  I'm not totally at
| home with c++, so I can't be trusted on this, but it seems you want a
| constructor here ... I did check, and argv[0] is a pointer to a good
| full octave path.

The compiler is supposed to insert the string constructor
automatically.  I'd suspect that at least that part is correct.

| I can and will perform any tests you can imagine, and would in fact be
| pleased to work off any guesses you can come up with.  The platform is
| FreeBSD-current, using gcc 2.7.2.1.

My guess is that the copy of libstdc++ you have is out of sync with
the version of the compiler or include files that you used to build
Octave.

jwe



Re: Question

Chuck Robey-3
On Fri, 15 May 1998, John W. Eaton wrote:

> The compiler is supposed to insert the string constructor
> automatically.  I'd suspect that at least that part is correct.

When I ran gdb, and tried to read the passed variable, it told me
incomplete type.  That's why I thought maybe it had been constructed
wrong, and the ref value was wrong (it might be trying to free a class
instance that hadn't been created).  The constructor is NOT called,
according to gdb.

>
> | I can and will perform any tests you can imagine, and would in fact be
> | pleased to work off any guesses you can come up with.  The platform is
> | FreeBSD-current, using gcc 2.7.2.1.
>
> My guess is that the copy of libstdc++ you have is out of sync with
> the version of the compiler or include files that you used to build
> Octave.

It's the libg++ 2.7.2, and there wasn't a libg++ 2.7.2.1 (2.7.2 was the
one that covered that release).  Does your software require something
more recent?  The compiler we're using is 2.7.2.1, and we're not going to
be upgrading real quick here.

I'll try more experimentation.  I wish I felt as at home with C++ as with
C, but I'm not entirely lost.







Re: Question

John W. Eaton-6
On 15-May-1998, Chuck Robey <[hidden email]> wrote:

| On Fri, 15 May 1998, John W. Eaton wrote:
|
| > The compiler is supposed to insert the string constructor
| > automatically.  I'd suspect that at least that part is correct.
|
| When I ran gdb, and tried to read the passed variable, it told me
| incomplete type.  That's why I thought maybe it had been constructed
| wrong, and the ref value was wrong (it might be trying to free a class
| instance that hadn't been created).  The constructor is NOT called,
| according to gdb.

What version of gdb are you using?

When gdb says incomplete type, it usually means that for some reason
it doesn't have enough information.  Was your libstdc++ compiled with
debugging symbols?

You probably don't see a call to the constructor because it was
probably inlined.

| > | I can and will perform any tests you can imagine, and would in fact be
| > | pleased to work off any guesses you can come up with.  The platform is
| > | FreeBSD-current, using gcc 2.7.2.1.
| >
| > My guess is that the copy of libstdc++ you have is out of sync with
| > the version of the compiler or include files that you used to build
| > Octave.
|
| It's the libg++ 2.7.2, and there wasn't a libg++ 2.7.2.1 (2.7.2 was the
| one that covered that release).  Does your software require something
| more recent?  The compiler we're using is 2.7.2.1, and we're not going to
| be upgrading real quick here.

Currently, you should be able to build Octave with g++ 2.7.2, g++
2.8.x, or egcs 1.0.x (I use all three at various times).  Fairly soon,
you may have to have egcs or 2.8.x to build the development version
(2.1.x).

Are the C++ header files consistent with the library that you have
installed?

jwe



Re: Question

Chuck Robey-3
On Fri, 15 May 1998, John W. Eaton wrote:

> On 15-May-1998, Chuck Robey <[hidden email]> wrote:
>
> | On Fri, 15 May 1998, John W. Eaton wrote:
> |
> | > The compiler is supposed to insert the string constructor
> | > automatically.  I'd suspect that at least that part is correct.
> |
> | When I ran gdb, and tried to read the passed variable, it told me
> | incomplete type.  That's why I thought maybe it had been constructed
> | wrong, and the ref value was wrong (it might be trying to free a class
> | instance that hadn't been created).  The constructor is NOT called,
> | according to gdb.
>
> What version of gdb are you using?
>
> When gdb says incomplete type, it usually means that for some reason
> it doesn't have enough information.  Was your libstdc++ compiled with
> debugging symbols?

I recompiled libstdc++ with debugging symbols, and I have gotten much
further.  First, my libstdc++ is in sync, I checked.  I don't load these
things myself, I run a full cvs archive of FreeBSD-current here, and I
was able to verify via cvs-log that it was a virgin import from the gnu
sources, with no patches.

FreeBSD is (historically and notoriously) slow to upgrade the compiler,
until new versions have a good track record.  My guess is that the
upgrade to 2.8.x is at least 3 months away, and I would not be surprised
to find it happens early next year, because the furious pace of locally
inspired changes will not allow too much other instability in the tree
right away.

I've done more investigation.  The SIGBUS occurs in line 198 of
octave.cc (ddd cut'n'paste):

    196 #endif
    197
    198   Vprogram_invocation_name = name;
    199   size_t pos = Vprogram_invocation_name.rfind ('/');
    200   Vprogram_name = (pos == NPOS)

This calls, indirectly, to libstdc++/std/bastring.cc line 147:

    146 // _lib.string.cons_ construct/copy/destroy:
    147   basic_string& operator= (const basic_string& str)
    148     {
    149       if (&str != this) { rep ()->release (); dat = str.rep()->grab (); }
    150       return *this;
    151     }

The release function is in bastring.h line 69:

     67   charT& operator[] (size_t s) { return data () [s]; }
     68   charT* grab () { if (selfish) return clone (); ++ref; return data ();}  
     69   void release () { if (--ref == 0) delete this; }
     70  
     71   inline static void * operator new (size_t, size_t);

I'm not sure about this, but on entering the function, ref is 1, and
that's the value of this.ref, right?  (understand please that I could be
stronger perhaps in C++, sorry).  It _looks_ like "this" is being
deleted, then immediately referenced, which would sure cause the SIGBUS.

Does this make sense?







Re: Question

John W. Eaton-6
On 15-May-1998, Chuck Robey <[hidden email]> wrote:

| I recompiled libstdc++ with debugging symbols, and I have gotten much
| further.  First, my libstdc++ is in sync, I checked.  I don't load these
| things myself, I run a full cvs archive of FreeBSD-current here, and I
| was able to verify via cvs-log that it was a virgin import from the gnu
| sources, with no patches.
|
| FreeBSD is (historically and notoriously) slow to upgrade the compiler,
| until new versions have a good track record.  My guess is that the
| upgrade to 2.8.x is at least 3 months away, and I would not be surprised
| to find it happens early next year, because the furious pace of locally
| inspired changes will not allow too much other instability in the tree
| right away.
|
| I've done more investigation.  The SIGBUS occurs in line 198 of
| octave.cc (ddd cut'n'paste):
|
|     196 #endif
|     197
|     198   Vprogram_invocation_name = name;
|     199   size_t pos = Vprogram_invocation_name.rfind ('/');
|     200   Vprogram_name = (pos == NPOS)
|
| This calls, indirectly, to libstdc++/std/bastring.cc line 147:
|
|     146 // _lib.string.cons_ construct/copy/destroy:
|     147   basic_string& operator= (const basic_string& str)
|     148     {
|     149       if (&str != this) { rep ()->release (); dat = str.rep()->grab (); }
|     150       return *this;
|     151     }
|
| The release function is in bastring.h line 69:
|
|      67   charT& operator[] (size_t s) { return data () [s]; }
|      68   charT* grab () { if (selfish) return clone (); ++ref; return data ();}  
|      69   void release () { if (--ref == 0) delete this; }
|      70  
|      71   inline static void * operator new (size_t, size_t);
|
| I'm not sure about this, but on entering the function, ref is 1, and
| that's the value of this.ref, right?  (understand please that I could be
| stronger perhaps in C++, sorry).  It _looks_ like "this" is being
| deleted, then immediately referenced, which would sure cause the SIGBUS.

I believe that the `this' that's being deleted is a __bsrep object,
not a basic_string object.

The assignment operator checks to see that it is not trying to do an
assignment to itself.  If it is not, it deletes the current contents
of the string on the LHS of the operator= (possibly just decrementing
the reference count).  This part of the operation is done in the
rep()->release() function call.  Next it grabs a pointer to the data
from the string on the RHS of the operator=, assigning it to
this->dat.  The str.rep()->grab() function call also increments the
reference counter.

I think the code is ok.

Does the following simpler test program work correctly?

jwe


#include <string>
#include <iostream.h>

string Vprogram_invocation_name;
string Vprogram_name;

void
tryme (const string& name)
{
  Vprogram_invocation_name = name;
  size_t pos = Vprogram_invocation_name.rfind ('/');
  Vprogram_name = (pos == NPOS)
    ? Vprogram_invocation_name : Vprogram_invocation_name.substr (pos+1);
}

int
main (int argc, char **argv)
{
  tryme (argv[0]);

  cout << Vprogram_invocation_name << "\n";
  cout << Vprogram_name << "\n";

  return 0;
}



Re: Question

Chuck Robey-3
On Sat, 16 May 1998, John W. Eaton wrote:

> I believe that the `this' that's being deleted is a __bsrep object,
> not a basic_string object.
>
> The assignment operator checks to see that it is not trying to do an
> assignment to itself.  If it is not, it deletes the current contents
> of the string on the LHS of the operator= (possibly just decrementing
> the reference count).  This part of the operation is done in the
> rep()->release() function call.  Next it grabs a pointer to the data
> from the string on the RHS of the operator=, assigning it to
> this->dat.  The str.rep()->grab() function call also increments the
> reference counter.
>
> I think the code is ok.
>
> Does the following simpler test program work correctly?

Thanks for the explanation.  I will spend more time reading the
bastring.cc/.h code; I would profit by it.  Your test program ran
ok.  I don't yet understand why I'm getting the SIGBUS right at that
point ... but I can make some additional tests regarding the state of
the "this" I guess.

I've already done all the obvious things, like changing the compiler
flags and removing optimization.  The npos operator is being
optimized out, but I haven't finished looking at possible side effects
(the compiler, I guess, thinks there aren't any).







Re: Question

Chuck Robey-3
On Sat, 16 May 1998, John W. Eaton wrote:

> On 15-May-1998, Chuck Robey <[hidden email]> wrote:
> |     196 #endif
> |     197
> |     198   Vprogram_invocation_name = name;
> |     199   size_t pos = Vprogram_invocation_name.rfind ('/');
> |     200   Vprogram_name = (pos == NPOS)
> |
> | This calls, indirectly, to libstdc++/std/bastring.cc line 147:
> |
> |     146 // _lib.string.cons_ construct/copy/destroy:
> |     147   basic_string& operator= (const basic_string& str)
> |     148     {
> |     149       if (&str != this) { rep ()->release (); dat = str.rep()->grab (); }
> |     150       return *this;
> |     151     }
> |
> | The release function is in bastring.h line 69:
> |
> |      67   charT& operator[] (size_t s) { return data () [s]; }
> |      68   charT* grab () { if (selfish) return clone (); ++ref; return data ();}  
> |      69   void release () { if (--ref == 0) delete this; }
> |      70  
> |      71   inline static void * operator new (size_t, size_t);
> |
> | I'm not sure about this, but on entering the function, ref is 1, and
> | that's the value of this.ref, right?  (understand please that I could be
> | stronger perhaps in C++, sorry).  It _looks_ like "this" is being
> | deleted, then immediately referenced, which would sure cause the SIGBUS.
>
> I believe that the `this' that's being deleted is a __bsrep object,
> not a basic_string object.

I've not dropped this.  I just found that if I add the qualifier
"string" to the variables like in line 198 above (and several after) it
no longer gets the SIGBUS.  I still am getting a SIGBUS at line 211,
which I modified:

    210   char *hd = getenv ("HOME");
    211   string Vhome_directory = hd ? hd : "I have no home!";

I don't know why this is yet, and I haven't finished working out why
adding "string" helps.  What is happening is that during the operator= a
release is done, but the "this" at that point is bogus, which causes the
SIGBUS.  I'm not sure exactly where; it might be the instruction before
the release, since it's all inline and gdb isn't clear on when it's
happening.  The "this" is bogus, though.







Re: Question

John W. Eaton-6
On 19-May-1998, Chuck Robey <[hidden email]> wrote:

| I've not dropped this.  I just found that if I add the qualifier
| "string" to the variables like in line 198 above (and several after) it
| no longer gets the SIGBUS.  I still am getting a SIGBUS at line 211,
| which I modified:
|
|     210   char *hd = getenv ("HOME");
|     211   string Vhome_directory = hd ? hd : "I have no home!";

By doing that, you are creating a local variable called
Vhome_directory that shadows the global one.  That's definitely not
what should happen.

Try changing it to

  Vhome_directory = hd ? string (hd) : string ("I have no home!");

If that fails to compile, do it like this instead:

  if (hd)
    Vhome_directory = string (hd);
  else
    Vhome_directory = string ("I have no home!");

There may be other places where you need to make similar changes.  Can
you please let me know where, so I can change my sources?

I think this problem is probably a bug in the compiler you are using,
but I'm willing to modify the Octave sources to work around the bug.

Thanks,

jwe



Re: Question

Chuck Robey-3
On Tue, 19 May 1998, John W. Eaton wrote:

> On 19-May-1998, Chuck Robey <[hidden email]> wrote:
>
> | I've not dropped this.  I just found that if I add the qualifier
> | "string" to the variables like in line 198 above (and several after) it
> | no longer gets the SIGBUS.  I still am getting a SIGBUS at line 211,
> | which I modified:
> |
> |     210   char *hd = getenv ("HOME");
> |     211   string Vhome_directory = hd ? hd : "I have no home!";
>
> By doing that, you are creating a local variable called
> Vhome_directory that shadows the global one.  That's definitely not
> what should happen.
>
> Try changing it to
>
>   Vhome_directory = hd ? string (hd) : string ("I have no home!");
>
> If that fails to compile, do it like this instead:
>
>   if (hd)
>     Vhome_directory = string (hd);
>   else
>     Vhome_directory = string ("I have no home!");
>
> There may be other places where you need to make similar changes.  Can
> you please let me know where, so I can change my sources?
>
> I think this problem is probably a bug in the compiler you are using,
> but I'm willing to modify the Octave sources to work around the bug.

What about what I did to line 198, is that also hiding a global?  I
noticed it's also referred to in toplev.cc, so I tried to change line
198 back to what it was, and add an extern declaration at the top, but
it still got the SIGBUS.  Only adding the "string" at 211 got past the
SIGBUS ok.  BTW, I'm testing your suggested mod on line 211 now.








Re: Question

Chuck Robey-3
On Tue, 19 May 1998, John W. Eaton wrote:

> On 19-May-1998, Chuck Robey <[hidden email]> wrote:
>
> | I've not dropped this.  I just found that if I add the qualifier
> | "string" to the variables like in line 198 above (and several after) it
> | no longer gets the SIGBUS.  I still am getting a SIGBUS at line 211,
> | which I modified:

Final resolution -- this problem involves limitations of the FreeBSD
linker, which are probably not going to be resolved soon.  It took two
steps: first, I built with --disable-shared.  Second, I patched the
configuration script so that it links the executable with "-lreadline",
because FreeBSD's readline works ok, and that helps somewhat in limiting
the final size of the executable (which, after stripping, is ~3.0 megs).
