From: Godmar Back (gback@cs.utah.edu)
Date: Sat Feb 06 1999 - 22:44:00 EST
> 
> On Feb  4, 1999, Alexandre Oliva <oliva@dcc.unicamp.br> wrote:
> 
> > On Feb  4, 1999, Godmar Back <gback@marker.cs.utah.edu> wrote:
> >>> On Feb  4, 1999, Godmar Back <gback@cs.utah.edu> wrote:
> 
> >>>> Hmmm, I was finally able to reproduce the library "not found"
> >>>> failure on a slower Linux machine with RedHat 5.1.  Apparently,
> >>>> it's dying somewhere in libltdl.
> 
> >>> Probably the context switch during malloc problem :-(
> 
> >> Actually, that's not likely since it happens during startup
> >> where there's only one thread runnable: we never switch from the
> >> signal handler there.
> 
> > Good point, but the interrupt handler is already set up, so it may be
> > screwing things up somehow.
> 
> I've got good and bad news: as soon as I moved initNativeThreads() to
> after initNative(), I didn't get any segmentation fault on
> Solaris/sparc or any failure to find libnative on GNU/Linux/x86.  So
> there really *is* something going on between the signal handler and
> dlopen.
Note that I not only got the failure-to-find-libnative problem,
I also got an actual segfault because the library was only half-linked.
> 
> In fact, since Solaris was core dumping, I was able to get a stack
> trace, and it crashed several stack frames deep inside dlopen().
> Which indicates that it is not context switching; the signal
> handler is somehow negatively interfering with dlopen :-(
Well, not checking for EINTR would do it... although read & write
should be restarted.
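For reference, checking for EINTR usually means a retry loop around the
system call. Below is a minimal sketch, assuming a hypothetical helper
called read_retry (it is not taken from Kaffe or libltdl). If the signal
handler is installed with SA_RESTART, the kernel normally restarts read
and write on its own, so the loop only matters when that flag is absent.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only: read up to n bytes from fd,
 * retrying whenever a signal interrupts the call before any data arrives.
 * Returns what read() returns: bytes read, 0 at end of file, -1 on error. */
static ssize_t read_retry(int fd, void *buf, size_t n)
{
    ssize_t r;

    do {
        r = read(fd, buf, n);
    } while (r == -1 && errno == EINTR);  /* interrupted: try again */

    return r;
}
```

A caller that uses plain read() and treats every -1 as fatal would give
up on a file that is perfectly readable, which could look like a library
that was "not found" or only partly loaded.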
> 
> I'll leave the tests running for some more time and, if I fail to find
> any further random failures, I'll install this change in CVS.
> 
I haven't gotten any segfaults on Solaris since.
Could you show that patch before you install it?
I want to make sure it works on the OSKit too, where threading and
initialization are non-standard and where it took a while to tweak them
into working.
        - Godmar