# Re: [OctDev] Fwd: Re: Problems with Statistics package

3 messages
Open this post in threaded view
|

## Re: [OctDev] Fwd: Re: Problems with Statistics package

 On Wed, 30 May 2012 11:25:14, marco atzeri wrote: > Hi Hannes, octave-forge mailing list is the right place for bugs and > patches on the statistics package anova.m is not part of the Octave Forge Statistics package. It is part of Octave core. I am cc'ing this mail along with the patch to the Octave maintainers list. Arno > -------- Original Message -------- > Subject: Re: Problems with Statistics package > Date: Wed, 30 May 2012 11:06:42 +0200 > From: Hannes <[hidden email]> > To: [hidden email] > > Ok, I think I got it pinned downed and solved. I have attached a patch. > > Heres the explenation: > >>  total_mean = mean (y(:)); >>  SSB = sum (group_count .* (group_mean - total_mean) .^ 2); > is always positive [sum of squares between] > >>  SST = sumsq (reshape (y, n, 1) - total_mean); > is always positive and >= SSB [sum of squares total] > >>  SSW = SST - SSB; > => should always be >=0 > but it is an ill conditioned operation if SST==SSB resulting in > possibly negative error. Note that if the error is positive, it > doesn't matter because it will produce f == Inf and p = 0 as expected. > >>  df_b = k - 1; >>  df_w = n - k; >>  v_b = SSB / df_b; >>  v_w = SSW / df_w; > this can become negative, if the SSW is negative or 0 if SSW is zero > or close to zero. > > >>  f = v_b / v_w; > this results in divide by zero if v_w is zero and results in *wrong > result without warning* if v_w is negative. > > > My patch changes this whole block to: > >    SSB = sum (group_count .* (group_mean - total_mean) .^ 2); >    SST = sumsq (reshape (y, n, 1) - total_mean); >    SSW = SST - SSB; >    if (SSW < 0) # machine error if SSB == SSW >      SSW = 0; >    endif >    df_b = k - 1; >    df_w = n - k; >    v_b = SSB / df_b; >    v_w = SSW / df_w; >    if (v_w != 0) >      f = v_b / v_w; >      pval = 1 - fcdf (f, df_b, df_w); >    elseif (v_b == 0) >      f = NaN; >      pval = 1; > > 0 / 0 is  NaN and the hypothesis mus be rejected > >    else >      f = Inf; >      pval = 0; > > x / 0 is Inf and hypothesis must be accepted (because there is no > variance inside the sample but there is outside > >    endif > > HTH, > Hannes > patch_anova.m (771 bytes) Download Attachment