Statistical test for equality ?

Hi,

is there a statistical test for whether 2 samples are equal?

The obvious choices would be correlation or a paired t-test, but neither can tell whether the samples are equal or whether one sample is a multiple of the other.

THX
stn
_______________________________________________
Help-octave mailing list
[hidden email]
https://mailman.cae.wisc.edu/listinfo/help-octave

Re: Statistical test for equality ?

On Mon, 16 Dec 2013 21:37:16 +0100 stn021 <[hidden email]> wrote:

> is there a statistical test if 2 samples are equal?
>
> The obvious choices would be correlation or paired t-test but both
> cannot tell if the samples are equal or if one sample is a multiple
> of the other

Do you know the probability density function applicable for the samples? Are the samples independent?

In general, what you are looking for is what fraction of the two PDFs overlaps. If they overlap very little, the samples are not likely to be equal. If they overlap a lot, they are likely to be equal.

Gord
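The overlap idea above can be made concrete once a distribution is assumed. A minimal sketch, written in Python rather than Octave and assuming both samples are roughly normal: fit a normal distribution to each sample and compute the area shared by the two fitted PDFs. The helper name `pdf_overlap` and the sample values are made up for illustration.

```python
from statistics import NormalDist

def pdf_overlap(sample_a, sample_b):
    """Overlap area (between 0 and 1) of normal PDFs fitted to two samples."""
    fit_a = NormalDist.from_samples(sample_a)
    fit_b = NormalDist.from_samples(sample_b)
    return fit_a.overlap(fit_b)

a = [9.8, 10.1, 10.0, 9.9, 10.2]
b = [9.9, 10.0, 10.1, 9.8, 10.2]      # same values as a, in a different order
c = [19.8, 20.1, 20.0, 19.9, 20.2]    # shifted far away from a

print(pdf_overlap(a, b))  # close to 1: the fitted distributions coincide
print(pdf_overlap(a, c))  # close to 0: the fitted distributions barely touch
```

Note that `b` is only a reordering of `a`, so the overlap is essentially 1 even though the element-wise pairs differ. The overlap compares distributions, not pairs.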

Re: Statistical test for equality ?

Hello,

I think the distributions must be discrete, since the probability that two samples from a continuous distribution are identical is essentially zero. If the two trials are independent, the answer should just be the square of the probability of obtaining the given result in one trial. Or am I misunderstanding the question?

-Brian

RE: Statistical test for equality ?

The chi-square test will test for homogeneity in two populations, but that too just gives info on proportions. It is difficult to see what is meant by "equal samples". We test for equality of means or proportions. Surely the original query is not asking about sample size.

Re: Statistical test for equality ?

stn021 wrote
> Hi, is there a statistical test if 2 samples are equal? The obvious choices would be correlation or paired t-test but both cannot tell if the samples are equal or if one sample is a multiple of the other

This is a bit vague. Do you mean, for instance, something similar to
http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm
or
http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm
i.e. what is assumed to be known, and what is inferred from statistics about your data?

Once you can answer this question, Octave has all the required primitives to do the computation.

Regards
Pascal

Re: Statistical test for equality ?

2013/12/17 CdeMills
> ... This is a bit vague...

Hi,

yes, the question is a bit vague. At first sight it also appears trivial.

I would like to test if x1=x2. That means I have two samples, meaning two vectors x1 and x2. Now I want to know if x1(1) = x2(1), x1(2) = x2(2), ..., x1(end) = x2(end). Sounds easy, in Octave I simply write x1 == x2

The trouble is that in reality x1 is not _exactly_ equal to x2. In reality it is more like x1 = x2 + random noise. So the question could be asked like this: how much noise is allowed before the null hypothesis x1==x2 should be replaced by the alternative hypothesis?

Correlation answers "how precisely do my samples match the equation x1 = b*x2+a".
I would like to know "how precisely do my samples match the equation x1 = 1*x2+0".

Correlation would give the same answer for x1=x2 and x1=5*x2, so it cannot tell the difference between two equal samples and two highly correlated ones.

The whole question revolves around simulation models. I would like to have some meaningful answer as to whether my model works. To me it seemed obvious to ask if measured values are equal to simulated values for the same set of independent variables. So x1 are the measured values, x2 are the results of the simulation, and the simulation works if the simulated values are statistically significantly equal to the measured values.

> Do you mean f.i. something similar to http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm
> or http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm

Both articles are about equal or non-equal means.

My question is _not_ if mean(x1) == mean(x2). That question is easily answered, for example with a t-test. Obviously if x1==x2 then mean(x1)==mean(x2), but the converse does not hold: if mean(x1)==mean(x2) then maybe x1==x2 or maybe not. Testing for equal means does not imply equality of the samples, only equality of the means.

THX
stn
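The point about correlation can be checked numerically. A minimal sketch in Python (standard library only; the data and the helper names `pearson_r` and `fit_line` are made up for illustration): the correlation coefficient is the same for x1 = x2 + noise and for five times that, while the fitted slope b and intercept a differ. It is therefore b close to 1 and a close to 0, not the correlation, that carries the information about equality.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares slope b and intercept a of y = b*x + a."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return b, my - b * mx

x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.05, -0.03, 0.02, -0.04, 0.01]
x1_equal = [v + e for v, e in zip(x2, noise)]            # x1 = x2 + noise
x1_scaled = [5 * v + 5 * e for v, e in zip(x2, noise)]   # x1 = 5 * (x2 + noise)

# Same correlation in both cases ...
print(pearson_r(x2, x1_equal), pearson_r(x2, x1_scaled))
# ... but very different slope and intercept.
print(fit_line(x2, x1_equal), fit_line(x2, x1_scaled))
```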

Re: Statistical test for equality ?

You could try
max(abs(x2 - x1))
rms(x2 - x1)   [rms function is in nan package, or you can write your own]
or other such functions, depending on what form you think the noise takes
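For readers who want to write their own versions, the two distance measures above are short. A minimal sketch in Python rather than Octave; the function names and the sample values are made up for illustration.

```python
from math import sqrt

def max_abs_diff(x1, x2):
    """Largest element-wise absolute difference, like max(abs(x2 - x1))."""
    return max(abs(a - b) for a, b in zip(x1, x2))

def rms_diff(x1, x2):
    """Root mean square of the element-wise differences."""
    n = len(x1)
    return sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)) / n)

measured = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.0, 4.2]

print(max_abs_diff(measured, simulated))
print(rms_diff(measured, simulated))
```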

Re: Statistical test for equality ?

2013/12/23 Nir Krakauer
> You could try
> max(abs(x2 - x1))
> rms(x2 - x1)   [rms function is in nan package, or you can

I assume that rms() = sqrt( mean((x1-x2).^2) ), the root of the mean squares?

So yes, this function calculates how far the two vectors are apart and is indeed a measure for my question. It is unfortunately not a test in the statistical sense. For that there would have to be some kind of p-value indicating whether or not the null hypothesis should be assumed to be true, similar to, for example, t_test().

> or other such functions, depending on what form you think the noise takes

The noise follows a normal distribution, nothing special here. If x1==x2 then mean(noise) is near zero; std(noise) could have any value.
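One way to attach a p-value to the differences is to test whether the mean of the noise is zero. A minimal sketch in Python, assuming independent pairs and a sample large enough for the normal approximation (for small samples a Student t distribution would be the usual choice); the function name and the data are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_diff_pvalue(x1, x2):
    """Two-sided p-value for H0: mean(x1 - x2) == 0 (normal approximation)."""
    d = [a - b for a, b in zip(x1, x2)]
    se = stdev(d) / sqrt(len(d))          # standard error of the mean difference
    z = mean(d) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
simulated_ok = [v + e for v, e in zip(observed, noise)]            # zero-mean noise
simulated_biased = [v + 1.0 + e for v, e in zip(observed, noise)]  # constant offset

print(mean_diff_pvalue(simulated_ok, observed))      # large: no evidence of a bias
print(mean_diff_pvalue(simulated_biased, observed))  # tiny: clear bias
```

A large p-value here only means that no bias was detected. It is not evidence that the two vectors are equal.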

Re: Statistical test for equality ?

This answer is based on fading memory of something learned in a course taken long ago and not subsequently used, so please excuse its vagueness. I would have to revise the method to be more specific.

If your model assumes that each pair of component vectors has been drawn from the same distribution, then I think you need to use the likelihood ratio test (see its Wikipedia page).

The test statistic produced by this method follows, asymptotically, a chi-squared distribution (with the appropriate number of degrees of freedom). I note that someone has already suggested a chi-square test, which is not really surprising because you imply Gaussian noise, which has a distribution that belongs to the exponential family.

However good you think Octave is, it's much, much better.
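One possible reading of this suggestion, sketched in Python under a strong assumption that is not in the thread: the noise standard deviation `sigma` is known in advance, for example from the measuring equipment. Under H0 each difference x1[i] - x2[i] is N(0, sigma^2), and the likelihood ratio statistic against an unrestricted mean per pair is the sum of squared standardised differences, which is chi-squared with n degrees of freedom. The tail probability uses the Wilson-Hilferty normal approximation, since the Python standard library has no chi-squared CDF. All names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def chi2_sf_approx(stat, dof):
    """Approximate upper-tail probability of a chi-squared variable (Wilson-Hilferty)."""
    z = ((stat / dof) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * dof))) / sqrt(2.0 / (9.0 * dof))
    return 1.0 - NormalDist().cdf(z)

def lr_equality_test(x1, x2, sigma):
    """Statistic and approximate p-value for H0: x1[i] - x2[i] ~ N(0, sigma^2)."""
    stat = sum(((a - b) / sigma) ** 2 for a, b in zip(x1, x2))
    return stat, chi2_sf_approx(stat, len(x1))

x2 = [float(i) for i in range(10)]
x1 = [v + 0.1 for v in x2]            # every pair differs by about one sigma

stat, p = lr_equality_test(x1, x2, sigma=0.1)
print(stat, p)
```

A small p-value says the differences are larger than noise of size `sigma` would explain. If `sigma` is not known independently, this test cannot be applied as written.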
Re: Statistical test for equality ?

Hi all,

thanks for all the replies.

It seems that statistical tests always revolve around distributions and parameters. They are very well suited to prove that two samples are different, but they only give hints as to whether samples are equal. For example, you check whether the distributions of the results of two coins are identical, you check whether two samples have the same mean, etc.

I put some independent data into a simulation model and calculate a result. My input data is not arbitrary; it has been observed, for example in a physical experiment. The results of the experiment have also been observed. The simulation model should then output the same result as the experiment for the same input data, otherwise the model is not correct. (Obviously in real-life experiments the measurements are never exact, so even a perfect simulation model will never exactly match the observed values.)

Assume a near-perfect model; then the model results (x1) will be very close to the observed results (x2). That case will lead to positive results in any previously mentioned statistical test.

A bit more formally:
- x1==x2 implies mean(x1)==mean(x2) and
- x1==x2 implies distribution(x1)==distribution(x2), whatever the distribution may be.

However, the reverse conclusion is not necessarily correct. If mean(x1)==mean(x2) then maybe x1==x2, or maybe x1 is completely unrelated to x2 except for equal means. The same applies to distributions: equal distribution parameters may mean that x1==x2 or not.

So statistical tests will show whether my model generates data that looks similar to the original because it has equal means and equal distribution, but the test will not show whether my model actually duplicates the observed reality. It is easy to construct data sets that have equal means and equal distributions and are even highly correlated, and still there can be non-trivial differences in the pairs, even if all tests show that the null hypothesis should be assumed to be true.

"Equality" in this context means that each related pair of observations should be equal. Or rather: equal enough, not too unequal, whatever "too unequal" may mean in this context. The usual statistical tests will only check whether two samples are from the same target population, never whether the same objects have been chosen for both samples.

2013/12/28 louis scott
> Nir replies with max(abs(x1-x2)) more formally, this the Kolmogorov-Smirnov distance.

Yes, some kind of distance might be the answer. I already commented on rms(), see mail from Dec 24th. The same applies to the Kolmogorov-Smirnov distance. I can easily calculate distances but the question remains: how big can the distance get before the samples are not equal any more?

Kolmogorov-Smirnov tests seem rather sensitive. I have so far not found any sample from reality that kolmogorov_smirnov_test(x,"norm") considered a normal distribution, no matter how "normal" the hist() and normplot() may look for the sample.

> If you are ok with assuming gaussian process and only care for the difference in means,

Gaussian is fine, difference in means is not helping...

THX
stn
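One way to turn "equal enough" into a test on the pairs themselves, sketched in Python. This is an illustrative formalisation, not something proposed in the thread: choose a tolerance `tol` and a required fraction `p0`, count the pairs that agree within `tol`, and use an exact one-sided binomial test of H0: P(|x1[i] - x2[i]| <= tol) <= p0. Both `tol` and `p0` are choices the analyst has to justify, and the pairs are assumed independent. A small p-value is then evidence for pairwise agreement, which reverses the usual burden of proof.

```python
from math import comb

def within_tolerance_pvalue(x1, x2, tol, p0):
    """Exact binomial p-value for H0: P(|x1[i] - x2[i]| <= tol) <= p0."""
    n = len(x1)
    k = sum(1 for a, b in zip(x1, x2) if abs(a - b) <= tol)
    # Probability of k or more pairs within tolerance if the true rate were p0.
    return sum(comb(n, j) * p0 ** j * (1.0 - p0) ** (n - j) for j in range(k, n + 1))

observed = [float(i) for i in range(20)]
model_good = [v + 0.01 for v in observed]   # every pair agrees closely
model_shuffled = model_good[::-1]           # same mean and distribution, pairs scrambled

print(within_tolerance_pvalue(model_good, observed, tol=0.1, p0=0.8))
print(within_tolerance_pvalue(model_shuffled, observed, tol=0.1, p0=0.8))
```

The reversed vector has exactly the same mean and distribution as the good one, yet no pair falls within the tolerance, so the test does not reject.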
Re: Statistical test for equality ?

> ... I could go on with possible solutions to your problem, but it would be easier if I knew what the actual problem was.

Hi João,

this question is about research in economics, more specifically about forecasts of dynamic systems.

One application could be the prediction of stock prices. This is a good example because there is lots of data available, and it is my favourite field for testing models, statistics etc. Stocks don't really help with small budgets, but they do come in handy when you want to avoid small sample sizes.

So if I want to predict a number of stock prices I would take data from the past, for example the price history of the stock itself, fundamental data of the company, possibly data from related markets, in short whatever I assume to be relevant. That I consider to be "fact": observed data that is taken as it is.

Then I would design some kind of simulation that takes the data from the past and somehow generates a stock price. Then I would probably wait until tomorrow and compare the generated price with tomorrow's price. If it matches for a large number of stocks then my model is OK, otherwise it is not.

The question is: how do I know that my model is correct? Comparing means and distributions does not really help; that only tells me whether my generated data is in the same area of the coordinate system as my observed data. With the usual statistical tests I can check whether two identical coins were used that turn up heads or tails with 50% probability.

But that is not the question here. The question is how to predict one specific flip of one specific coin and to check the prediction. At the end of the day I would want to know if I should buy a specific stock or not. That means a working simulation is necessary, and that simulation has to be checked before actually using it. Otherwise large sample but very small budget.

At first that question seemed simple to me, but now it seems like I am caught in some erroneous frequentist system...

THX
stn

Re: Statistical test for equality ?

I apologise if the following is known to you and does not apply to your model. However, you have possibly opened a different can of worms there.

Stock prices form time series, whose data cannot be modelled by processes that assume independence of successive variables, because each term in the series cannot be considered independent of the others. (Classic example from textbooks: in a country located in a temperate zone, the temperature on one day will almost certainly depend in some way on that of the day before because of the effects of seasons; in general, temperatures in summer are likely to be higher than those in winter.)

Techniques exist to produce equations that mimic the variation in some already recorded time series, so that the observed data and the simulated data (which requires a seed) give graphs that look similar through the period of observation in which the data were recorded. Such curves work well in interpolation but they are notoriously bad at extrapolation (that is, at predicting future trends). I realise that speculators nevertheless invest large resources in attempting to build models that will be used to predict future stock prices, and the rewards of a successful model are potentially enormous.

Simulated curves used in interpolation (and extrapolation at the user's risk) are often a composite of two techniques: one that smooths data (typically a moving average) to remove variation modelled as noise, and one that attempts to capture cyclic variation (typically autoregression), e.g. of the type described for temperature across seasons. I expect that you will know that combined models of this type are called ARIMA models (autoregressive integrated moving average).

As noted, I apologise if all of that is known to you. It might provide avenues for further research if it is not.
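The autoregressive part can be illustrated in its simplest form. A minimal sketch in Python, assuming an AR(1) model x[t] = c + phi * x[t-1] fitted by ordinary least squares; it leaves out the differencing and moving-average parts of a full ARIMA model. The function names are made up, and the series is synthetic and noise-free so that the fit recovers the generating values.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mp, mc = sum(prev) / n, sum(curr) / n
    phi = sum((p - mp) * (q - mc) for p, q in zip(prev, curr)) / sum((p - mp) ** 2 for p in prev)
    return mc - phi * mp, phi

def forecast_next(series, c, phi):
    """One-step-ahead forecast from the last observed value."""
    return c + phi * series[-1]

# Synthetic series generated with c = 2.0 and phi = 0.5, starting at 10.0.
series = [10.0]
for _ in range(9):
    series.append(2.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
print(c, phi)
print(forecast_next(series, c, phi))
```

With real prices the fit would include noise, and the warning above about extrapolation applies to any forecast made this way.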