# Statistical test for equality ?

19 messages

## Statistical test for equality ?

Hi,

is there a statistical test for whether two samples are equal? The obvious choices would be correlation or a paired t-test, but neither can tell whether the samples are equal or whether one sample is a multiple of the other.

THX
stn

_______________________________________________
Help-octave mailing list
[hidden email]
https://mailman.cae.wisc.edu/listinfo/help-octave

## Re: Statistical test for equality ?

On Mon, 16 Dec 2013 21:37:16 +0100 stn021 <[hidden email]> wrote:
> is there a statistical test if 2 samples are equal?
>
> The obvious choices would be correlation or paired t-test but both
> cannot tell if the samples are equal or if one sample is a multiple
> of the other

Do you know the probability density function applicable to the samples? Are the samples independent?

In general, what you are looking for is what fraction of the two PDFs overlap. If they overlap very little, they are not likely to be equal. If they overlap a lot, they are likely to be equal.

Gord
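The overlap idea can be sketched numerically for two normal densities (an illustrative Python/SciPy snippet; the function name and grid settings are my own, not from the thread):

```python
import numpy as np
from scipy.stats import norm

def pdf_overlap(mu1, sd1, mu2, sd2, n=100001):
    # Riemann-sum the pointwise minimum of the two densities
    lo = min(mu1 - 10 * sd1, mu2 - 10 * sd2)
    hi = max(mu1 + 10 * sd1, mu2 + 10 * sd2)
    x = np.linspace(lo, hi, n)
    dx = x[1] - x[0]
    return np.minimum(norm.pdf(x, mu1, sd1), norm.pdf(x, mu2, sd2)).sum() * dx

print(pdf_overlap(0, 1, 0, 1))   # near 1: densities coincide
print(pdf_overlap(0, 1, 8, 1))   # near 0: densities barely overlap
```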

## Re: Statistical test for equality ?

In reply to this post by stn021

Hello,

I think the distributions must be discrete, since the probability that two samples from a continuous distribution will be identical is essentially zero. The answer should just be the probability of achieving the given result in one trial, squared, if the two trials are independent. Or am I misunderstanding the question?

-Brian

## RE: Statistical test for equality ?

In reply to this post by stn021

The chi-square test will test for homogeneity in two populations, but that too just gives info on proportions. It is difficult to see what is meant by "equal samples". We test for equality of means or proportions. Surely the original query is not asking about sample size.

Sent from my Windows Phone

From: briankaz
Sent: 12/16/2013 5:35 PM
To: [hidden email]
Subject: Re: Statistical test for equality ?
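For reference, a chi-square test of homogeneity on two samples of counts might look like this (illustrative Python/SciPy; the counts are made up). As noted above, it only compares proportions, not individual pairs:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical heads/tails counts for two coins
table = np.array([[48, 52],   # coin 1
                  [51, 49]])  # coin 2

stat, p, dof, expected = chi2_contingency(table)
print(p)   # large p: no evidence the proportions differ
```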

## Re: Statistical test for equality ?

In reply to this post by stn021

stn021 wrote:
> Hi, is there a statistical test if 2 samples are equal? The obvious choices would be correlation or paired t-test but both cannot tell if the samples are equal or if one sample is a multiple of the other

This is a bit vague. Do you mean f.i. something similar to http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm or http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm i.e. what is assumed to be known, and what is inferred from statistics about your data? Once you can answer this question, Octave has all the required primitives to do the computation.

Regards
Pascal

## Re: Statistical test for equality ?

2013/12/17 CdeMills
> ... This is a bit vague...

Hi,

yes, the question is a bit vague. At first sight it also appears trivial. I would like to test whether x1 = x2. That means I have two samples, i.e. two vectors x1 and x2, and I want to know whether x1(1) = x2(1), x1(2) = x2(2), ..., x1(end) = x2(end). Sounds easy; in Octave I simply write x1 == x2.

The trouble is that in reality x1 is not _exactly_ equal to x2. In reality it is more like x1 = x2 + random noise. So the question could be asked like this: how much noise is allowed before the null hypothesis x1 == x2 should be replaced by the alternative hypothesis?

Correlation answers "how precisely do my samples match the equation x1 = b*x2 + a". I would like to know "how precisely do my samples match the equation x1 = 1*x2 + 0". Correlation would give the same answer for x1 = x2 and x1 = 5*x2, so it cannot tell the difference between two equal samples and two highly correlated ones.

The whole question revolves around simulation models. I would like to have some meaningful answer as to whether my model works. And to me it seemed obvious to ask whether measured values are equal to simulated values, for the same set of independent variables. So x1 are measured values, x2 are results of the simulation, and the simulation works if the simulated values are statistically significantly equal to the measured values.

> Do you mean f.i. something similar to
> http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm or
> http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm

Both articles are about equal or non-equal means. My question is _not_ whether mean(x1) == mean(x2). That question is also easily answered, for example with a t-test. Obviously if x1 == x2 then mean(x1) == mean(x2), but the inverse conclusion does not hold: if mean(x1) == mean(x2) then maybe x1 == x2, or maybe not. Testing for equal means does not necessarily imply equality of the samples, only equality of the means.

THX
stn
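A small numeric illustration of this point (Python/NumPy sketch with made-up data): correlation cannot separate x1 = x2 from x1 = 5*x2, but estimating b and a in x1 = b*x2 + a and checking b ≈ 1, a ≈ 0 can:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x2 = rng.normal(50.0, 10.0, 200)            # "measured" values
x1_equal = x2 + rng.normal(0.0, 1.0, 200)   # model output = measurement + noise
x1_scaled = 5.0 * x2                        # perfectly correlated, but not equal

# Correlation is near 1 in BOTH cases ...
r_equal = np.corrcoef(x1_equal, x2)[0, 1]
r_scaled = np.corrcoef(x1_scaled, x2)[0, 1]

# ... but the fitted slope and intercept expose the difference
fit = stats.linregress(x2, x1_scaled)
print(r_equal, r_scaled)
print(fit.slope, fit.intercept)   # 5 and 0, not 1 and 0
```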

## Re: Statistical test for equality ?

You could try

max(abs(x2 - x1))
rms(x2 - x1)   [the rms function is in the nan package, or you can write your own]

or other such functions, depending on what form you think the noise takes.
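In Python/NumPy terms (the `rms` helper below is a stand-in for the one in the Octave nan package; the sample vectors are made up):

```python
import numpy as np

def rms(d):
    # root mean square: sqrt(mean(d.^2)) in Octave notation
    d = np.asarray(d, dtype=float)
    return np.sqrt(np.mean(d ** 2))

x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([1.1, 1.9, 3.2])
print(np.max(np.abs(x2 - x1)))   # worst-case discrepancy: 0.2
print(rms(x2 - x1))              # average discrepancy: ~0.141
```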

## Re: Statistical test for equality ?

2013/12/23 Nir Krakauer
> You could try
> max(abs(x2 - x1))
> rms(x2 - x1)   [rms function is in nan package, or you can write your own]

I assume that rms() = sqrt(mean((x1-x2).^2)), the root of the mean squares? So yes, this function calculates how far the two vectors are apart and is indeed a measure for my question. It is unfortunately not a test in the statistical sense. For that there would have to be some kind of p-value which would indicate whether or not the null hypothesis should be assumed to be true, similar to for example t_test().

> or other such functions, depending on what form you think the noise takes

The noise follows a normal distribution, nothing special here. If x1 == x2 then mean(noise) is near zero; std(noise) could have any value.
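For the offset part of the question there is a classical p-value: a paired t-test on the differences tests H0: mean(x1 - x2) = 0. As discussed above, this only checks for a systematic offset in the noise, not full equality (Python/SciPy sketch with simulated data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x2 = rng.normal(50.0, 10.0, 100)
x1 = x2 + rng.normal(0.0, 2.0, 100)   # equal up to zero-mean noise

t, p = stats.ttest_rel(x1, x2)        # same as a t-test of x1 - x2 against 0
print(t, p)
```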

## Re: Statistical test for equality ?

This answer is based on fading memory of something learned in a course taken long ago and not subsequently used, so please excuse its vagueness. I would have to revise the method to be more specific.

If your model assumes that each pair of component vectors has been drawn from the same distribution, then I think you need to use the likelihood ratio test (see the Wikipedia page). The test statistic raised by this method follows a chi-squared distribution (with the appropriate number of degrees of freedom). I note that someone has already suggested a chi-square test (which is sort of not surprising, because you imply Gaussian noise, which has a distribution that belongs to the exponential family).

However good you think Octave is, it's much, much better.
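A minimal sketch of such a likelihood ratio test (Python/SciPy; for simplicity the common variance is plugged in rather than profiled, which is an assumption of this sketch, as are the data): H0 fits one shared mean, H1 fits separate means, and 2*(logL1 - logL0) is referred to a chi-square distribution with 1 degree of freedom:

```python
import numpy as np
from scipy import stats

def lr_test_same_mean(x1, x2):
    # LRT of H0: both samples share one mean, vs H1: separate means.
    pooled = np.concatenate([x1, x2])
    sd = pooled.std(ddof=0)  # plug-in common sigma (a simplification)
    ll0 = stats.norm.logpdf(pooled, pooled.mean(), sd).sum()
    ll1 = (stats.norm.logpdf(x1, x1.mean(), sd).sum()
           + stats.norm.logpdf(x2, x2.mean(), sd).sum())
    lam = 2.0 * (ll1 - ll0)           # ~ chi2 with 1 df under H0
    return lam, stats.chi2.sf(lam, df=1)

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 500)
b = rng.normal(0.0, 1.0, 500)
lam, p = lr_test_same_mean(a, b)
print(lam, p)
```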

## Re: Statistical test for equality ?

Hi,

the Wikipedia article about the likelihood ratio is a bit too much "mathematese" for me to be sure. Sounds interesting, though. Could someone give a short English explanation?

The example in the Wikipedia article seems to me like the answer to a different question than mine. It is about testing whether two coins have the same probability of coming up heads. That would be chi-square or maybe t_test. My question is more about testing whether the first toss of coin 1 is identical to the first toss of coin 2, same for the second toss, etc. I do not want to know if the 2 coins both come up heads 50% of the time. Instead I want to know if they both come up with the same side each time: if coin 1 shows heads then coin 2 should show heads too, same for tails.

The question is _not_: are the coins alike? Instead it is: is it the same coin? I do not think that chi-square will answer this. (Correct me if I'm wrong.)

Coins are a bad example though. It is more about continuous variables, for example any (rational) number between 0 and 100. The distribution can be assumed to be normal.

THX
stn

2013/12/24 pathematica
> If your model assumes that each pair of component vectors have been drawn from the same distribution, then I think you need to use the likelihood ratio test. [...]
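The distinction between "alike" and "the same coin" can be made concrete (Python/NumPy sketch, simulated tosses): two independent fair coins pass a test that their head/tail proportions are equal, yet their per-toss agreement is only about 50%; the "same coin" agrees with itself every time:

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(3)
n = 1000
coin1 = rng.integers(0, 2, n)     # independent fair coin
coin2 = rng.integers(0, 2, n)     # another, unrelated fair coin
same_coin = coin1.copy()          # literally the same tosses

def heads_table(a, b):
    # 2x2 table of heads/tails counts for two coins
    return np.array([[a.sum(), len(a) - a.sum()],
                     [b.sum(), len(b) - b.sum()]])

# Marginal test: the unrelated coins look "equal" ...
_, p_indep, _, _ = chi2_contingency(heads_table(coin1, coin2))

# ... but per-toss agreement separates the two situations
agree_indep = np.mean(coin1 == coin2)     # about 0.5
agree_same = np.mean(coin1 == same_coin)  # exactly 1.0
print(p_indep, agree_indep, agree_same)
```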

## Re: Statistical test for equality ?

Again, my attempt to explain will be hampered by fading memory and, possibly, imperfect understanding of the subject from my relatively brief exposure to it. However, here is an attempt. Please excuse my summarising things that everyone will know; I do this in an attempt to identify the concept behind the likelihood test, to explore whether it might do what you want it to do. I'm afraid it's not going to be much of a recipe for doing the test. To do that, it will be necessary to identify the number of degrees of freedom for the particular test you will undertake, and to calculate the likelihoods so that the statistic might be compared against a suitable member of the chi-square distribution.

As I remember it, the likelihood ratio test is based in Bayesian rather than frequentist statistics. Roughly speaking, frequentist statistics is concerned with identifying a notional "average" representative of something that might be measured, together with a range of values described as a "confidence interval" (typically derived from the standard error of the mean, which is often taken as the exemplar of averageness), which attempts to quantify the probability that the true mean (which is not known) lies somewhere in the region of the estimated mean. The mathematical models assume that errors in measurement follow some probability distribution, typically but not necessarily Gaussian. The parameters of the model are viewed as fixed and attempts are made to find them (eg mean and standard deviation). Experimental measurements are viewed as variables, and the pattern of distribution of data points is predicted using the models with the estimated parameters, with decisions made about the probability of observing particular values given the parameters. Examples of the flaws inherent in frequentist methods are highlighted in such jokes as "The average human being has one boob and one testicle".

Bayesian statistics treats parameters as variables rather than fixed. In contrast, observed measurements are treated as fixed (constants after measurement). Once again mathematical models are required to draw inferences about the probability of observing something. The probability distributions that describe the probability of observing some measurement, given some particular values of parameters, are similar (eg mean and variance of a normal distribution). However, extra probability distributions are required that describe the distribution of the parameters that have been used in the probability density function that models the errors in measurements. These are the prior and the posterior distributions. The prior distribution summarises belief "so far" about the value of the parameters before some particular set of measurements is taken. This is combined with the experimental data (which are treated as fixed, but which are modelled as though they have been sampled from some particular probability distribution) to derive the posterior distribution. The posterior distribution summarises updated belief about the value of the parameters in the distribution that models error in measurements, given the set of data that have been sampled.

The coin-tossing example simplifies discussion because it is typically modelled by a Bernoulli distribution, which has only one parameter, p (the probability of observing one of two possible outcomes, eg heads). The prior (and the posterior) distribution that describes belief in the value of p will appear odd to frequentists because p can only take values between 0 and 1, so it will be defined only on this interval, and the area beneath the kernel of the distribution will be normalised to 1 by a suitable normalising factor. Given the nature of the thing being modelled, a distribution often used as a prior for a Bernoulli trial is the Beta distribution (eg the Wikipedia page on the Beta distribution). It is defined on [0,1]. Like other priors/posteriors for Bayesian modelling, it might be multimodal given its parameters (the Beta distribution has two parameters).

In the likelihood ratio test, the two sets of data would be used to calculate the likelihood for each, given the particular distribution with its particular parameter that has been used to model the process (here, Bernoulli(0.5) would seem sensible). The posterior would provide a distribution of the probability that p takes some particular value given the data (note this is a probability of a particular probability, with the words used in a subtly different way). For a fair coin, it would be expected that the posterior would be a Beta distribution with a single mode somewhere near 0.5.

The nearest thing to a confidence interval for the value of p (the one which is the parameter for the Bernoulli distribution) would be a credible interval. As posteriors may be multimodal, it is often not possible merely to bracket some mode to find an interval whose area is some proportion of one. It is often also inappropriate to form a symmetrical interval about a mode, as you might imagine for a curve whose mode is not located at 0.5. Instead, a decision must be made about the way the credible interval is found. While others exist, a method that has merit is the "highest posterior density" (often abbreviated to HPD) credible interval (or credible region, which might comprise the union of two separate intervals for some multimodal distribution). The interval(s) are bounded by the values of p at which the posterior density takes the same value (ie the bounds are the projections onto the x axis of the intersections of the graph with a horizontal line drawn across it, placed so that the area between the bounds gives the desired credible level; note that this might define more than one separate interval for some multimodal graph, depending on the height of the horizontal line).

Anyway, in the likelihood ratio test (in the case of tossing two separate coins), you would be testing the hypothesis that the respective values of p for each of the coins are the same (that is, you are seeing whether the two coins are "equally fair" or "equally biased"). This is another way of quantifying the probability that the two sets of data have been sampled from the same distribution (or, more strictly, the same distribution with the same parameter). In the last sentence, the word "distribution" refers to the one modelling the error in the measurement of the data, rather than the ones (ie the prior and the posterior) modelling beliefs about the values of the parameters for the sampling distribution before and after the data have been sampled.

However good you think Octave is, it's much, much better.
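As a concrete footnote to the Beta/Bernoulli machinery described above, the conjugate update is one line (Python/SciPy sketch; the counts and the uniform Beta(1,1) prior are illustrative, and the central interval shown is a simple stand-in for a true HPD interval):

```python
from scipy import stats

a, b = 1.0, 1.0                  # Beta(1,1) prior: uniform belief about p
heads, tails = 60, 40            # hypothetical tosses

# Conjugacy: posterior is Beta(a + heads, b + tails)
posterior = stats.beta(a + heads, b + tails)

print(posterior.mean())          # posterior mean of p: 61/102, about 0.598
lo, hi = posterior.interval(0.95)
print(lo, hi)                    # central 95% credible interval for p
```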

## Re: Statistical test for equality ?

Pathematica, that was an enjoyable read; I shall use the "one boob and one testicle" from now on.

Nir replies with max(abs(x1-x2)); more formally, this is the Kolmogorov-Smirnov distance. If you are OK with assuming a Gaussian process and only care about the difference in means, a google of "wald test octave" yields some good answers, in particular Octave code by Michael Creel. This is for general linear restrictions, not just equality. For a host of straightforward tests, including equal means and variances as well as the KS distance, see: http://www.gnu.org/software/octave/doc/interpreter/Tests.html

On Sat, Dec 28, 2013 at 1:08 AM, pathematica wrote:
> Again, my attempt to explain will be hampered by fading memory and, possibly, imperfect understanding of the subject from my relatively brief exposure to it. [...]
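One caveat worth making explicit: the two-sample KS test in that list compares distributions, so it is blind to pairing, which is exactly the issue raised earlier in the thread. A Python/SciPy sketch with simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x1 = rng.normal(0.0, 1.0, 300)
x2 = rng.permutation(x1)     # same values, pairing completely destroyed

d, p = stats.ks_2samp(x1, x2)
print(d, p)                  # d = 0: the empirical distributions are identical
```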

## Re: Statistical test for equality ?

Hi all,

thanks for all the replies. It seems that statistical tests always revolve around distributions and parameters. They are very well suited to prove that two samples are different, but they only give hints as to whether samples are equal. For example, you check whether the distributions of the results of two coins are identical, you check whether two samples have the same mean, etc.

I put some independent data into a simulation model and calculate a result. My input data is not arbitrary; it has been observed, for example in a physical experiment. The results of the experiment have also been observed. The simulation model should then output the same result as the experiment for the same input data, otherwise the model is not correct. (Obviously in real-life experiments the measurements are never exact, so even a perfect simulation model will never exactly match the observed values.)

Assume a near-perfect model; then the model results (x1) will be very close to the observed results (x2). That case will lead to positive results in any previously mentioned statistical test. A bit more formally:
- x1 == x2 implies mean(x1) == mean(x2), and
- x1 == x2 implies distribution(x1) == distribution(x2), whatever the distribution may be.
However, the reverse conclusion is not necessarily correct. If mean(x1) == mean(x2) then maybe x1 == x2, or maybe x1 is completely unrelated to x2 except for equal means. The same applies to distributions: equal distribution parameters may mean that x1 == x2, or not.

So statistical tests will show whether my model generates data that looks similar to the original because it has equal means and an equal distribution, but they will not show whether my model actually duplicates the observed reality. It is easy to construct data sets that have equal means and equal distributions and are even highly correlated, and still there can be non-trivial differences in the pairs, even if all tests show that the null hypothesis should be assumed to be true.

"Equality" in this context means that each related pair of observations should be equal. Or rather: equal enough, not too unequal, whatever "too unequal" may mean in this context. The usual statistical tests will only check whether two samples are from the same target population, never whether the same objects have been chosen for both samples.

2013/12/28 louis scott
> Nir replies with max(abs(x1-x2)); more formally, this is the Kolmogorov-Smirnov distance.

Yes, some kind of distance might be the answer. I already commented on rms(); see my mail from Dec 24th. The same applies to the Kolmogorov-Smirnov distance. I can easily calculate distances, but the question remains: how big can the distance get before the samples are not equal any more?

Kolmogorov-Smirnov tests seem rather sensitive. I have so far not found any sample from reality that kolmogorov_smirnov_test(x, "norm") considered normally distributed, no matter how "normal" the hist() and normplot() may look for the sample.

> If you are ok with assuming gaussian process and only care for the difference in means,

Gaussian is fine; difference in means is not helping...

THX
stn

## Re: Statistical test for equality ?


## Re: Statistical test for equality ?

> ... I could go on with possible solutions to your problem, but it would be easier if I knew what the actual problem was.

Hi João,

this question is about research in economics, more specifically about forecasts of dynamic systems. One application could be the prediction of stock prices. This is a good example because there is lots of data available, and it is my favourite field for testing models, statistics etc. Stocks don't really help with small budgets, but they do come in handy when you want to avoid small sample sizes.

So if I want to predict a number of stock prices, I would take data from the past, for example the price history of the stock itself, fundamental data of the company, possibly data from related markets, in short whatever I assume to be relevant. That I consider to be "fact", observed data that is taken as it is. Then I would design some kind of simulation that takes the data from the past and somehow generates a stock price. Then I would probably wait until tomorrow and compare the generated price with tomorrow's price. If it matches for a large number of stocks then my model is OK, otherwise it is not.

The question is: how do I know that my model is correct? Comparing means and distributions does not really help; that only tells me whether my generated data is in the same area of the coordinate system as my observed data. With the usual statistical tests I can check whether two identical coins were used that turn up heads or tails with 50% probability. But that is not the question here. The question is how to predict one specific flip of one specific coin and to check the prediction. At the end of the day I would want to know whether I should buy a specific stock or not. That means a working simulation is necessary, and that simulation has to be checked before actually using it. Otherwise: large sample, but very small budget.

At first the question seemed simple to me, but now it seems like I am caught in some erroneous frequentist system...

THX
stn

## Re: Statistical test for equality ?

I apologise if the following is known to you and does not apply to your model. However, you have possibly opened a different can of worms there.

Stock prices comprise time series, whose data cannot be modelled by processes that assume independence of successive variables, because each term in the series cannot be considered independent of the others (classic example from textbooks: the temperature on one day will almost certainly depend in some way on that of the day before, in a country located in a temperate zone, because of the effects of seasons; in general, temperatures in summer are likely to be higher than those in winter).

Techniques exist to raise equations that mimic the variation in some already recorded time series, so that the observed data and the simulated data (which require a seed) produce graphs that look similar through the period in which the data were recorded. Such curves work well in interpolation, but they are notoriously bad at extrapolation (that is, at predicting future trends). I realise that speculators nevertheless invest large resources in attempting to raise models that will be used to predict future stock prices, and the rewards of a successful model are potentially enormous.

Simulated curves used in interpolation (and extrapolation at the user's risk) are often a composite of two techniques: one that smooths data (typically a moving average) to remove variation modelled as noise; and one that attempts to capture cyclic variation (typically autoregression), eg of the type described for temperature across seasons. I expect that you will know that combined models of this type are called ARIMA ones (autoregressive integrated moving average).

As noted, I apologise if all of that is known to you. It might provide avenues for further research if it is not.

However good you think Octave is, it's much, much better.
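As a toy illustration of the autoregressive half of that story (Python/NumPy sketch; the coefficient and series length are made up): fitting an AR(1) recovers the dynamics well in-sample, while the k-step-ahead forecast decays geometrically toward the process mean, which is one way of seeing why pure extrapolation quickly loses information:

```python
import numpy as np

rng = np.random.default_rng(6)

# Simulate an AR(1) series: y[t] = phi * y[t-1] + noise
phi_true, n = 0.8, 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.normal(0.0, 1.0)

# Least-squares (conditional ML) estimate of phi from lagged values
phi_hat = np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1])
print(phi_hat)   # close to 0.8

# The k-step-ahead forecast shrinks toward 0 (the process mean) as phi^k
forecast = phi_hat ** 10 * y[-1]
print(forecast)
```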