
From Wikipedia


In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. The lower the p-value, the less likely the result is if the null hypothesis is true, and consequently the more "significant" the result is, in the sense of statistical significance. One often rejects the null hypothesis when the p-value is less than 0.05 or 0.01, corresponding respectively to a 5% or 1% chance of rejecting the null hypothesis when it is true (Type I error).

A closely related concept is the E-value, which is the average number of times in multiple testing that one expects to obtain a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. The E-value is the product of the number of tests and the p-value.
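A toy illustration of this relationship (the numbers are made up for illustration, not taken from the text):

```python
# E-value = (number of tests) x (per-test p-value): the expected number of
# test statistics at least this extreme across all tests, under the null.
m = 100          # illustrative: 100 independent tests
p = 0.01         # illustrative per-test p-value
e_value = m * p
print(e_value)   # one such extreme result expected by chance alone
```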

Coin flipping example

For example, an experiment is performed to determine whether a coin flip is fair (50% chance, each, of landing heads or tails) or unfairly biased (> 50% chance of one of the outcomes).

Suppose that the experimental results show the coin turning up heads 14 times out of 20 total flips. The p-value of this result would be the chance of a fair coin landing on heads at least 14 times out of 20 flips. The probability that 20 flips of a fair coin would result in 14 or more heads can be computed from binomial coefficients as

\begin{align} & \operatorname{Prob}(14\text{ heads}) + \operatorname{Prob}(15\text{ heads}) + \cdots + \operatorname{Prob}(20\text{ heads}) \\ & = \frac{1}{2^{20}} \left[ \binom{20}{14} + \binom{20}{15} + \cdots + \binom{20}{20} \right] = \frac{60,\!460}{1,\!048,\!576} \approx 0.058 \end{align}

This probability is the (one-sided) p-value.
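The binomial sum above can be checked directly, for instance with Python's `math.comb`:

```python
from math import comb

# One-sided p-value: probability that a fair coin gives >= 14 heads
# in 20 flips, summing the binomial terms for k = 14..20.
n, k_obs = 20, 14
numerator = sum(comb(n, k) for k in range(k_obs, n + 1))  # 60,460
p_one_sided = numerator / 2**n                            # 60,460 / 1,048,576
print(round(p_one_sided, 4))  # 0.0577
```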

Because there is no way to know what percentage of coins in the world are unfair, the p-value does not tell us the probability that the coin is unfair. It measures the chance that a fair coin would give such a result.


Traditionally, one rejects the null hypothesis if the p-value is smaller than or equal to the significance level, often represented by the Greek letter α (alpha). If the level is 0.05, then results that are only 5% likely or less, given that the null hypothesis is true, are deemed extraordinary.

When we ask whether a given coin is fair, often we are interested in the deviation of our result from the equality of numbers of heads and tails. In such a case, the deviation can be in either direction, favoring either heads or tails. Thus, in this example of 14 heads and 6 tails, we may want to calculate the probability of getting a result deviating by at least 4 from parity (two-sided test). This is the probability of getting at least 14 heads or at least 14 tails. As the binomial distribution is symmetrical for a fair coin, the two-sided p-value is simply twice the above calculated single-sided p-value; i.e., the two-sided p-value is 0.115.
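By the symmetry argument above, the two-sided value is simply twice the one-sided sum; a quick check:

```python
from math import comb

# Two-sided p-value for 14 heads in 20 flips of a fair coin:
# P(>= 14 heads) + P(>= 14 tails). By symmetry, twice the one-sided value.
n, k_obs = 20, 14
p_one_sided = sum(comb(n, k) for k in range(k_obs, n + 1)) / 2**n
p_two_sided = 2 * p_one_sided
print(round(p_two_sided, 3))  # 0.115
```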

In the above example we thus have:

  • null hypothesis (H0): fair coin;
  • observation O: 14 heads out of 20 flips; and
  • p-value of observation O given H0 = Prob(≥ 14 heads or ≥ 14 tails) = 0.115.

The calculated p-value exceeds 0.05, so the observation is consistent with the null hypothesis — that the observed result of 14 heads out of 20 flips can be ascribed to chance alone — as it falls within the range of what would happen 95% of the time were this in fact the case. In our example, we fail to reject the null hypothesis at the 5% level. Although the coin did not fall evenly, the deviation from expected outcome is small enough to be reported as being "not statistically significant at the 5% level".

However, had a single extra head been obtained, the resulting p-value (two-tailed) would be 0.0414 (4.14%). This time the null hypothesis – that the observed result of 15 heads out of 20 flips can be ascribed to chance alone – is rejected when using a 5% cut-off. Such a finding would be described as being "statistically significant at the 5% level".
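The same calculation with one extra head confirms the figure:

```python
from math import comb

# Two-sided p-value for 15 heads in 20 flips of a fair coin.
n, k_obs = 20, 15
p_two_sided = 2 * sum(comb(n, k) for k in range(k_obs, n + 1)) / 2**n
print(round(p_two_sided, 4))  # 0.0414 -- below the 0.05 cut-off
```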

Critics of p-values point out that the criterion used to decide "statistical significance" is based on the somewhat arbitrary choice of level (often set at 0.05). Furthermore, it is necessary to use a reasonable null hypothesis to assess the result fairly, but the choice of a null hypothesis often entails assumptions.

To understand both the original purpose of the p-value p and the reasons p is so often misinterpreted, it helps to know that p constitutes the main result of statistical significance testing (not to be confused with hypothesis testing), popularized by Ronald A. Fisher. Fisher promoted this testing as a method of statistical inference. To call this testing inferential is misleading, however, since inference makes statements about general hypotheses based on observed data, such as the post-experimental probability a hypothesis is true. As explained above, p is instead a statement about data assuming the null hypothesis; consequently, indiscriminately considering p as an inferential result can lead to confusion, including many of the misinterpretations noted in the next section.

On the other hand, Bayesian inference, the main alternative to significance testing, generates probabilistic statements about hypotheses based on data (and a priori estimates), and therefore truly constitutes inference. Bayesian methods can, for instance, calculate the probability that the null hypothesis H0 above is true assuming an a priori estimate of the probability that a coin is unfair. Since a priori we would be quite surprised that a coin could consistently give 75% heads, a Bayesian analysis would find the null hypothesis (that the coin is fair) quite probable even if a test gave 15 heads out of 20 tries (which as we saw above is considered a "significant" result at the 5% level according to its p-value).
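A minimal sketch of such a Bayesian calculation; the prior (99% of coins assumed fair) and the specific alternative (an unfair coin that gives 75% heads) are illustrative assumptions, not values from the text:

```python
from math import comb

# Posterior probability that the coin is fair after seeing 15 heads in 20
# flips, via Bayes' rule. Prior and bias below are illustrative assumptions.
n, k = 20, 15
prior_fair = 0.99                                 # assumed a priori belief

lik_fair = comb(n, k) * 0.5**n                    # P(15 heads | fair coin)
lik_biased = comb(n, k) * 0.75**k * 0.25**(n - k) # P(15 heads | 75%-heads coin)

posterior_fair = (prior_fair * lik_fair) / (
    prior_fair * lik_fair + (1 - prior_fair) * lik_biased
)
print(round(posterior_fair, 2))  # ~0.88: the fair-coin hypothesis stays probable
```

Even though the p-value flags 15 heads as "significant" at the 5% level, the strong prior keeps the posterior probability of fairness high, which is the point made above.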

Strictly speaking, then, p is a statement about data rather than about any hypothesis, and hence it is not inferential. This raises the question, though, of how science has been able to advance using significance testing. The reason is that, in many situations, p approximates some useful post-experimental probabilities about hypotheses, such as the post-experimental probability of the null hypothesis. When this approximation holds, it can help a researcher judge the post-experimental plausibility of a hypothesis.

Even so, this approximation does not eliminate th

From Yahoo Answers

Question:How do we deduce the significance of the difference between two sets of data using calculated values for t?

Answers:Look up the calculated value of t in a table of the t-distribution at the correct degrees of freedom and compare it with the table value at p = 0.05 (the 5% level). The difference is significant if the calculated value lies to the right of (i.e., exceeds) the 5% table value.
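The procedure described in this answer can be sketched as follows; the two data sets and the tabulated critical value (2.101 for 18 degrees of freedom, two-tailed, p = 0.05) are illustrative:

```python
import statistics as st

# Two-sample pooled t statistic, compared against the tabled critical value
# at p = 0.05. Data and critical value are illustrative examples.
a = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.1, 5.0, 4.9, 5.2]
b = [5.6, 5.4, 5.7, 5.5, 5.8, 5.3, 5.6, 5.5, 5.4, 5.7]

na, nb = len(a), len(b)
pooled_var = ((na - 1) * st.variance(a) + (nb - 1) * st.variance(b)) / (na + nb - 2)
t = (st.mean(a) - st.mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

df = na + nb - 2   # 18 degrees of freedom
t_crit = 2.101     # table value for p = 0.05, two-tailed, df = 18
print(abs(t) > t_crit)  # True -> difference significant at the 5% level
```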

Question:If they say 'ESTIMATE the value of 163.8/0.249 to 1 significant figure', do I round each number to 1 sf first, or calculate the answer and then round it off? What if they say 'estimate' but don't say 'estimate the value'? There is a big difference then.

Answers:1 significant figure means 1 digit number except from zero you can have as many 0's as u like but must be at the end or beginning. the answer is 200/ 0.200 hope that helped?

Question:So I became interested in these Ipod BAC calculators. I blew a few bux trying some of them and some are too simple but others are spot on (as much as they can be.....nice graphs and lots of calculations and it's still never a true representation) yet they only calculate the present time. It got me thinking. Is there a big difference between consuming say......3 drinks at once and 3 drinks over 3 hours? If your body can flush out a half pint (8 ounces of beer) in 1 hour does it matter one way or another? Drinking and driving is bad period. I'm just asking an honest question so don't flame me.

Answers:Of course there's a difference between drinking 3 drinks over 3 hours and drinking them all at once. You metabolize alcohol at the same rate no matter how much of it you drink, so if you have 3 drinks all at once, all the alcohol enters your bloodstream at one time so you'll probably be legally drunk very soon. If you have one drink each hour, much of the first drink will be gone by the time you have the second one. Not all of it, but some of it. One drink an hour might or might not make you legally drunk within 3 hours.

Question:I added 3 masses : 60.09g + 93.86g + 96.11g = 250.06g Then I averaged the mass: 250.06 / 3 = 83.35333333 on my calculator. Since 250.06 only has 5 significant figures (I think, correct me please), what would I round 83.35333333 to?

Answers:Your 3 is an exact number: you are averaging 3 data points, and exact numbers are not considered for sig figs. Since 250.06 has 5 sig figs, round to 5 sig figs: 83.353.
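A quick check of this rounding, using Python's `g` format to keep 5 significant figures:

```python
# Average three masses and round to 5 significant figures. The divisor 3 is
# an exact count, so only the sum (250.06 g, 5 sig figs) limits precision.
total = 60.09 + 93.86 + 96.11   # 250.06 g
average = total / 3             # 83.35333...
print(f"{average:.5g} g")       # 83.353 g
```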

From Youtube

Level of Significance in Hypothesis Testing :This video describes the use of level of significance in determining when to reject the null hypothesis

02 Significant Figures in Calculations :Bryant discusses how to use significant figures in calculations. Need to review significant figures? Here: www.youtube.com