formula for cumulative frequency distribution
From Wikipedia
In probability theory and statistics, the cumulative distribution function (CDF), or just distribution function, describes the probability that a real-valued random variable X with a given probability distribution will be found at a value less than or equal to x. Intuitively, it is the "area so far" function of the probability distribution. Cumulative distribution functions are also used to specify the distribution of multivariate random variables.
Definition
For every real number x, the CDF of a real-valued random variable X is given by
 x \mapsto F_X(x) = \operatorname{P}(X\leq x),
where the right-hand side represents the probability that the random variable X takes on a value less than or equal to x. The probability that X lies in the half-open interval (a, b] is therefore F_X(b) - F_X(a) if a < b.
When treating several random variables X, Y, ..., the corresponding letters are used as subscripts; when treating only one, the subscript is omitted. It is conventional to use a capital F for a cumulative distribution function, in contrast to the lowercase f used for probability density functions and probability mass functions. This applies when discussing general distributions: some specific distributions have their own conventional notation, for example the normal distribution.
The CDF of X can be defined in terms of the probability density function f as follows:
 F(x) = \int_{-\infty}^x f(t)\,dt.
Note that in the definition above, the "less than or equal to" sign, "≤", is a convention, not a universally used one (e.g. Hungarian literature uses "<"), but it is important for discrete distributions. The proper use of tables of the binomial and Poisson distributions depends upon this convention. Moreover, important formulas like Lévy's inversion formula for the characteristic function also rely on the "less than or equal to" formulation.
Properties
Every cumulative distribution function F is (not necessarily strictly) monotone non-decreasing (see monotone increasing) and right-continuous. Furthermore, we have
 \lim_{x\to -\infty}F(x)=0, \quad \lim_{x\to +\infty}F(x)=1.
Every function with these four properties is a CDF. The properties imply that all CDFs are càdlàg functions.
If X is a discrete random variable, then it attains values x_{1}, x_{2}, ... with probability p_{i} = P(x_{i}), and the CDF of X will be discontinuous at the points x_{i} and constant in between:
 F(x) = \operatorname{P}(X\leq x) = \sum_{x_i \leq x} \operatorname{P}(X = x_i) = \sum_{x_i \leq x} p(x_i).
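The sum above can be sketched in a few lines of Python. The fair six-sided die and the function name `discrete_cdf` are illustrative assumptions, not part of the source; exact fractions are used so the jumps of 1/6 are visible without rounding.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, P(X = k) = 1/6 for k = 1..6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def discrete_cdf(x, pmf):
    """Return F(x) = P(X <= x) by summing the point masses at or below x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

# F is constant between the support points and jumps at each of them.
for x in (0, 1, 3.5, 6):
    print(x, discrete_cdf(x, pmf))
```

Evaluating at 3.5 gives the same value as at 3, which is the "constant in between" behaviour described above.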
If the CDF F of X is continuous, then X is a continuous random variable; if furthermore F is absolutely continuous, then there exists a Lebesgue-integrable function f(x) such that
 F(b) - F(a) = \operatorname{P}(a\leq X\leq b) = \int_a^b f(x)\,dx
for all real numbers a and b. (The first of the two equalities displayed above would not be correct in general if we had not said that the distribution is continuous. Continuity of the distribution implies that P(X = a) = P(X = b) = 0, so the difference between "<" and "≤" ceases to be important in this context.) The function f is equal to the derivative of F almost everywhere, and it is called the probability density function of the distribution of X.
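As a rough numerical check of this identity, the sketch below compares F(b) - F(a) with a numerical integral of the density. It assumes a standard normal distribution, whose CDF can be written with `math.erf`; the helper names and the midpoint-rule integrator are illustrative choices, not something the source prescribes.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def normal_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the composite midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

a, b = -1.0, 2.0
by_cdf = normal_cdf(b) - normal_cdf(a)
by_integral = integrate(normal_pdf, a, b)
print(by_cdf, by_integral)  # the two values should agree to many decimal places
```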
Point probability
The "point probability" that X is exactly b can be found as
 \operatorname{P}(X=b) = F(b) - \lim_{x \to b^{-}} F(x).
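For a discrete variable with finitely many support points, the left limit of F at b is just the total mass strictly below b, so the jump can be computed directly. The sketch below assumes X is the number of heads in two fair coin tosses; the distribution and the function names are illustrative only.

```python
from fractions import Fraction

# Assumed example: number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def cdf(x):
    """F(x) = P(X <= x)."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

def cdf_left_limit(b):
    """Limit of F(x) as x approaches b from below, i.e. P(X < b)."""
    return sum((p for v, p in pmf.items() if v < b), Fraction(0))

def point_probability(b):
    """P(X = b), recovered as the size of the jump of F at b."""
    return cdf(b) - cdf_left_limit(b)

print(point_probability(1))    # jump at a support point
print(point_probability(1.5))  # no jump between support points
```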
Kolmogorov–Smirnov and Kuiper's tests
The Kolmogorov–Smirnov test is based on cumulative distribution functions and can be used to test whether two empirical distributions are different or whether an empirical distribution is different from an ideal distribution. The closely related Kuiper's test (/ˈkaɪpərz/) is useful if the domain of the distribution is cyclic, as in day of the week. For instance, we might use Kuiper's test to see if the number of tornadoes varies during the year or if sales of a product vary by day of the week or day of the month.
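To make the link with CDFs concrete, here is a minimal sketch of the two-sample Kolmogorov–Smirnov statistic, the largest vertical gap between two empirical CDFs. It computes only the statistic, not a p-value, and the function names are illustrative assumptions.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return the empirical CDF of a sample as a function of x."""
    data = sorted(sample)
    n = len(data)

    def F(x):
        # Fraction of observations that are <= x.
        return bisect_right(data, x) / n

    return F

def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs.

    Both ECDFs are right-continuous step functions that only change at data
    points, so it is enough to compare them at the pooled observations.
    """
    F_a, F_b = ecdf(sample_a), ecdf(sample_b)
    return max(abs(F_a(x) - F_b(x)) for x in set(sample_a) | set(sample_b))

print(ks_statistic([1, 2, 3], [1, 2, 3]))  # identical samples
print(ks_statistic([1, 2, 3], [4, 5, 6]))  # completely separated samples
```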
Complementary cumulative distribution function
Sometimes, it is useful to study the opposite question and ask how often the random variable is above a particular level. This is called the complementary cumulative distribution function (ccdf) or exceedance, and is defined as
F_c(x) = \operatorname{P}(X > x) = 1 - F(x).
This has applications in statistical hypothesis testing, for example, because the one-sided P-value is the probability of observing a test statistic at least as extreme as the one observed; hence, the one-sided P-value is simply given by the ccdf.
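A minimal sketch of that use, assuming the test statistic is standard normal under the null hypothesis (an assumption made here for illustration; the source names no particular distribution). Because the normal distribution is continuous, "greater than" and "at least" give the same probability.

```python
import math

def normal_ccdf(z):
    """P(Z > z) = 1 - F(z) for a standard normal Z.

    math.erfc is used instead of 1 - cdf to keep precision in the far tail.
    """
    return 0.5 * math.erfc(z / math.sqrt(2))

observed = 1.96  # hypothetical observed test statistic
p_value = normal_ccdf(observed)
print(p_value)   # roughly 0.025
```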
From Yahoo Answers
Answers: The power of Excel:

Value   Count   CumSum
1       3       3
2       4       7
3       5       12
4       8       20
5       10      30
6       6       36
7       4       40
8       3       43
9       5       48
10      1       49
11      0       49
12      0       49
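The same running total can be reproduced outside a spreadsheet. This is a small sketch using Python's `itertools.accumulate`, with the counts taken from the answer above; the variable names are illustrative.

```python
from itertools import accumulate

# Counts for the values 1..12, as given in the answer.
counts = [3, 4, 5, 8, 10, 6, 4, 3, 5, 1, 0, 0]

# Each entry is the sum of all counts up to and including that value.
cumulative = list(accumulate(counts))

for value, (count, cum) in enumerate(zip(counts, cumulative), start=1):
    print(value, count, cum)
```

The last cumulative entry equals the total number of observations, which is a quick sanity check on the column.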
Answers: I will give you an example, using the prime numbers up to 50. They are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47. We begin by dividing the range into appropriate intervals, usually of the same length. We will use five equal intervals; the choice of how many intervals to use is purely stylistic. We make a rectangular chart. Down the left side, we list our intervals. Across the top, we list headings for the columns, which are, in order, frequency (how many in the interval), relative frequency, and cumulative frequency:

Interval   Freq.   Rel. Freq.        Cumul. Freq.
1-10       4       4/15 = 26.7%      4
11-20      4       4/15 = 26.7%      8
21-30      2       2/15 = 13.3%      10
31-40      2       2/15 = 13.3%      12
41-50      3       3/15 = 20.0%      15
Total      15      15/15 = 100.0%    15

To explain, we begin with frequency. There are 4 primes between 11 and 20, for example, so the frequency is 4. There are 15 primes in our whole sample, so the relative frequency for the interval 11-20 is 4/15, or frequency divided by total. The cumulative frequency is the SUM of the frequencies up to here, or 4 + 4 = 8 (the first 4 is for the interval 1-10, and the second 4 is for the interval 11-20). Cumulative frequency, thus, is like keeping a running total. To check it: for the last interval in your table, it will always be the total number of data points in your sample.

There are two ways to do this in Excel. One is to compute it by hand as I have shown above, and then just enter these data in a spreadsheet. Then, use the usual commands for making pie charts and histograms. The second is to enter the data for frequency, and label your intervals. Then, you can insert the next two columns with the insert column command. By using arrays, you can call a(n) the nth entry in the first column and, in this example, set b(n) = a(n)/15 for all n (I cannot remember the exact Excel command here, but it is a lot like this). For the third column, you would put the first entry in (call it c(1)), and then let c(n+1) = c(n) + a(n+1) for the rest of your table. That will give cumulative frequency. The commands are slightly dependent on your system and on how you begin, but this is the key idea.
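For readers who would rather script it than use a spreadsheet, the table can be rebuilt as follows. This is a sketch in Python rather than the Excel commands the answer alludes to, and the variable names are illustrative.

```python
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
bins = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]

total = len(primes)
running = 0   # cumulative frequency so far, c(n) in the answer's notation
rows = []
for low, high in bins:
    freq = sum(low <= p <= high for p in primes)   # a(n): count in this interval
    running += freq                                # c(n) = c(n-1) + a(n)
    rows.append((f"{low}-{high}", freq, freq / total, running))

for label, freq, rel, cum in rows:
    print(f"{label:<8}{freq:<6}{rel:<8.1%}{cum}")
```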
Answers: Your question appears to contain errors. First, f cannot be a cumulative frequency distribution since it decreases for certain values of x. A cumulative distribution should never decrease as x increases. Therefore, I believe what you have is a frequency distribution. Second, you need to identify how the x bins are set up. That is, given

x:   x1   x2   x3
f:   f1   f2   f3

are f1, f2, and f3 the numbers of observations less than x1, between x1 and x2, and between x2 and x3 respectively? Or are they between x1 and x2, between x2 and x3, and beyond x3 respectively? I'm going to assume the former scenario. Let N be the total number of observations. Let F(x) be the *cumulative* frequency, the number of observations less than x. Thus we have:

x:   20    30     40     50     65     100
F:   660   2800   4610   6760   9250   12020

The median is the x value for which F(x) = (N+1)/2 (+1 because N is even in this case). From the values of F, and since (N+1)/2 = 6010.5, the median must be between 40 and 50. Typically, to get the value, one linearly interpolates using the bracketing F values. Thus,

(N+1)/2 = F(x1) + (F(x2) - F(x1))/(x2 - x1) * (x - x1)

Solve for x:

x = x1 + ((N+1)/2 - F(x1)) * (x2 - x1)/(F(x2) - F(x1))
  = 40 + (12021/2 - 4610) * 10/(6760 - 4610)
  = 40 + (2801/4300) * 10
  = 46.51395

Obviously this is not what you are quoting, and perhaps I have misunderstood your setup somehow. The above should be correct given the discussion I provided, and it should give you enough to figure out your problem if I have in fact misunderstood your question. Good luck.
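The interpolation in this answer can be checked with a short script. This is a sketch under the answer's own assumptions (F counts observations below each x, and the target is (N+1)/2); the function name `interpolate_quantile` is hypothetical.

```python
def interpolate_quantile(edges, cum_freq, target):
    """Linearly interpolate the x at which the cumulative frequency reaches target."""
    for i in range(1, len(edges)):
        F1, F2 = cum_freq[i - 1], cum_freq[i]
        # Find the pair of edges whose cumulative frequencies bracket the target.
        if F1 <= target <= F2 and F2 > F1:
            x1, x2 = edges[i - 1], edges[i]
            return x1 + (target - F1) * (x2 - x1) / (F2 - F1)
    raise ValueError("target lies outside the tabulated cumulative frequencies")

x = [20, 30, 40, 50, 65, 100]
F = [660, 2800, 4610, 6760, 9250, 12020]
N = F[-1]                      # total number of observations

median = interpolate_quantile(x, F, (N + 1) / 2)
print(round(median, 5))        # about 46.51395, matching the hand calculation
```

Passing a different target, such as a quarter or three quarters of the total, would give interpolated quartile estimates in the same way.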
Answers: At the median the cumulative frequency should be 50% of the total (that is, the total frequency), at the lower quartile it should be 25%, and at the upper quartile it should be 75%. I think this is what the formula means by 'Required Cumulative Frequency'. If you choose the interval which should contain those cumulative frequencies (say, for the lower quartile, one that ranges from 23% to 31%), then that formula should give you an estimate of the value for that cumulative frequency.