how to calculate relative frequency

From Wikipedia

Frequency distribution

In statistics, a frequency distribution is a tabulation of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way the table summarizes the distribution of values in the sample.

Univariate frequency tables

Univariate frequency distributions are often presented as lists ordered by quantity showing the number of times each value appears. For example, if 100 people rate their agreement with a statement on a five-point Likert scale, on which 1 denotes strong agreement and 5 strong disagreement, the frequency distribution of their responses might look like:

A different tabulation scheme aggregates values into bins such that each bin encompasses a range of values. For example, the heights of the students in a class could be organized into the following frequency table.

A frequency distribution shows a summarized grouping of data divided into mutually exclusive classes, together with the number of occurrences in each class. It is a way of presenting otherwise unorganized data, for example election results, incomes of people in a certain region, sales of a product within a certain period, or student loan amounts of graduates. Graphs that can be used with frequency distributions include histograms, line graphs, bar charts and pie charts. Frequency distributions are used for both qualitative and quantitative data.
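As a minimal sketch of the binned tabulation described above, the following counts values into equal-width, mutually exclusive classes and reports each class's relative frequency. The heights, the bin width, and the helper name `binned_frequencies` are illustrative assumptions, not data from the text.

```python
from collections import Counter

def binned_frequencies(values, start, width, n_bins):
    """Count how many values fall in each half-open bin [lo, lo + width)."""
    counts = Counter()
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the covered range are ignored
            counts[index] += 1
    return [((start + i * width, start + (i + 1) * width), counts[i])
            for i in range(n_bins)]

# Hypothetical heights in centimetres (illustrative only).
heights_cm = [152, 158, 161, 163, 165, 167, 168, 170, 171, 174, 176, 181]
table = binned_frequencies(heights_cm, start=150, width=10, n_bins=4)
total = sum(count for _, count in table)

for (lo, hi), count in table:
    print(f"{lo}-{hi}: frequency={count}, relative frequency={count / total:.3f}")
```

Half-open bins keep the classes mutually exclusive, so a value such as 170 is counted in exactly one class.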

Joint frequency distributions

Bivariate joint frequency distributions are often presented as (two-way) contingency tables:

The total row and total column report the marginal frequencies or marginal distribution, while the body of the table reports the joint frequencies.
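A small sketch of how joint and marginal frequencies relate, assuming hypothetical paired observations (the categories below are made up for illustration):

```python
from collections import Counter

# Hypothetical paired observations (illustrative only): (handedness, answer)
observations = [
    ("right", "yes"), ("right", "no"), ("left", "yes"),
    ("right", "yes"), ("left", "no"), ("right", "yes"),
]

joint = Counter(observations)                        # body of the table
row_totals = Counter(r for r, _ in observations)     # marginal frequencies of the rows
col_totals = Counter(c for _, c in observations)     # marginal frequencies of the columns
n = len(observations)

for row in sorted(row_totals):
    cells = "  ".join(f"{col}={joint[(row, col)]}" for col in sorted(col_totals))
    print(f"{row}: {cells}  total={row_totals[row]}")
print("column totals:", dict(col_totals), "grand total:", n)
```

Each marginal frequency is the sum of the joint frequencies along its row or column, and both sets of marginals add up to the grand total.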


Managing and operating on frequency-tabulated data is much simpler than operating on raw data. There are simple algorithms to calculate the median, mean, standard deviation, etc. from these tables.
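One way such calculations might look, as a sketch that works directly on a value-to-count table without expanding it back into raw data. The counts below are hypothetical, and the variance is the population variance (dividing by n), which is an assumption about the convention wanted.

```python
import math

def stats_from_frequency_table(table):
    """table maps each value to its frequency (count)."""
    n = sum(table.values())
    mean = sum(v * f for v, f in table.items()) / n
    variance = sum(f * (v - mean) ** 2 for v, f in table.items()) / n  # population variance

    def kth_smallest(k):
        # Walk the sorted values until the running count reaches position k.
        running = 0
        for v in sorted(table):
            running += table[v]
            if running >= k:
                return v

    if n % 2 == 1:
        median = kth_smallest((n + 1) // 2)
    else:
        median = (kth_smallest(n // 2) + kth_smallest(n // 2 + 1)) / 2
    return mean, median, math.sqrt(variance)

# Hypothetical counts for a five-point scale (illustrative only).
responses = {1: 20, 2: 30, 3: 25, 4: 15, 5: 10}
mean, median, std = stats_from_frequency_table(responses)
print(f"mean={mean:.2f}, median={median}, std={std:.3f}")
```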

Statistical hypothesis testing is founded on the assessment of differences and similarities between frequency distributions. This assessment involves measures of central tendency or averages, such as the mean and median, and measures of variability or statistical dispersion, such as the standard deviation or variance.

A frequency distribution is said to be skewed when its mean and median are different. The kurtosis of a frequency distribution is the concentration of scores at the mean, or how peaked the distribution appears if depicted graphically—for example, in a histogram. If the distribution is more peaked than the normal distribution it is said to be leptokurtic; if less peaked it is said to be platykurtic.
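As a rough illustration, skewness and kurtosis are often computed from central moments. The sketch below uses the moment-based definitions (third moment over the 1.5 power of the second, and fourth moment over the square of the second, minus 3 so that the normal distribution scores zero). These particular formulas and the sample data are assumptions, not something the text specifies.

```python
from statistics import mean, median

def central_moment(data, k):
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    # Positive: more peaked than normal (leptokurtic); negative: flatter (platykurtic).
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]            # hypothetical data
right_skewed = [1, 1, 2, 2, 3, 10]     # hypothetical data

for name, data in [("symmetric", symmetric), ("right_skewed", right_skewed)]:
    print(name, "mean:", mean(data), "median:", median(data),
          "skewness:", round(skewness(data), 3),
          "excess kurtosis:", round(excess_kurtosis(data), 3))
```

In the second sample the mean exceeds the median and the skewness comes out positive, matching the description of a skewed distribution.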

Letter frequency distributions are also used in frequency analysis to crack codes and refer to the relative frequency of letters in different languages.

Letter frequency

The frequency of letters in text has often been studied for use in cryptography, and frequency analysis in particular. No exact letter frequency distribution underlies a given language, since all writers write slightly differently. Linotype machines sorted the letters' frequencies as etaoin shrdlu cmfwyp vbgkqj xz based on the experience and custom of manual compositors. Likewise, modern International Morse code encodes the most frequent letters with the shortest symbols; arranging the Morse alphabet into groups of letters that require equal amounts of time to transmit, and then sorting these groups in increasing order, yields e it san hurdm wgvlfbk opjxcz yq. Similar ideas are used in modern data-compression techniques such as Huffman coding.
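To make the link between frequency and code length concrete, here is a minimal sketch of Huffman's construction that returns only the code length assigned to each symbol. The symbol counts are hypothetical, and the function name is illustrative.

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a dict of symbol frequencies."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, d2 = heapq.heappop(heap)
        # Merging pushes every symbol in both subtrees one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Hypothetical letter counts (illustrative only).
lengths = huffman_code_lengths({"e": 12, "t": 9, "a": 8, "q": 1, "z": 1})
print(lengths)
```

With these counts the most frequent letter receives the shortest code and the rarest letters the longest, the same principle the Morse ordering reflects.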

More recent analyses show that letter frequencies, like word frequencies, tend to vary, both by writer and by subject. One cannot write an essay about x-rays without using frequent Xs, and the essay will have an especially strange letter frequency if the essay is about the frequent use of x-rays to treat zebras in Qatar. Different authors have habits which can be reflected in their use of letters. Hemingway's writing style, for example, is visibly different from Faulkner's. Letter, bigram, trigram, word frequencies, word length, and sentence length can be calculated for specific authors, and used to prove or disprove authorship of texts, even for authors whose styles aren't so divergent.

Accurate average letter frequencies can only be gleaned by analyzing a large amount of representative text. With the availability of modern computing and collections of large text corpora, such calculations are easily made. One analysis (the Deafandblind link) details examples from a variety of sources (press reporting, religious text, scientific text and general fiction), and there are differences, especially for general fiction, in the position of 'h' and 'i'. That example differs from the linotype 'etaoin shrdlu', coming out as 'etaoHn Isrdlu'. There is an unproven claim that conversation is similar in frequency to general fiction.
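A sketch of such a calculation, assuming plain ASCII letters and a short sample sentence standing in for a real corpus (a single sentence is far too small to give representative frequencies):

```python
from collections import Counter

def letter_relative_frequencies(text):
    """Relative frequency of each letter a-z, most frequent first."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    counts = Counter(letters)
    total = len(letters)
    return {letter: n / total for letter, n in counts.most_common()}

sample = "The quick brown fox jumps over the lazy dog"  # stand-in for a large corpus
freqs = letter_relative_frequencies(sample)
for letter, share in list(freqs.items())[:5]:
    print(f"{letter}: {share:.3f}")
```

Each relative frequency is the letter's count divided by the total number of letters, so the values sum to 1.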

Herbert S. Zim, in his classic introductory cryptography text "Codes and Secret Writing", gives the English letter frequency sequence as "ETAON RISHD LFCMU GYPWB VKXJQ Z", the most common letter pairs as "TH HE AN RE ER IN ON AT ND ST ES EN OF TE ED OR TI HI AS TO", and the most common doubled letters as "LL EE SS OO TT FF RR NN PP CC".
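Letter-pair counts of this kind can be gathered the same way as single-letter counts. The sketch below counts adjacent pairs within words and filters out the doubled letters; the sample sentence is made up, and counting pairs only inside words (not across spaces) is an assumption.

```python
import re
from collections import Counter

def bigram_counts(text):
    """Count adjacent letter pairs within each word."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        counts.update(word[i:i + 2] for i in range(len(word) - 1))
    return counts

sample = "the letter fell off the shelf"  # hypothetical text
bigrams = bigram_counts(sample)
doubled = {pair: n for pair, n in bigrams.items() if pair[0] == pair[1]}

print(bigrams.most_common(3))
print(doubled)
```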

The 'top twelve' letters comprise about 80% of the total usage. The 'top eight' letters comprise about 65% of the total usage. A spy using the VIC cipher or some other cipher based on a straddling checkerboard typically uses a mnemonic such as "a sin to err" (dropping the second "r") to remember the top 8 characters.

The use of letter frequencies and frequency analysis plays a fundamental role in several games, including hangman, Scrabble, Wheel of Fortune, Definition, Bananagrams, and cryptograms.

Letter frequencies had a strong effect on the design of some keyboard layouts. The most-frequent letters are on the bottom row of the Blickensderfer typewriter. The most-frequent letters are on the home row of the Dvorak Simplified Keyboard.

Relative frequencies of letters in the English language

The letter frequencies for English are listed below. However, this table differs slightly from others, such as that of the Cornell University Math Explorer's Project, which produced its own table after measuring over 40,000 words.

In English, the space is slightly more frequent than the top letter (7% more frequent than, or 107% as frequent as, e), and the non-alphabetic characters (digits, punctuation, etc.) occupy the fourth position, between t and a.

Relative frequencies of the first letters of a word in the English language

First Letter of a word frequencies:

Relative frequencies of letters in other languages

*See Turkish dotted and dotless I

The figure below illustrates the frequency distributions of the 26 most common Latin letters across some languages.

Based on these tables, the 'etaoin shrdlu'-equivalent results for each language are as follows:

  • French: 'esait nrulo' (Indo-European: Romance; traditionally, 'esartinulop' is used, in part for its ease of pronunciation)
  • Spanish: 'eaosr nidlc' (Indo-European: Romance)
  • Portuguese: 'aeosr indmt' (Indo-European: Romance)
  • Italian: 'eaion lrtsc' (Indo-European: Romance)
  • Esperanto: 'aieon lsrtk' (artificial language, influenced mostly by Romance and Germanic Indo-European languages)
  • German: 'enisr atdhu' (Indo-European: Germanic)
  • Swedish: 'eantr slido' (Indo-European: Germanic)
  • Turkish: 'aeinr ldkmu' (Turkic: a non-Indo-European language)
  • Dutch: 'enati rodsl' (Indo-European: Germanic)
  • Polish: 'aoiez nscwr' (Indo-European: Slavic)

All these languages use a basically similar 25+ character alphabet.

From Yahoo Answers

Question:Also, is the cumulative frequency just all the frequencies added up? Or is there some equation to it?

Answers:Frequency is the number of observations of a given type. Relative frequency is the number of observations of a given type divided by the total number of observations. For example, if you are looking at the color of cars passing through an intersection in a time unit and observe {green, blue, red, red, white, black, green, red, black, black, black, red, blue}, there are 13 observations.

  • The frequency of green cars is 2; the relative frequency is 2/13.
  • The frequency of blue cars is 2; the relative frequency is 2/13.
  • The frequency of red cars is 4; the relative frequency is 4/13.
  • The frequency of white cars is 1; the relative frequency is 1/13.
  • The frequency of black cars is 4; the relative frequency is 4/13.
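The car-color example can be reproduced in a few lines. This is just one way to do it, using exact fractions so the results read as 2/13, 4/13, and so on:

```python
from collections import Counter
from fractions import Fraction

cars = ["green", "blue", "red", "red", "white", "black", "green",
        "red", "black", "black", "black", "red", "blue"]

frequency = Counter(cars)   # count of each color
total = len(cars)           # total number of observations
relative = {color: Fraction(n, total) for color, n in frequency.items()}

for color, n in frequency.items():
    print(f"{color}: frequency={n}, relative frequency={relative[color]}")
```

The relative frequencies always sum to 1, which is a quick check on the counting.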

Question:Greetings all, I need some help please. In Excel I need to calculate the Frequency, cumulative Frequency, relative Frequency and cumulative relative Frequency. If I understand the function of f in excel, the array would be, for example, a column I want to calculate #1-279. Out of those 279 items I want the f on 38 of them. Thus the bin would be 38 of the item out of the 279. However, from there, how do I calculate the cf, rf and rcf? Thanks, Justin

Answers:You are asking about histogramming concepts here. A range of an independent variable, say from 0 to 1, is broken up into "bins", say 100, so the width of each bin is 0.01. Then one loops through the data: if a value is in bin #1, i.e. its value is >=0 and <0.01, the number of counts in that bin is incremented by one. After all of the data values are placed in the histogram, the rest is easy. The frequency of a given value is just the number of counts in the corresponding bin. The cumulative frequency up to bin N is the sum of the frequencies in all bins up to and including bin N. Relative frequency just means normalizing the number of counts in each bin by the total number of counts in the histogram. Cumulative relative frequency up to bin N is the sum of the relative frequencies in all bins up to and including bin N.
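The four quantities can be sketched as follows. The data values and the choice of 4 bins (rather than 100) are assumptions made to keep the output short; the same running-total logic carries over to spreadsheet columns.

```python
from itertools import accumulate

def histogram(data, lo, hi, n_bins):
    """Count values into n_bins equal-width bins covering [lo, hi)."""
    counts = [0] * n_bins
    for v in data:
        if lo <= v < hi:
            counts[int((v - lo) / (hi - lo) * n_bins)] += 1
    return counts

# Hypothetical measurements in the range [0, 1) (illustrative only).
data = [0.05, 0.10, 0.30, 0.35, 0.40, 0.60, 0.80, 0.95]

freq = histogram(data, 0.0, 1.0, n_bins=4)   # frequency per bin
total = sum(freq)
cum_freq = list(accumulate(freq))            # running total of frequencies
rel_freq = [f / total for f in freq]         # each count normalized by the total
cum_rel_freq = [c / total for c in cum_freq] # running total of relative frequencies

print(freq, cum_freq, rel_freq, cum_rel_freq, sep="\n")
```

The last cumulative frequency equals the number of data points, and the last cumulative relative frequency equals 1.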

Question:& cumulative relative frequency distribution showing "greather than or within" relative frequencies, in excel. oops I made a typo, I meant to say "greater than."

Answers:I will give you an example, using the prime numbers up to 50. They are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47. We begin by dividing into appropriate intervals, usually of the same length. We will divide into five equal intervals. The choice of how many intervals to use is purely stylistic. We make a rectangular chart. Down the left side, we list our intervals. Across the top, we list headings for columns that are, in order, frequency (how many in the interval), relative frequency, and cumulative frequency:

Interval   Freq.   Rel. Freq.       Cumul. Freq.
1-10       4       4/15 = 26.7%     4
11-20      4       4/15 = 26.7%     8
21-30      2       2/15 = 13.3%     10
31-40      2       2/15 = 13.3%     12
41-50      3       3/15 = 20.0%     15
Total      15      1 = 100.0%       15

To explain, we begin with frequency. There are 4 primes between 11 and 20, for example, so the frequency is 4. There are 15 primes in our whole sample, so the relative frequency for the interval 11-20 is 4/15, or frequency divided by total. The cumulative frequency is the SUM of the frequencies up to here, or 4 + 4 = 8 (the first 4 is for the interval 1-10, and the second 4 is for the interval 11-20). Cumulative frequency, thus, is like keeping a running total. To check it, for the last interval in your table, it will always be the total number of data points in your sample.

There are two ways to do this in Excel. One is to compute it by hand as I have shown above, and then just enter these data in a spreadsheet. Then, use the usual commands for making pie charts and histograms. The second is to enter the data for frequency, and label your intervals. Then, you can insert the next two columns, with the insert column command. By using arrays, you can call a(n) the nth entry in the first column, and in this example, set b(n)=a(n)/15 for all n (I cannot remember the exact Excel command here, but it is a lot like this). For the third column, you would put the first entry in (call it c(1)), and then let c(n+1)=c(n)+a(n+1) for the rest of your table. That will give cumulative frequency. The commands are slightly dependent on your system and on how you begin, but this is the key idea.
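The columns for the primes example can be computed as a sketch like the one below; 4/15 works out to about 26.7%, 2/15 to about 13.3%, and 3/15 to 20.0%. The final list is one possible reading of the "greater than or within" cumulative frequency asked about in the question (counting from each interval through the last one), which is an assumption about what was meant.

```python
from itertools import accumulate

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
intervals = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]

freq = [sum(lo <= p <= hi for p in primes) for lo, hi in intervals]
total = sum(freq)
rel = [f / total for f in freq]      # b(n) = a(n) / 15
cum = list(accumulate(freq))         # c(n+1) = c(n) + a(n+1)
# Assumed meaning of "greater than or within": this interval and all later ones.
cum_from_here = list(accumulate(reversed(freq)))[::-1]

print("Interval  Freq  Rel.Freq  Cumul  >=Interval")
for (lo, hi), f, r, c, g in zip(intervals, freq, rel, cum, cum_from_here):
    print(f"{lo}-{hi:<6} {f:<5} {r:>7.1%}  {c:<6} {g}")
```

Dividing `cum` or `cum_from_here` by `total` gives the corresponding cumulative relative frequencies.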

Question:Hello, could someone help me with these calculations. Given that I know the air volume in a room, the air temperature and atmospheric pressure how can I calculate the absolute amount of water contained in the air. My ultimate goal is to compute how the relative humidity changes for every liter of water added/removed from the air.

Answers:Humidity can be measured indirectly with dry and wet-bulb thermometers. The dry-bulb is an ordinary mercury-in-glass thermometer and measures the air temperature (for more see reference below)

From Youtube

Math Lessons: How to Calculate Relative Error: To calculate relative error, subtract the real value from the experimental value, divide that difference by the real value, and multiply by 100 to express the relative error as a percentage. Calculate relative error, often used for science experiments, with advice from a standardized test prep instructor in this free video on mathematics. Expert: Brian Leaf. Bio: Brian Leaf, MA, is the author of McGraw-Hill's Top 50 Skills for SAT/ACT Success series and has instructed SAT, ACT, GED and SSAT preparation to thousands of students. Filmmaker: David Pakman
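For reference, relative error is commonly defined as the absolute difference between the measured and true values, divided by the magnitude of the true value. A minimal sketch, with made-up numbers:

```python
def relative_error(measured, true_value):
    """Relative error as a fraction: |measured - true| / |true|."""
    if true_value == 0:
        raise ValueError("relative error is undefined when the true value is zero")
    return abs(measured - true_value) / abs(true_value)

# Hypothetical measurement: 9.5 against a true value of 10.0.
error = relative_error(9.5, 10.0)
print(f"{error:.1%}")
```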