Explore Related Concepts

Example of grouped and ungrouped data

Best Results From Wikipedia, Yahoo Answers, and YouTube


From Wikipedia

Data type

In computer programming, a data type (or datatype) is a classification identifying one of various types of data, such as floating-point, integer, or Boolean, that determines the possible values for that type; the operations that can be done on that type; and the way the values of that type are stored.

Overview

Almost all programming languages explicitly include the notion of data type, though different languages may use different terminology. Common data types include integers, Booleans, characters, floating-point numbers, and alphanumeric strings.

For example, in the Java programming language, the "int" type represents the set of 32-bit integers ranging in value from -2,147,483,648 to 2,147,483,647, as well as the operations that can be performed on integers, such as addition, subtraction, and multiplication. Colors, on the other hand, are represented by three bytes denoting the amounts of red, green, and blue, and one string representing that color's name; allowable operations include addition and subtraction, but not multiplication.

Most programming languages also allow the programmer to define additional data types, usually by combining multiple elements of other types and defining the valid operations of the new data type. For example, a programmer might create a new data type named "complex number" that would include real and imaginary parts. A data type also represents a constraint placed upon the interpretation of data in a type system, describing representation, interpretation and structure of values or objects stored in computer memory. The type system uses data type information to check correctness of computer programs that access or manipulate the data.
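As a rough illustration of the idea (a hypothetical sketch in Python, not the definition used by any particular language), such a "complex number" type might combine two real-valued parts and define which operations are valid for it:

    from dataclasses import dataclass

    @dataclass
    class Complex:
        """A user-defined data type combining two floats: a real and an imaginary part."""
        re: float
        im: float

        def __add__(self, other: "Complex") -> "Complex":
            # Addition is defined component-wise.
            return Complex(self.re + other.re, self.im + other.im)

        def __mul__(self, other: "Complex") -> "Complex":
            # Multiplication follows (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
            return Complex(self.re * other.re - self.im * other.im,
                           self.re * other.im + self.im * other.re)

    # Example: (1 + 2i) * (3 + 4i) = -5 + 10i
    print(Complex(1, 2) * Complex(3, 4))   # Complex(re=-5, im=10)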

Classes of data types

Algebraic data types

Function types

Machine data types

All data in computers based on digital electronics is represented as bits (alternatives 0 and 1) at the lowest level. The smallest addressable unit of data is usually a group of bits called a byte (usually an octet, which is 8 bits). The unit processed by machine code instructions is called a word (as of 2008, typically 32 or 64 bits). Most instructions interpret the word as a binary number, such that a 32-bit word can represent unsigned integer values from 0 to 2^32 - 1 or signed integer values from -2^31 to 2^31 - 1. Because of two's complement, the machine language and the machine itself do not need to distinguish between these unsigned and signed data types for the most part.
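A small illustrative sketch (in Python, purely for illustration) of how the same 32-bit pattern can be read either as an unsigned integer or as a two's-complement signed integer:

    def as_unsigned(bits: int) -> int:
        # Keep only the low 32 bits: values range from 0 to 2**32 - 1.
        return bits & 0xFFFFFFFF

    def as_signed(bits: int) -> int:
        # Two's-complement reinterpretation: values range from -2**31 to 2**31 - 1.
        u = bits & 0xFFFFFFFF
        return u - 2**32 if u >= 2**31 else u

    pattern = 0xFFFFFFFF                 # all 32 bits set
    print(as_unsigned(pattern))          # 4294967295  (2**32 - 1)
    print(as_signed(pattern))            # -1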

There is a specific set of arithmetic instructions that use a different interpretation of the bits in a word, treating them as a floating-point number.

Object types

Pointer and reference data types

Primitive data types


Descriptive statistics

Descriptive statistics describe the main features of a collection of data quantitatively. Descriptive statistics are distinguished from inferential statistics (or inductive statistics) in that descriptive statistics aim to summarize a data set quantitatively without employing a probabilistic formulation, rather than using the data to make inferences about the population that the data are thought to represent. Even when a data analysis draws its main conclusions using inferential statistics, descriptive statistics are generally also presented. For example, in a paper reporting on a study involving human subjects, there typically appears a table giving the overall sample size, sample sizes in important subgroups (e.g., for each treatment or exposure group), and demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, and the proportion of subjects with related comorbidities.

Inferential statistics

Inferential statistics tries to make inferences about a population from the sample data. We also use inferential statistics to make judgments of the probability that an observed difference between groups is a dependable one, or that it might have happened by chance in this study. Thus, we use inferential statistics to make inferences from our data to more general conditions; we use descriptive statistics simply to describe what's going on in our data.

Use in statistical analyses

Descriptive statistics provide simple summaries about the sample and the measures. Together with simple graphics analysis, they form the basis of quantitative analysis of data.

Descriptive statistics summarize data. For example, the shooting percentage in basketball is a descriptive statistic that summarizes the performance of a player or a team. This number is the number of shots made divided by the number of shots taken. A player who shoots 33% is making approximately one shot in every three. One making 25% is hitting once in four. The percentage summarizes or describes multiple discrete events. Or, consider the scourge of many students, the grade point average. This single number describes the general performance of a student across the range of their course experiences.
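As a minimal sketch with made-up figures, both summaries reduce to simple ratios:

    # Shooting percentage: shots made divided by shots taken.
    shots_made, shots_taken = 5, 15
    print(round(100 * shots_made / shots_taken))   # 33 -> roughly one shot in three

    # Grade point average: grade points divided by the number of (equally weighted) courses.
    grades = [4.0, 3.0, 3.7, 2.3]
    print(sum(grades) / len(grades))               # 3.25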

Describing a large set of observations with a single indicator risks distorting the original data or losing important detail. For example, the shooting percentage doesn't tell you whether the shots are three-pointers or lay-ups, and GPA doesn't tell you whether the student was in difficult or easy courses. Despite these limitations, descriptive statistics provide a powerful summary that may enable comparisons across people or other units.

Univariate analysis

Univariate analysis involves examining a single variable across cases, focusing on three characteristics: the distribution, the central tendency, and the dispersion. It is common to compute all three for each study variable.

Distribution

The distribution is a summary of the frequency of individual values, or ranges of values, for a variable. The simplest distribution would list every value of a variable and the number of cases that had that value. For instance, computing the distribution of gender in the study population means computing the percentages that are male and female. The gender variable has only two values, making it possible and meaningful to list each one. However, this does not work for a variable such as income that has many possible values; specific values are typically not particularly meaningful (an income of 50,000 is usually not meaningfully different from 51,000). Grouping the raw scores into ranges of values reduces the number of categories to something more meaningful. For instance, we might group incomes into ranges of 0-10,000, 10,001-30,000, etc.

Frequency distributions are depicted as a table or as a graph. A table might, for example, define five age ranges and list the number of cases in each. The same frequency distribution can be depicted graphically; this type of graph is often referred to as a histogram or bar chart.
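As an illustration of both steps, the short Python sketch below (with made-up income values) assigns raw, ungrouped values to the ranges suggested above and then tabulates the grouped data as a frequency distribution:

    from collections import Counter

    incomes = [4800, 12500, 27000, 9100, 45000, 30500, 18200, 52000]  # ungrouped raw data

    def income_range(x: int) -> str:
        # Group raw scores into the ranges used in the text.
        if x <= 10_000:
            return "0-10,000"
        elif x <= 30_000:
            return "10,001-30,000"
        else:
            return "over 30,000"

    # Frequency distribution of the grouped data: each range and the number of cases in it.
    frequency = Counter(income_range(x) for x in incomes)
    for rng, count in frequency.items():
        print(rng, count)
    # 0-10,000 2
    # 10,001-30,000 3
    # over 30,000 3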

Central tendency

The central tendency of a distribution locates the "center" of a distribution of values. The three major types of estimates of central tendency are the mean, the median, and the mode.

The mean is the most commonly used measure of central tendency. To compute the mean, take the sum of the values and divide by the count. For example, the mean quiz score is determined by summing all the scores and dividing by the number of students who took the quiz. Consider, for example, the test score values:

15, 20, 21, 36, 15, 25, 15

The sum of these 7 values is 147, so the mean is 147/7 = 21.
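The same computation, as a minimal Python sketch:

    scores = [15, 20, 21, 36, 15, 25, 15]
    mean = sum(scores) / len(scores)   # 147 / 7
    print(mean)                        # 21.0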

The median is the score found at the middle of the set of values, i.e., the value with as many cases above it as below it. One way to compute the median is to sort the values in numerical order and then locate the value in the middle of the list. For example, if there are 500 values, the median is the average of the values in the 250th and 251st positions; if there are 501 values, the value in the 251st position is the median. Sorting the 7 scores above produces:

15, 15, 15, 20, 21, 25, 36

There are 7 scores, so score #4 represents the halfway point, and the median is 20. If there is an even number of observations, the median is the mean of the two middle scores. In the example, if there were an 8th observation with a value of 25, the median would become the average of the 4th and 5th sorted scores, in this case 20.5.
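A short Python sketch of the sort-and-take-the-middle procedure, covering both the odd and even cases:

    def median(values):
        ordered = sorted(values)
        n = len(ordered)
        mid = n // 2
        if n % 2 == 1:                                 # odd count: take the middle score
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the two middle scores

    print(median([15, 20, 21, 36, 15, 25, 15]))        # 20
    print(median([15, 20, 21, 36, 15, 25, 15, 25]))    # 20.5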

The mode is the most frequently occurring value in the set. To determine the mode, compute the distribution as above; the mode is the value with the greatest frequency. In the example, the modal value, 15, occurs three times. In some distributions there is a "tie" for the highest frequency, i.e., there are multiple modal values; these are called multi-modal distributions.
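A sketch of finding the mode from the frequency distribution; this version returns every modal value so that multi-modal ties are handled too:

    from collections import Counter

    def modes(values):
        counts = Counter(values)                  # frequency of each distinct value
        highest = max(counts.values())
        return [v for v, c in counts.items() if c == highest]

    print(modes([15, 20, 21, 36, 15, 25, 15]))    # [15] -- 15 occurs three times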

Notice that the three measures typically produce different results. The term "average" obscures the difference between them and is better avoided. The three values are equal if the distribution is perfectly "normal" (i.e., bell-shaped).

Dispersion

Dispersion is the spread of values around the central tendency. There are two common measures of dispersion, the range and the standard deviation. The range is simply the highest value minus the lowest value. In our example distribution, the high value is 36 and the low is 15, so the range is 36 − 15 = 21.

The standard deviation is a more detailed estimate of dispersion; roughly speaking, it measures how far, on average, the values lie from the mean.
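A minimal Python sketch of both dispersion measures for the example scores (this uses the population standard deviation, dividing by n; the sample version divides by n - 1 instead):

    from math import sqrt

    scores = [15, 20, 21, 36, 15, 25, 15]

    value_range = max(scores) - min(scores)       # 36 - 15 = 21
    mean = sum(scores) / len(scores)              # 21.0
    variance = sum((x - mean) ** 2 for x in scores) / len(scores)
    std_dev = sqrt(variance)

    print(value_range, round(std_dev, 2))         # 21 7.07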

Qualitative research

Qualitative research is a method of inquiry employed in many different academic disciplines, traditionally in the social sciences, but also in market research and other contexts. Qualitative researchers aim to gather an in-depth understanding of human behavior and the reasons that govern such behavior. The qualitative method investigates the why and how of decision making, not just what, where, and when. Hence, smaller but focused samples are more often needed, rather than large samples.

Qualitative methods produce information only on the particular cases studied, and any more general conclusions are only hypotheses (informative guesses). Quantitative methods can be used to verify which of such hypotheses are true.

History

Until the 1970s, the phrase "qualitative research" was used only to refer to a discipline of anthropology or sociology. During the 1970s and 1980s qualitative research began to be used in other disciplines, and became a significant type of research in the fields of education studies, social work studies, women's studies, disability studies, information studies, management studies, nursing service studies, political science, psychology, communication studies, and many other fields. Qualitative research also emerged in the consumer products industry during this period, with researchers investigating new consumer products and product positioning/advertising opportunities. The earliest consumer research pioneers included Gene Reilly of The Gene Reilly Group in Darien, CT, Jerry Schoenfeld of Gerald Schoenfeld & Partners in Tarrytown, NY, and Martin Calle of Calle & Company, Greenwich, CT, as well as Peter Cooper in London, England, and Hugh Mackay in Mission, Australia. There continued to be disagreement about the proper place of qualitative versus quantitative research. In the late 1980s and 1990s, after a spate of criticisms from the quantitative side, new methods of qualitative research evolved to address the perceived problems with reliability and imprecise modes of data analysis. During this same period, there was a slowdown in traditional media advertising spending, so there was heightened interest in making research related to advertising more effective.

In the last thirty years the acceptance of qualitative research by journal publishers and editors has been growing. Prior to that time, many mainstream journals were far more likely to publish research articles modeled on the natural sciences and featuring quantitative analysis than articles based on qualitative methods.

Distinctions from quantitative research

(In simplified terms, qualitative means non-numerical data collection or explanation based on the attributes of the graph or source of data. For example, if you are asked to explain a thermal image displayed in multiple colours in qualitative terms, you would explain the colour differences rather than the heat's numerical value.)

First, in qualitative research, cases can be selected purposefully, according to whether or not they typify certain characteristics or contextual locations.

Second, the researcher's role receives greater critical attention. This is because in qualitative research the possibility of the researcher taking a 'neutral' or transcendental position is seen as more problematic in practical and/or philosophical terms. Hence qualitative researchers are often exhorted to reflect on their role in the research process and make this clear in the analysis.

Third, while qualitative data analysis can take a wide variety of forms, it differs from quantitative research in its focus on language, signs and meaning. In addition, qualitative research approaches analysis holistically and contextually, rather than being reductionistic and isolationist. Nevertheless, systematic and transparent approaches to analysis are almost always regarded as essential for rigor. For example, many qualitative methods require researchers to carefully code data and discern and document themes consistently and reliably.

Perhaps the most traditional division between the uses of qualitative and quantitative research in the social sciences is that qualitative methods are used for exploration (i.e., hypothesis-generating) or for explaining puzzling quantitative results. Quantitative methods, by contrast, are used to test hypotheses. This is because establishing content validity — do measures measure what a researcher thinks they measure? — is seen as one of the strengths of qualitative research. Some consider quantitative methods to provide more representative, reliable and precise measures through focused hypotheses, measurement tools and applied mathematics. By contrast, qualitative data is usually difficult to graph or display in mathematical terms.

Qualitative research is often used for policy and program evaluation research since it can answer certain important questions more efficiently and effectively than quantitative approaches. This is particularly the case for understanding how and why certain outcomes were achieved (not just what was achieved) but also for answering important questions about relevance, unintended effects and impact of programs such as: Were expectations reasonable? Did processes operate as expected? Were key players able to carry out their duties? Did the program cause any unintended effects?

Qualitative approaches have the advantage of allowing for more diversity in responses as well as the capacity to adapt to new developments or issues during the research process itself. While qualitative research can be expensive and time-consuming to conduct, many fields of research employ qualitative techniques that have been specifically developed to provide more succinct, cost-efficient and timely results.


From Yahoo Answers

Question: I have a vast amount of data in an Excel spreadsheet, e.g. average weight from 0 to 10,000 or amount from 0 to 1 million. I need to group the entries by categories of, let's say: 0-200, 201-300, 301-400, etc. The idea is to create thresholds that could be grouped in a pivot table or an Excel chart. There could be a way with the IF function, but Excel only supports 6 consecutive IFs in the same function. Would there be any way to create a table with thresholds and then link it to a column of my data? Thank you in advance. Thanks for your answer. I was thinking of making a column with a label for each value according to which category it falls into. For example, any value between 0-200 would get a label of "0-200", so that the values can be grouped in a chart or pivot. Instead of having thousands of values, we end up with, let's say, 10 thresholds. Even in a database, we would have the same issue.

Answers: This might be something a pivot table would be good for, but I've never used them, so I'll suggest something else. Create a table somewhere, as you mentioned, with your thresholds, say in AA1:AB10: AA1:AA10 = the lower limit (i.e. 0, 201, 301, ...) and AB1:AB10 = the upper limit (i.e. 200, 300, 400, ...). In your new column next to your data, use a VLOOKUP formula to look at your data (say in B1) and return the threshold it belongs to: = VLOOKUP( B1, $AA$1:$AB$10, 1, TRUE). Copy/drag the formula down. The above will result in either "0" or "201" or "301" for each value in column B. If this is all you use, the second column of your lookup table isn't necessary. However, if you want the result to be "0 - 200" or "201 - 300", then use this formula: = VLOOKUP( B1,$AA$1:$AB$10, 1, TRUE)&" - " &VLOOKUP( B1,$AA$1:$AB$10, 2, TRUE). The "2" tells it to return a result from the second column (AB). The TRUE tells it to match with B1 or the next closest but lower value. (FYI: FALSE demands an exact match only.)
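Outside Excel, the same approximate-match lookup idea can be sketched in a few lines of Python; the cut-offs and labels below are taken from the question (plus an assumed open-ended top bin) and are otherwise arbitrary:

    import bisect

    # Lower limits of each threshold, sorted ascending (the lookup table's first column).
    lower_limits = [0, 201, 301, 401]
    labels = ["0 - 200", "201 - 300", "301 - 400", "401 and up"]

    def threshold_label(value):
        # Like VLOOKUP with TRUE: find the largest lower limit that is <= value.
        i = bisect.bisect_right(lower_limits, value) - 1
        return labels[i]

    for v in [37, 250, 399, 12000]:
        print(v, threshold_label(v))
    # 37 0 - 200
    # 250 201 - 300
    # 399 301 - 400
    # 12000 401 and up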

Question: OK, I have to explain to my class: 1. How to draw and use a cumulative frequency graph. 2. How to calculate the mean for grouped data. I haven't been in school the last few days, so I have no idea what to do. Please can you tell me how to do it, so I won't make a fool of myself in front of the class.

Answers: The mean is simply the average: Mean = sum of the data divided by the total number of observations. There is no way someone can explain the graph by typing; check out http://www.onlinemathlearning.com/cumulative-frequency-graph.html, which gives a visual example of creating the graph. It's very easy; you just need to see it visually. Without a specific example this is tough to explain by typing. You will have to do some reading to get through this. Good luck.
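For the grouped-data part of the question, the usual textbook estimate takes each class interval's midpoint as a stand-in for the values in it and forms a frequency-weighted average; the running totals of the frequencies are what a cumulative frequency graph plots. A minimal Python sketch with invented intervals and frequencies:

    # Hypothetical grouped data: hours of homework per night.
    intervals = [(0, 1), (1, 2), (2, 3), (3, 4)]   # class intervals
    frequencies = [4, 9, 6, 1]                     # number of students in each interval

    midpoints = [(lo + hi) / 2 for lo, hi in intervals]        # 0.5, 1.5, 2.5, 3.5

    # Estimated mean for grouped data: sum(midpoint * frequency) / sum(frequency).
    grouped_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / sum(frequencies)
    print(round(grouped_mean, 2))                              # 1.7

    # Cumulative frequencies (running totals) -- the y-values of a cumulative frequency graph.
    running, cumulative = 0, []
    for f in frequencies:
        running += f
        cumulative.append(running)
    print(cumulative)                                          # [4, 13, 19, 20]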

Question: Need examples of a comparative study with 3 or more groups. Question details: My brain is literally fried with all the research I have been doing. My problem is I am to do a comparative study with 3 or more groups and I need some examples. The one in the book is on how a salesclerk responds to a shopper by the way they are dressed. The three groups are: casual, dressy and sloppy. Independent variable: style of clothing. Dependent variable: clerks' response times. I am looking for any examples with 3 or more groups to compare, or 2 groups with 1 control group. Another is how children of different age groups respond to the intake of candy at a certain time of day. Group 1: ages 4-7, Group 2: ages 8-11, Group 3: ages 12-14. Dependent variable: the activity level of each group. Independent variable: the candy. This has already been done, but I wanted to give another example of what I am looking for. This is for a PSY class, if that adds any ideas. Thanks for any help!

Answers: There's one a friend of mine and I did in college for a psych class. The subject comes into the room, a simple (virtually bare) lab room with two chairs. The experimenter is seated at the far side of the room in one chair; the other chair is placed conveniently next to the door. The experimenter greets the subject and then says "Pull up a chair and we'll get started." As soon as the subject does so, the experimenter instructs the subject to hold still; he measures the position and angle of the subject's chair with respect to his own. The test result is the position of the chair (bearing, angle, and distance). We didn't have time to try this with experimenters of both sexes, but the results suggested that the angle of the chair was loosely correlated with the subject's sex; women occasionally took a side-to-side orientation, while the remaining subjects pulled up face-to-face, just beyond their concept of "personal space". The overall distance correlated with the subject's anxiety level (estimated by several interview questions immediately after the measurement). Another experiment used a "T" box to present a grid with pairs of letters for 250 ms, followed by a spot indicating one of the pairs. The classic experiment in short-term memory did the same thing with only single letters. We were trying to achieve a run in which the subjects would get one of the two letters correct significantly more often than the other; our hypothesis was that reading direction would show the left-hand letter identified correctly more often. We divided subjects by age and sex. As it turned out, the 250 ms time was far too short to let enough data enter short-term memory; our results were not significant. Also, we concluded that we should have included a division for the subject's native language; several who began reading other than left-to-right (Arabic and Mandarin) identified enough letters that it appeared they had no preference of side. Most subjects failed to properly identify more than one letter out of 20 attempts.

Question: For my data management assignment we are doing Chapter 2, which is one-variable statistics. We had to collect some data; the idea I chose is the number of hours spent working on homework/studying for school per night. Are there any noticeable trends? For my collected data, I must separate it into at least two distinct subgroups, which I did (males/females); each subgroup must contain 15 people. I am asking: do the females and males have to be separate calculations? For example: the three measures of central tendency (mean, median and mode); standard deviation and variance of the data (this will be useful in comparing different sets of data); a frequency table, with appropriate intervals, and a histogram for each set of the data; a box-and-whisker plot for each set of data (including calculations of the quartiles, IQR, semi-IQR); and any outliers in the data sets.

Answers: You should be using the exact same initial calculations on both groups. In a more advanced project, you would want to address any variance between the groups using some method (typically regression). However, from the information you provided, it looks like using the standard deviation to compare the individual results of each group against the collective whole is adequate. You would then also want to compare the other components, such as central tendency and mean, between the two groups. As far as visual display, a histogram is a good at-a-glance representation, but you may also want to look at using a scatter plot (preferably in different colors for the groups) to quickly show your edge cases. The quartiles, IQR, etc. are easily found after formulating the rest.
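As a minimal sketch of that advice, with invented hours for two subgroups of 15, the same summary statistics can be computed for each group separately and then compared:

    from statistics import mean, median, mode, stdev

    # Hypothetical hours of homework per night for each subgroup.
    groups = {
        "males":   [1.0, 2.0, 2.0, 1.5, 3.0, 0.5, 2.5, 1.0, 2.0, 1.5, 2.0, 3.5, 1.0, 2.0, 2.5],
        "females": [2.0, 2.5, 1.5, 3.0, 2.0, 2.0, 1.0, 3.5, 2.5, 2.0, 1.5, 2.5, 3.0, 2.0, 1.0],
    }

    for name, hours in groups.items():
        # Apply the same calculations to both groups so the results are comparable.
        print(name,
              "mean:", round(mean(hours), 2),
              "median:", median(hours),
              "mode:", mode(hours),
              "stdev:", round(stdev(hours), 2))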

From Youtube

Measurement Engineering Honda's Group: [Approach to Problems in Measurement Engineering From an Inverse Problem Viewpoint, Honda's Group] Measurement is the basis of science, so much so that it's been said, "There's no science without measurement." This is because measurement gives science objectivity. The Honda Group is studying measurement from various perspectives, such as: What should we measure? And how should we measure it? Q. This is a measuring instrument with a simple mechanism, where a spring extends when you exert a force on it. It's for measuring mass. The direct problem is to investigate how the spring's extension depends on the force. To do a measurement, we want to measure how big the force is when an unknown force is applied. So we understand the direct problem: that the spring stretches this much when the original force is applied. Then, conversely, if we measure how much the spring stretches when an unknown force is applied, we can determine how much force was applied. Determining how much force was applied is called the inverse problem. Determining an unknown force is an inverse problem. By studying such inverse problems, it's possible to obtain new ways of measuring data. And those results can lead to new developments. Every day, the Honda Group searches for ways to measure unknown quantities. Q. A typical example of an inverse problem arises in X-ray CT. Let me explain an example from our research where we use a similar approach to X-ray CT. This is Process Tomography, or fluid flow tomography ...

Calculating sample variance: The difference between calculating sample variance for grouped and ungrouped data is illustrated.
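As a rough sketch of the distinction the clip illustrates (the clip's own numbers are not given here), the quiz scores from the descriptive-statistics section above can be treated first as ungrouped raw data and then as grouped data summarized by class midpoints and frequencies:

    # Ungrouped (raw) data: sample variance with the n - 1 divisor.
    data = [15, 20, 21, 36, 15, 25, 15]
    n = len(data)
    mean = sum(data) / n
    sample_var_ungrouped = sum((x - mean) ** 2 for x in data) / (n - 1)
    print(round(sample_var_ungrouped, 2))            # 58.33

    # Grouped version of the same data: classes 10-19, 20-29, 30-39.
    midpoints   = [14.5, 24.5, 34.5]     # class midpoints stand in for the raw values
    frequencies = [3, 3, 1]              # how many of the 7 scores fall in each class
    N = sum(frequencies)
    grouped_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / N
    sample_var_grouped = sum(f * (m - grouped_mean) ** 2
                             for m, f in zip(midpoints, frequencies)) / (N - 1)
    print(round(sample_var_grouped, 2))              # 57.14 -- an estimate of the value above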