General Statistics
Chapter 2
Definition of Key Terms
Introduction to Descriptive Statistics

Introduction to Frequency Distributions

A quantitative population characteristic is one that can be expressed numerically.

A qualitative population characteristic is one that is non-numerical.

An attribute is a particular observation of a qualitative characteristic.

A variable (variate) is a particular observation of a quantitative characteristic.

Nominal data are data in which numbers are used to represent categories or attributes and have no quantitative significance or meaning.

Ordinal data are data in which the values given to observations are ranked by importance, strength, or severity.

An interval scale is one in which the values given to observations reflect true differences between the values, so that arithmetic operations such as addition and subtraction can be done with the values.

Ratio data are data that allow for all basic arithmetic operations, including division and multiplication.

An independent variable is one that is used to determine an effect (the source or causal quantity). Often these variables are controlled or manipulated by the researcher or experimenter, or are used to classify data.

A dependent variable is the variable that measures the effects of the independent variable. Often its value is dependent on the value of the independent variable.

A discrete number can exist only at specific points on a scale; for example, the day of the week can only be one of {1, 2, 3, 4, 5, 6, 7}.

A continuous number may take any value on a scale, including whole numbers and fractional parts; an example is the number of seconds elapsed since the start of a race, such as 12.5635... seconds.

A discrete approximation of a continuous value is a single number that approximates a continuous measurement, which is exact only in concept or principle.

Raw data are data that have not been summarized or organized in any meaningful way.

Class intervals are one way of categorizing raw data into numerical intervals of constant width.

Frequency is the count of observations in each class interval.

A sample frequency distribution is a representation of data showing their frequency in all class intervals.
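As an illustration of how raw data, class intervals, and frequencies fit together, here is a minimal Python sketch. The function name `frequency_distribution`, the exam scores, and the choice of half-open intervals [lower, upper) are assumptions made for this example only; they are not part of the definitions above.

```python
from collections import Counter

def frequency_distribution(data, start, width, n_classes):
    """Count observations in n_classes equal-width intervals [lower, upper)."""
    counts = Counter()
    for x in data:
        index = int((x - start) // width)   # which class interval x falls in
        if 0 <= index < n_classes:          # ignore values outside all classes
            counts[index] += 1
    # One (lower limit, upper limit, frequency) row per class, empty ones included
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(n_classes)]

# Hypothetical raw data: twelve exam scores
scores = [52, 67, 71, 58, 93, 85, 77, 64, 79, 88, 61, 73]
table = frequency_distribution(scores, start=50, width=10, n_classes=5)

for lower, upper, freq in table:
    print(f"{lower} to under {upper}: {freq}")
```

With these scores the frequencies are 2, 3, 4, 2, and 1, which add up to the 12 raw observations.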

Introduction to Graphs

Shapes of distributions

Central tendency and Variability

Central tendency is a measure of location, or a value around which the observations from a sample tend to cluster and which typifies their magnitude.

Dispersion is a measure of the variability among observation values. It gives an indication of the spread of the data.

The arithmetic mean is a statistical parameter used to measure the average of a group of data and is defined as the sum of all the data divided by the number of data values.


 

The sample median is the central or "middle" observation for a group of data arranged in increasing order and it is denoted by the symbol, m.
 

A resistant measure is a measurement that is not easily affected by extreme values in the data, such as the median.

The mode is a measure of central tendency and is the most frequently occurring value(s).
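The three measures of central tendency, and the idea of a resistant measure, can be sketched with Python's standard `statistics` module. The sample values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import statistics

data = [2, 3, 3, 5, 7]           # hypothetical sample
with_outlier = data + [100]      # same sample plus one extreme value

mean_value = statistics.mean(data)       # sum of the values / number of values
median_value = statistics.median(data)   # middle value of the sorted data
modes = statistics.multimode(data)       # list of the most frequent value(s)

print(mean_value, median_value, modes)   # 4 3 [3]

# The extreme value pulls the mean from 4 to 20, while the median
# moves only from 3 to 4: the median is the resistant measure.
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

`multimode` returns a list because a data set may have more than one most frequently occurring value.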

The weighted average is the sum, over all classes or categories of data, of each class midpoint multiplied by its relative frequency.
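A short sketch of this calculation in Python follows. The class midpoints and frequencies are hypothetical, and relative frequency is taken here to mean the class frequency divided by the total number of observations.

```python
# Hypothetical grouped data: (class midpoint, frequency)
classes = [(55, 2), (65, 3), (75, 4), (85, 2), (95, 1)]

total = sum(freq for _, freq in classes)   # total number of observations

# Multiply each midpoint by its relative frequency and add the products
weighted_average = sum(mid * (freq / total) for mid, freq in classes)

print(f"{weighted_average:.1f}")   # 72.5
```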

A percentage is a quantity, expressed with the unit %, that represents the proportion of one value relative to another number that is considered the whole.

Percentile is the value below which a stated percentage of the observations lie.

The percent rank is a percent number that indicates the percentage of observations that fall below a given value.
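The percent rank can be computed directly from its definition. The helper name `percent_rank` and the scores below are hypothetical, and "below" is read as strictly less than the given value; other texts treat ties differently.

```python
def percent_rank(data, value):
    """Percentage of observations that fall strictly below value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

# Hypothetical scores
scores = [52, 58, 61, 64, 67, 71, 73, 77, 79, 85]
rank = percent_rank(scores, 73)

print(rank)   # 60.0, since 6 of the 10 scores lie below 73
```

Read the other way around, 73 is the value below which 60 percent of these observations lie, so under this convention it is the 60th percentile of the sample.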

A fractile is that point below which a stated fraction (or its decimal equivalent) of the values lie.

The first quartile is the same as the 25th percentile or 0.25-fractile.

The second quartile is the 50th percentile or the 0.50-fractile.

The third quartile equals the 75th percentile or 0.75-fractile (there is no fourth quartile).
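As a hedged illustration, the three quartiles can be obtained with `statistics.quantiles` from Python's standard library. The data are hypothetical, and the function's default "exclusive" method is only one of several conventions for locating quartiles, so other software may report slightly different values for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]   # hypothetical sample, already sorted

# n=4 splits the data into four parts and returns the three cut points
q1, q2, q3 = statistics.quantiles(data, n=4)

print(q1, q2, q3)   # 2.0 4.0 6.0

# The second quartile coincides with the median (the 0.50-fractile)
print(q2 == statistics.median(data))   # True
```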

The range is the difference between the highest and lowest values in a distribution of data.

The interquartile range is the difference between the 3rd and 1st quartiles (75th and 25th percentiles) of a distribution of data.

The semi-interquartile range is half the interquartile range.
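The range, interquartile range, and semi-interquartile range can be sketched together in Python. The data are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which is an assumption of this example rather than the only accepted convention.

```python
import statistics

data = [4, 8, 15, 16, 23, 42, 50]   # hypothetical sample

data_range = max(data) - min(data)          # highest value minus lowest value

q1, _, q3 = statistics.quantiles(data, n=4)  # 1st and 3rd quartiles
interquartile_range = q3 - q1                # spread of the middle half
semi_interquartile_range = interquartile_range / 2

print(data_range)                 # 46
print(interquartile_range)        # 34.0
print(semi_interquartile_range)   # 17.0
```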

A deviation is the result of subtracting one value or number from another, for example an observation minus the mean, $x - \bar{x}$.

A sum of squares is obtained when the deviations are squared and added together.

The variance is the mean of the squared deviations from the mean, that is, the average of the sum of squares.

The population variance is denoted by the Greek symbol sigma squared, or $\sigma^2$.

The sample variance is denoted by the symbol $s^2$.
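The chain from deviations to sum of squares to variance can be traced step by step in Python. The data are hypothetical. Dividing the sum of squares by n - 1 for the sample variance is the usual convention and is an assumption here, since the definitions above do not state the divisor.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
n = len(data)

mean = sum(data) / n                           # 5.0
deviations = [x - mean for x in data]          # each value minus the mean
sum_of_squares = sum(d ** 2 for d in deviations)

population_variance = sum_of_squares / n       # divisor n
sample_variance = sum_of_squares / (n - 1)     # divisor n - 1 (assumed convention)

print(sum_of_squares)        # 32.0
print(population_variance)   # 4.0

# The standard library gives the same results
print(statistics.pvariance(data) == population_variance)            # True
print(abs(statistics.variance(data) - sample_variance) < 1e-12)     # True
```

The sample variance here is 32/7, a little larger than the population variance because of the smaller divisor.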