Terms in this set (40)
Individuals
the objects described by a set of data; may be people, animals, or things
Variable
any characteristic of an individual
Categorical Data
individual observations are qualitative.
categorical variable
places an individual into one of several groups or categories
quantitative variable
takes numerical values for which it makes sense to find an average
distribution
shows the values of a variable and how often these values occur
pie chart
Shows the distribution of a categorical variable as a "pie" whose slices are sized by the counts or percents for the categories. A pie chart must include all the categories that make up a whole.
A ______ of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table (outside: the row and column totals)
Marginal distribution
A _____ of a variable describes the values of that variable among individuals who have a specific value of another variable. There is a separate conditional distribution for each value of the other variable (inside)
Conditional distribution
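As a rough illustration of the two cards above, here is a minimal Python sketch. The counts and the names `table`, `marginal_rows`, and `conditional_given_row` are invented for this example and are not part of the original set.

```python
# Hypothetical two-way table of counts.
# Outer keys: values of the row variable; inner keys: values of the column variable.
table = {
    "female": {"yes": 30, "no": 20},
    "male": {"yes": 10, "no": 40},
}

def marginal_rows(table):
    """Marginal distribution of the row variable: each row total over the grand total."""
    grand_total = sum(sum(cols.values()) for cols in table.values())
    return {row: sum(cols.values()) / grand_total for row, cols in table.items()}

def conditional_given_row(table, row):
    """Conditional distribution of the column variable among individuals in one row."""
    row_total = sum(table[row].values())
    return {col: count / row_total for col, count in table[row].items()}

print(marginal_rows(table))
print(conditional_given_row(table, "female"))
print(conditional_given_row(table, "male"))
```

In this made-up table the conditional distributions differ between the two rows (about 60% "yes" versus about 20% "yes"), which is what an association between the two variables looks like.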
Association
knowing the value of one variable helps predict the value of the other
Two-way table of counts
organizes the data about two categorical variables measured for the same set of individuals
Frequency table
displays the count of observations in each category or class
Relative frequency table
Shows the percents of observations in each category or class.
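A frequency table and a relative frequency table can be built from the same raw observations. The sketch below assumes a small invented list of categories; the function names are hypothetical.

```python
from collections import Counter

def frequency_table(observations):
    """Count of observations in each category."""
    return dict(Counter(observations))

def relative_frequency_table(observations):
    """Percent of observations in each category."""
    counts = Counter(observations)
    n = len(observations)
    return {category: 100 * count / n for category, count in counts.items()}

# Invented data: 5 red, 3 blue, 2 green
colors = ["red"] * 5 + ["blue"] * 3 + ["green"] * 2

print(frequency_table(colors))           # {'red': 5, 'blue': 3, 'green': 2}
print(relative_frequency_table(colors))  # {'red': 50.0, 'blue': 30.0, 'green': 20.0}
```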
Dot plot
a graph of numerical data in which each observation is represented by a dot on or above a horizontal measurement scale
Symmetry
the right and left sides of a graph are approximately mirror images of each other
Skewed to the right
the right side of the graph (the larger values) has a much longer tail than the left side
Skewed to the left
the left side of the graph (the smaller values) has a much longer tail than the right side
Stemplot
A simple graphical display for fairly small data sets that splits each observation into a stem (all but the final digit) and a leaf (the final digit)
Histogram
Displays the distribution of a quantitative variable using bars whose heights show the count or percent of observations in each class.
Unimodal
quantitative data with only one peak
Bimodal
quantitative data with two clear peaks
SOCS
Shape (peaks, clusters, gaps), Outliers, Center (mean, median, or mode), Spread (range, IQR, or standard deviation); describes the pattern of the distribution of a quantitative variable.
Outliers
observations that lie outside the overall pattern of a distribution
Range
the value of the maximum minus the minimum
Mean
the arithmetic average
Median
the value of the midpoint of a distribution
First quartile
when the observations in a data set are ordered from lowest to highest, the median value of the lower half of the data
Third quartile
when the observations in a data set are ordered from lowest to highest, the median value of the upper half of the data
Interquartile range (IQR)
the difference in the values of Q3 and Q1
Outlier
an observation that falls more than 1.5 times the IQR above the 3rd quartile or below the 1st quartile.
Five-number summary
consists of the minimum, Q1, median, Q3, and maximum, listed in order from smallest to largest.
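The quartile, IQR, outlier, and five-number-summary cards can be tied together in a short Python sketch. It assumes one common convention: when the number of observations is odd, the overall median is left out of both halves. Other textbooks and software use slightly different quartile rules. The function names and data are invented for illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum); needs at least 2 observations."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                               # lower half of the ordered data
    upper = xs[half + 1:] if n % 2 else xs[half:]   # skip the median when n is odd
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def outliers_by_iqr_rule(data):
    """Observations more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

values = [1, 3, 4, 5, 6, 7, 8, 9, 30]
print(five_number_summary(values))   # (1, 3.5, 6, 8.5, 30)
print(outliers_by_iqr_rule(values))  # [30]
```

Here the IQR is 8.5 - 3.5 = 5, so the fences sit at -4 and 16, and only 30 falls outside them.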
Boxplot
A graph of the five-number summary
Standard deviation
measures the typical distance of the observations from their mean; calculated as the square root of the variance
Variance
the average squared distance of the observations in a data set from their mean
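A small worked sketch of variance and standard deviation follows. It assumes the usual sample convention of dividing the sum of squared deviations by n - 1 rather than n, which is why the "average" here is not a plain mean. The function names and data are invented.

```python
from math import sqrt
from statistics import mean

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed convention)."""
    m = mean(data)
    return sum((x - m) ** 2 for x in data) / (len(data) - 1)

def sample_sd(data):
    """Standard deviation: the square root of the variance."""
    return sqrt(sample_variance(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5; squared deviations sum to 32
print(sample_variance(data))      # 32 / 7, about 4.57
print(sample_sd(data))            # about 2.14
```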
'p'th percentile
value such that 'p' percent of the observations in the data set fall at or below that value
z-score
tells how many standard deviations a value lies from the mean, and in which direction: z = (value - mean) / standard deviation
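Percentiles and z-scores both describe where one value sits within a distribution. The sketch below uses an invented list of scores and hypothetical function names; it computes the standard deviation with Python's `statistics.stdev`, which divides by n - 1.

```python
from statistics import mean, stdev

def percentile_of(value, data):
    """Percent of observations that fall at or below the given value."""
    return 100 * sum(1 for x in data if x <= value) / len(data)

def z_score(value, data):
    """Number of standard deviations the value lies above (+) or below (-) the mean."""
    return (value - mean(data)) / stdev(data)

scores = [60, 70, 80, 90, 100]
print(percentile_of(90, scores))  # 80.0, since 4 of the 5 scores are at or below 90
print(z_score(80, scores))        # 0.0, since 80 is the mean
print(z_score(100, scores))       # positive: 100 is above the mean
```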
transform data
changes the units of measurement, for example converting temperatures from Celsius to Fahrenheit
density curve
a curve that is always on or above the horizontal axis, and has an area of exactly 1 underneath it
normal distribution
a distribution described by a Normal density curve (symmetric, single-peaked, and bell-shaped) and completely specified by two numbers: its mean and its standard deviation
Empirical rule
in a Normal distribution, approximately 68%, 95%, and 99.7% of the observations fall within 1, 2, and 3 standard deviations of the mean, respectively.
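The 68-95-99.7 figures can be checked numerically with `NormalDist` from Python's standard library (available in Python 3.8 and later). The helper name `proportion_within` and the example mean and standard deviation are invented for this sketch.

```python
from statistics import NormalDist

def proportion_within(k, mu=0.0, sigma=1.0):
    """Proportion of a Normal(mu, sigma) distribution within k standard deviations of mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545, 0.9973 for any mean and standard deviation
    print(k, round(proportion_within(k, mu=100, sigma=15), 4))
```

The proportions do not depend on the particular mean or standard deviation, which is why the rule applies to every Normal distribution.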
