Basic Statistics: An Introduction with R by Tenko Raykov and George A. Marcoulides, ISBN-13: 978-1442218475
[PDF eBook eTextbook]
- Publisher: Rowman & Littlefield Publishers (October 4, 2012)
- Language: English
- 344 pages
- ISBN-10: 1442218479
- ISBN-13: 978-1442218475
Basic Statistics provides an accessible and comprehensive introduction to statistics using R, a free, powerful, state-of-the-art software program. The book is designed both to introduce students to key concepts in statistics and to provide simple instructions for using R.
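As a glimpse of the kind of R workflow the book teaches (descriptive statistics, chapters 2-3), here is a minimal sketch using made-up scores; it is not an excerpt from the text:

```r
# Hypothetical sample of six test scores (illustrative data, not from the book)
scores <- c(12, 15, 9, 20, 15, 11)

mean(scores)      # measure of central tendency (section 3.1.3)
median(scores)    # the median (section 3.1.2)
sd(scores)        # a measure of variability (section 3.2)
quantile(scores)  # quartiles, the building blocks of a boxplot (section 3.3)
```

Each call prints its result at the R console, mirroring the interactive style the book uses throughout.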
Table of Contents:
Preface xi
1 Statistics and Data 1
1.1. Statistics as a science 1
1.2. Collecting data 4
1.3. Why study statistics? 5
2 An Introduction to Descriptive Statistics: Data Description and Graphical Representation 7
2.1. Descriptive statistics 7
2.2. Graphical means of data description for a single variable 8
2.2.1. Reading data into R 9
2.2.2. Graphical representation of data 11
2.2.2.1. Pie charts and barplots 13
2.2.2.2. Graphical representation of quantitative variables 18
3 Data Description: Measures of Central Tendency and Variability 23
3.1. Measures of central tendency 23
3.1.1. The mode 23
3.1.2. The median 25
3.1.3. The mean 27
3.2. Measures of variability 30
3.3. The boxplot 36
3.3.1. Quartiles 36
3.3.2. Definition of the boxplot and its empirical construction 38
3.3.3. Boxplots and comparison of groups of scores 40
4 Probability 43
4.1. The importance of probability 43
4.2. Definition of probability 43
4.2.1. Classical definition 44
4.2.2. Relative frequency definition 44
4.2.3. Subjective definition 45
4.3. Evaluation of event probability 45
4.4. Basic relations between events and their probabilities 48
4.5. Conditional probability and independence 50
4.5.1. Defining conditional probability 50
4.5.2. Event independence 52
4.6. Bayes’ theorem 52
5 Probability Distributions of Random Variables 55
5.1. Random variables 55
5.2. Probability distributions for discrete random variables 56
5.2.1. A start-up example 56
5.2.2. The binomial distribution 58
5.2.3. The Poisson distribution 63
5.3. Probability distributions for continuous random variables 65
5.3.1. The normal distribution 67
5.3.1.1. Probability density function 68
5.3.1.2. Graphing a normal distribution 68
5.3.1.3. Mean and variance of a normal distribution 70
5.3.1.4. The standard normal distribution 71
5.3.2. z-scores 72
5.3.3. How to obtain z-scores in empirical research 74
5.4. The normal distribution and areas under the normal density curve 75
5.5. Percentiles of the normal distribution 78
5.6. Using published statistical tables to work out normal probabilities 79
6 Random Sampling Distributions and the Central Limit Theorem 81
6.1. Random sampling distribution 81
6.1.1. Random sample 81
6.1.2. Sampling distribution 83
6.2. The random sampling distribution of the mean 85
6.2.1. Mean and variance of the random sampling distribution of the mean 86
6.2.2. Standard error of the mean 87
6.3. The central limit theorem 88
6.3.1. The central limit theorem as a large-sample statement 89
6.3.2. When is normality obtained for a finite sample? 89
6.3.3. How large a sample size is sufficient for the central limit theorem to be valid? 90
6.3.4. The central limit theorem for sums of random variables 91
6.3.5. Revisiting the random sampling distribution concept 92
6.3.6. An application of the central limit theorem 93
6.4. Assessing the normality assumption for a population distribution 94
7 Inferences about Single Population Means 99
7.1. Population parameters 99
7.2. Parameter estimation and hypothesis testing 100
7.3. Point and interval estimation of the mean 100
7.3.1. Point estimation 100
7.3.2. Interval estimation 101
7.3.3. Standard normal distribution quantiles for use in confidence intervals 105
7.3.4. How good is an estimate, and what affects the width of a confidence interval? 107
7.4. Choosing sample size for estimating the mean 109
7.5. Testing hypotheses about population means 111
7.5.1. Statistical testing, hypotheses, and test statistics 111
7.5.2. Rejection regions 113
7.5.3. The “assumption” of statistical hypothesis testing 113
7.5.4. A general form of a z-test 116
7.5.5. Significance level 117
7.6. Two types of possible error in statistical hypothesis testing 118
7.6.1. Type I and Type II errors 120
7.6.2. Statistical power 120
7.6.3. Type I error and significance level 121
7.6.4. Have we proved the null or alternative hypothesis? 121
7.6.5. One-tailed tests 123
7.6.5.1. Alternative hypothesis of mean larger than a prespecified number 124
7.6.5.2. Alternative hypothesis of mean smaller than a prespecified number 125
7.6.5.3. Advantages and drawbacks of one-tailed tests 127
7.6.5.4. Extensions to one-tailed null hypotheses 127
7.6.5.5. One- and two-tailed tests at other significance levels 129
7.7. The concept of p-value 130
7.8. Hypothesis testing using confidence intervals 135
8 Inferences about Population Means When Variances Are Unknown 137
8.1. The t-ratio and t-distribution 137
8.1.1. Degrees of freedom 138
8.1.2. Properties of the t-distribution 139
8.2. Hypothesis testing about the mean with unknown standard deviation 143
8.2.1. Percentiles of the t-distribution 143
8.2.2. Confidence interval and testing hypotheses about a given population mean 144
8.2.3. One-tailed t-tests 147
8.2.4. Inference for a single mean at another significance level 147
8.3. Inferences about differences of two independent means 149
8.3.1. Point and interval estimation of the difference in two independent population means 150
8.3.2. Hypothesis testing about the difference in two independent population means 152
8.3.3. The case of unequal variances 156
8.4. Inferences about mean differences for related samples 157
8.4.1. The sampling distribution of the mean difference for two related samples 158
9 Inferences about Population Variances 161
9.1. Estimation and testing of hypotheses about a single population variance 161
9.1.1. Variance estimation 161
9.1.2. The random sampling distribution of the sample variance 162
9.1.3. Percentiles of the chi-square distribution 164
9.1.4. Confidence interval for the population variance 165
9.1.5. Testing hypotheses about a single variance 166
9.2. Inferences about two independent population variances 167
9.2.1. The F-distribution 167
9.2.2. Percentiles of the F-distribution 169
9.2.3. Confidence interval for the ratio of two independent population variances 170
10 Analysis of Categorical Data 173
10.1. Inferences about a population probability (proportion) 173
10.2. Inferences about the difference between two population probabilities (proportions) 176
10.3. Inferences about several proportions 178
10.3.1. The multinomial distribution 178
10.3.2. Testing hypotheses about multinomial probabilities 179
10.4. Testing categorical variable independence in contingency tables 182
10.4.1. Contingency tables 183
10.4.2. Joint and marginal distributions 183
10.4.3. Testing variable independence 184
11 Correlation 189
11.1. Relationship between a pair of random variables 189
11.2. Graphical trend of variable association 190
11.3. The covariance coefficient 193
11.4. The correlation coefficient 196
11.5. Linear transformation invariance of the correlation coefficient 201
11.6. Is there a discernible linear relationship pattern between two variables in a studied population? 202
11.7. Cautions when interpreting a correlation coefficient 205
12 Simple Linear Regression 209
12.1. Dependent and independent variables 209
12.2. Intercept and slope 210
12.3. Estimation of model parameters (model fitting) 213
12.4. How good is the simple regression model? 216
12.4.1. Model residuals and the standard error of estimate 216
12.4.2. The coefficient of determination 217
12.5. Inferences about model parameters and the coefficient of determination 221
12.6. Evaluation of model assumptions, and modifications 223
12.6.1. Assessing linear regression model assumptions via residual plots 223
12.6.2. Model modification suggested by residual plots 228
13 Multiple Regression Analysis 239
13.1. Multiple regression model, multiple correlation, and coefficient of determination 239
13.2. Inferences about parameters and model explanatory power 244
13.2.1. A test of significance for the coefficient of determination 244
13.2.2. Testing single regression coefficients for significance 245
13.2.3. Confidence interval for a regression coefficient 247
13.3. Adjusted R² and shrinkage 248
13.4. The multiple F-test and evaluation of change in proportion of explained variance following dropping or addition of predictors 250
13.5. Strategies for predictor selection 257
13.5.1. Forward selection 257
13.5.2. Backward elimination 258
13.5.3. Stepwise selection (stepwise regression) 262
13.6. Analysis of residuals for multiple regression models 264
14 Analysis of Variance and Covariance 269
14.1. Hypotheses and factors 269
14.2. Testing equality of population means 274
14.3. Follow-up analyses 281
14.4. Two-way and higher-order analysis of variance 285
14.5. Relationship between analysis of variance and regression analysis 291
14.6. Analysis of covariance 297
15 Modeling Discrete Response Variables 303
15.1. Revisiting regression analysis and the general linear model 303
15.2. The idea and elements of the generalized linear model 305
15.3. Logistic regression as a generalized linear model of particular relevance in social and behavioral research 307
15.3.1. A “continuous counterpart” of regression analysis 307
15.3.2. Logistic regression and a generalized linear model with a binary response 308
15.3.3. Further generalized linear models 309
15.4. Fitting logistic regression models using R 310
Epilogue 321
References 323
Index 325
About the Authors 331
Tenko Raykov is professor of measurement and quantitative methods at Michigan State University.
George A. Marcoulides is professor of research methods and statistics at the University of California, Riverside.
What makes us different?
• Instant Download
• Always Competitive Pricing
• 100% Privacy
• FREE Sample Available
• 24/7 Live Customer Support