The R Companion to Elementary Applied Statistics covers the traditional applications taught in elementary statistics courses, along with some additional methods that address questions that may arise during or after the application of commonly used methods. The book begins with basic tasks and computations in R, then guides readers through ways to bring data into R, manipulate the data as needed, perform common statistical computations and elementary exploratory data analysis tasks, prepare customized graphics, and take advantage of R for a wide range of methods used in many elementary applications of statistics.
Features:
- Requires no prior familiarity with R or programming.
- Can be used as a resource for a project-based elementary applied statistics course, or for researchers and professionals who wish to delve more deeply into R.
- Contains an extensive array of examples that illustrate ideas on various ways to use pre-packaged routines, as well as on developing individualized code.
- Presents quite a few methods that may be considered non-traditional or advanced.
- Includes carefully documented companion script files that contain code for all examples presented, and more.
R is a powerful and free product that is gaining popularity across the scientific community in both the professional and academic arenas. Statistical methods discussed in this book are used to introduce the fundamentals of using R functions and provide ideas for developing further skills in writing R code. These ideas are illustrated through an extensive collection of examples.
About the Author:
Christopher Hay-Jahans received his Doctor of Arts in mathematics from Idaho State University in 1999. After spending three years at the University of South Dakota, he moved to Juneau, Alaska, in 2002, where he has taught a wide range of undergraduate courses at the University of Alaska Southeast.
Table of Contents:
- Preliminaries
- Bringing Data Into and Out of R
- Accessing Contents of Data Structures
- Altering and Manipulating Data
- Summaries and Statistics
- More on Computing with R
- Basic Charts for Categorical Data
- Basic Plots for Numeric Data
- Scatterplots, Lines, and Curves
- More Graphics Tools
- Tests for One and Two Proportions
- Tests for More than Two Proportions
- Tests of Variances and Spread
- Tests for One or Two Means
- Tests for More than Two Means
- Selected Tests for Medians, and More
- Dependence and Independence
First Steps
Running Code in R
Some Terminology
Hierarchy of Data Classes
Data Structures
Operators
Functions
R Packages
Probability Distributions
Coding Conventions
Some Book-keeping and Other Tips
Getting Quick Coding Help
Entering Data Through Coding
Number and Sample Generating Tricks
The R Data Editor
Reading Text Files
Reading Data from Other File Formats
Reading Data from the Keyboard
Saving and Exporting Data
Extracting Data from Vectors
Conducting Data Searches in Vectors
Working with Factors
Navigating Data Frames
Lists
Choosing an Access/Extraction Method
Additional Notes
More About the attach Function
About Functions and their Arguments
Alternative Argument Assignments in Function Calls
Altering Entries in Vectors
Transformations
Manipulating Character Strings
Sorting Vectors and Factors
Altering Data Frames
Sorting Data Frames
Moving Between Lists and Data Frames
Additional Notes on the merge Function
Univariate Frequency Distributions
Bivariate Frequency Distributions
Statistics for Univariate Samples
Measures of Central Tendency
Measures of Spread
Measures of Position
Measures of Shape
Five-Number Summaries and Outliers
Elementary Five-Number Summary
Tukey’s Five-Number Summary
The boxplot.stats Function
Computing with Numeric Vectors
Working with Lists, Data Frames and Arrays
The sapply Function
The tapply Function
The by Function
The aggregate Function
The apply Function
The sweep Function
For-loops
Conditional Statements and the switch Function
The if-then Statement
The if-then-else Statement
The switch Function
Preparing Your Own Functions
Preliminary Comments
Bar Charts
Dot Charts
Pie Charts
Exporting Graphics Images
Additional Notes
Customizing Plotting Windows
The plot.new and plot.window Functions
More on the paste Function
The title Function
More on the legend Function
More on the mtext Function
The text Function
Histograms
Boxplots
Stripcharts
QQ-Plots
Normal Probability QQ-Plots
Interpreting Normal Probability QQ-Plots
More on Reference Lines for QQ-Plots
QQ-Plots for Other Distributions
Additional Notes
More on the ifelse Function
Revisiting the axis Function
Frequency Polygons and Ogives
Scatterplots
Basic Plots
Manipulating Plotting Characters
Plotting Transformed Data
Matrix Scatterplots
The matplot Function
Graphs of Lines
Graphs of Curves
Superimposing Multiple Lines and/or Curves
Time-series Plots
Partitioning Graphics Windows
The layout Function
The split.screen Function
Customizing Plotted Text and Symbols
Inserting Mathematical Annotation in Plots
More Low-level Graphics Functions
The points and symbols Functions
The grid, segments and arrows Functions
Boxes, Rectangles and Polygons
Error Bars
Computing Bounds for Error Bars
The errorBarplot Function
Purpose and Interpretation of Error Bars
More R Graphics Resources
Relevant Probability Distributions
Binomial Distributions
Hypergeometric Distributions
Normal Distributions
Chi-square Distributions
Single Population Proportions
Estimating a Population Proportion
Hypotheses for Single Proportion Tests
A Normal Approximation Test
A Chi-square Test
An Exact Test
Which Approach Should be Used?
Two Population Proportions
Estimating Differences Between Proportions
Hypotheses for Two Proportions Tests
A Normal Approximation Test
A Chi-square Test
Fisher’s Exact Test
Which Approach Should be Used?
Additional Notes
Normal Approximations of Binomial Distributions
One- versus Two-sided Hypothesis Tests
Equality of Three or More Proportions
Pearson’s Homogeneity of Proportions Test
Marascuilo’s Large Sample Procedure
Cohen’s Small Sample Procedure
Simultaneous Pairwise Comparisons
Marascuilo’s Large Sample Procedure
Cohen’s Small Sample Procedure
Linear Contrasts of Proportions
Marascuilo’s Large Sample Approach
Cohen’s Small Sample Approach
The Chi-square Goodness-of-Fit Test
Relevant Probability Distributions
F Distributions
Using a Sample to Assess Normality
Single Population Variances
Estimating a Variance
Testing a Variance
Exactly Two Population Variances
Estimating the Ratio of Two Variances
Testing the Ratio of Two Variances
What if the Normality Assumption is Violated?
Two or More Population Variances
Assessing Spread Graphically
Levene’s Test
Levene’s Test with Trimmed Means
Brown-Forsythe Test
Fligner-Killeen Test
Student’s t-Distribution
Single Population Means
Verifying the Normality Assumption
Estimating a Mean
Testing a Mean
Can a Normal Approximation be Used Here?
Exactly Two Population Means
Verifying Assumptions
The Test for Dependent Samples
Tests for Independent Samples
Relevant Probability Distributions
Studentized Range Distribution
Dunnett’s Test Distribution
Studentized Maximum Modulus Distribution
Setting the Stage
Equality of Means — Equal Variances Case
Pairwise Comparisons — Equal Variances
Bonferroni’s Procedure
Tukey’s Procedure
t Tests and Comparisons with a Control
Dunnett’s Test and Comparisons with a Control
Which Procedure to Choose
Equality of Means — Unequal Variances Case
Large-sample Chi-square Test
Welch’s F Test
Hotelling’s T² Test
Pairwise Comparisons — Unequal Variances
Large-sample Chi-square Test
Dunnett’s C Procedure
Dunnett’s T Procedure
Comparisons with a Control
Which Procedure to Choose
The Nature of Differences Found
All Possible Pairwise Comparisons
Comparisons with a Control
Relevant Probability Distributions
Distribution of the Signed Rank Statistic
Distribution of the Rank Sum Statistic
The One-sample Sign Test
The Exact Test
The Normal Approximation
Paired Samples Sign Test
Independent Samples Median Test
Equality of Medians
Pairwise Comparisons of Medians
Single Sample Signed Rank Test
The Exact Test
The Normal Approximation
Paired Samples Signed Rank Test
Rank Sum Test of Medians
The Exact Mann-Whitney Test
The Normal Approximation
The Wilcoxon Rank Sum Test
Using the Kruskal-Wallis Test to Test Medians
Working with Ordinal Data
Paired Samples
Independent Samples
More than Two Independent Samples
Some Comments on the Use of Ordinal Data
Assessing Bivariate Normality
Pearson’s Correlation Coefficient
An Interval Estimate of ρ
Testing the Significance of ρ
Testing a Null Hypothesis with ρ0 ≠ 0
Kendall’s Correlation Coefficient
An Interval Estimate of τ
Exact Test of the Significance of τ
Approximate Test of the Significance of τ
Spearman’s Rank Correlation Coefficient
Exact Test of the Significance of ρS
Approximate Test of the Significance of ρS
Correlations in General — Comments and Cautions
Chi-square Test of Independence
For the Curious — Distributions of rK and rS
"This book is written by a Professor of Mathematics with much experience in teaching statistics applied to the natural sciences. As mentioned in the Preface, the book addresses students (and teachers) of elementary statistics courses. Only basic preliminary statistical knowledge is necessary to start using the book, it is perfect for anyone jumping in to R, and it could readily serve as a reference manual rather than to be read from beginning to end... Several simple applied examples with detailed explanations are presented (coded in R) in order to make the methods more deeply understandable, and in some cases to compare different types of application (e.g. when different assumptions are filled, different research questions are of interest, or different types of data are recorded). All the richly-commented script files used in the book are available on the publisher’s website... At the end of the book, a highly informative Index aids quick searches. Nevertheless, the book can be ordered as an e-book as well... This second book of Professor Hay-Jahans, particularly together with the first one, is appropriate for undergraduate students as an introductory book on statistics using R, but it could successfully be used also by PhD students, researchers, and teachers requiring a consistent and thorough reference."
- Márta Ladányi, ISCB December 2019