Frost, J. Regression Analysis: An Intuitive Guide. 2019.
Textbook in PDF format.

Contents:

My Approach to Teaching Regression and Statistics

Correlation and an Introduction to Regression
- Graph Your Data to Find Correlations
- Interpret the Pearson's Correlation Coefficient
- Examples of Positive and Negative Correlations
- Graphs for Different Correlations
- Discussion about the Correlation Scatterplots
- Pearson's Correlation Measures Linear Relationships
- Hypothesis Test for Correlations
- Interpreting our Height and Weight Example
- Correlation Does Not Imply Causation
- How Strong of a Correlation is Considered Good?
- Common Themes with Regression
- Regression Takes Correlation to the Next Level
- Fundamental Terms and Goals of Regression
- Dependent Variables
- Independent Variables
- Simple versus Multiple Regression
- Goals of Regression Analysis
- Example of a Regression Analysis
- Regression Analyzes a Wide Variety of Relationships
- Using Regression to Control Independent Variables
- What does controlling for a variable mean?
- How do you control the other variables in regression?
- An Introduction to Regression Output
- Review and Next Steps

Regression Basics and How it Works
- Data Considerations for OLS
- How OLS Fits the Best Line
- Observed and Fitted Values
- Residuals: Difference between Observed and Fitted Values
- Using the Sum of the Squared Errors (SSE) to Find the Best Line
- Implications of Minimizing SSE
- Other Types of Sums of Squares
- Displaying a Regression Model on a Fitted Line Plot
- Importance of Staying Close to Your Data
- Review and Next Steps

Interpreting Main Effects and Significance
- Regression Notation
- Fitting Models is an Iterative Process
- Three Types of Effects in Regression Models
- Main Effects of Continuous Variables
- Graphical Representation of Regression Coefficients
- Confidence Intervals for Regression Parameters
- Example Regression Model with Two Linear Main Effects
- Interpreting P-Values for Continuous Independent Variables
- Recoding Continuous Independent Variables
- Standardizing the Continuous Variables
- Interpreting Standardized Coefficients
- Why Obtain Standardized Coefficients?
- Centering Your Continuous Variables
- Main Effects of Categorical Variables
- Coding Categorical Variables
- Interpreting the Results for Categorical Variables
- Example of a Model with a Categorical Variable
- Controlling for other Variables
- Blurring the Continuous and Categorical Line
- The Case for Including It as a Continuous Variable
- The Case for Including It as a Categorical Variable
- Constant (Y Intercept)
- The Definition of the Constant is Correct but Misleading
- The Y-Intercept Might Be Outside of the Observed Data
- The Constant Absorbs the Bias for the Regression Model
- Generally, It Is Essential to Include the Constant in a Regression Model
- Interpreting the Constant When You Center All the Continuous Independent Variables
- Review and Next Steps

Fitting Curvature
- Example Curvature
- Graphing the Data for Regression with Polynomial Terms
- Graph Curvature with Main Effects Plots
- Why You Need to Fit Curves in a Regression Model
- Difference between Linear and Nonlinear Models
- Linear Regression Equations
- Nonlinear Regression Equations
- Finding the Best Way to Model Curvature
- Curve Fitting using Polynomial Terms in Linear Regression
- Curve Fitting using Reciprocal Terms in Linear Regression
- Curve Fitting with Log Functions in Linear Regression
- Curve Fitting with Nonlinear Regression
- Comparing the Curve-Fitting Effectiveness of the Different Models
- Closing Thoughts
- Another Curve Fitting Example
- Linear model
- Example of a nonlinear regression model
- Comparing the Regression Models and Making a Choice
- Review and Next Steps

Interaction Effects
- Example with Categorical Independent Variables
- How to Interpret Interaction Effects
- Overlooking Interaction Effects is Dangerous!
- Example with Continuous Independent Variables
- Important Considerations for Interaction Effects
- Common Questions about Interaction Effects
- Interaction effects versus correlation between independent variables
- Combinations of significant and insignificant main effects and interaction effects
- When an interaction effect is significant but an underlying main effect is not significant, do you remove the main effect from the model?
- The coefficient sign for an interaction term isn't what I expected
- Different statistical software packages estimate different interaction effects for the same dataset
- The lines in my interaction plot don't cross even though the interaction effect is statistically significant
- The lines in my interaction plot appear to have different slopes, but the interaction term is not significant
- Review and Next Steps

Goodness-of-Fit
- Assessing the Goodness-of-Fit
- R-squared
- Visual Representation of R-squared
- R-squared has Limitations
- Are Low R-squared Values Always a Problem?
- Are High R-squared Values Always Great?
- R-squared Is Not Always Straightforward
- Adjusted R-Squared and Predicted R-Squared
- Some Problems with R-squared
- What Is Adjusted R-squared?
- What Is the Predicted R-squared?
- Example of an Overfit Model and Predicted R-squared
- A Caution about Chasing a High R-squared
- Standard Error of the Regression vs. R-squared
- Standard Error of the Regression and R-squared in Practice
- Example Regression Model: BMI and Body Fat Percentage
- I Often Prefer the Standard Error of the Regression
- The F-test of Overall Significance
- Additional Ways to Interpret the F-test of Overall Significance
- Review and Next Steps

Specify Your Model
- The Importance of Graphing Your Data
- Statistical Methods for Model Specification
- Adjusted R-squared and Predicted R-squared
- Mallows' Cp
- P-values for the independent variables
- Stepwise regression and Best subsets regression
- Real World Complications
- Practical Recommendations
- Theory
- Simplicity
- Residual Plots
- Omitted Variable Bias
- What Are the Effects of Omitted Variable Bias?
- Synonyms for Confounding Variables and Omitted Variable Bias
- What Conditions Cause Omitted Variable Bias?
- Practical Example of How Confounding Variables Can Produce Bias
- How the Omitted Confounding Variable Hid the Relationship
- Correlations, Residuals, and OLS Assumptions
- Predicting the Direction of Omitted Variable Bias
- How to Detect Omitted Variable Bias and Identify Confounding Variables
- Obstacles to Correcting Omitted Variable Bias
- Recommendations for Addressing Confounding Variables and Omitted Variable Bias
- What to Do When Including Confounding Variables is Impossible
- Automated Variable Selection Procedures
- How Stepwise Regression Works
- How Best Subsets Regression Works
- Comparing Stepwise to Best Subsets Regression
- Using Stepwise and Best Subsets on the Same Dataset
- Example of Stepwise Regression
- Example of Best Subsets Regression
- Using Best Subsets Regression in conjunction with Our Requirements
- Assess Your Candidate Regression Models Thoroughly
- Stepwise versus Best Subsets
- How Accurate is Stepwise Regression?
- When stepwise regression is most accurate
- The role of the number of candidate variables and authentic variables in stepwise regression accuracy
- The role of multicollinearity in stepwise regression accuracy
- The role of sample size in stepwise regression accuracy
- Closing Thoughts on Choosing the Correct Model
- Review and Next Steps

Problematic Methods of Specifying Your Model
- Using Data Dredging and Significance
- Regression Example that Illustrates the Problems of Data Mining
- Using Stepwise Regression on Random Data
- Lessons Learned from the Data Mining Example
- How Data Mining Causes these Problems
- Let Theory Guide You and Avoid Data Mining
- Overfitting Regression Models
- Graphical Illustration of Overfitting Regression Models
- How Overfitting a Model Causes these Problems
- Applying These Concepts to Overfitting Regression Models
- How to Detect Overfit Models
- How to Avoid Overfitting Models
- Review and Next Steps

Checking Assumptions and Fixing Problems
- Check Your Residual Plots!
- Deterministic Component
- Stochastic Error
- How to Check Residual Plots
- How to Fix Problematic Residual Plots
- Residual Plots are Easy!
- The Seven Classical OLS Assumptions
- OLS Assumption 1: The correctly specified regression model is linear in the coefficients and the error term
- OLS Assumption 2: The error term has a population mean of zero
- OLS Assumption 3: All independent variables are uncorrelated with the error term
- OLS Assumption 4: Observations of the error term are uncorrelated with each other
- OLS Assumption 5: The error term has a constant variance (no heteroscedasticity)
- OLS Assumption 6: No independent variable is a perfect linear function of other explanatory variables
- OLS Assumption 7: The error term is normally distributed (optional)
- Why You Should Care About the Classical OLS Assumptions
- Next Steps
- Heteroscedasticity
- How to Identify Heteroscedasticity with Residual Plots
- What Causes Heteroscedasticity?
- Heteroscedasticity in cross-sectional studies
- Heteroscedasticity in time-series models
- Example of heteroscedasticity
- Pure versus impure heteroscedasticity
- What Problems Does Heteroscedasticity Cause?
- How to Fix Heteroscedasticity
- Redefining the variables
- Weighted least squares regression
- Transform the dependent variable
- Multicollinearity
- Why is Multicollinearity a Potential Problem?
- What Problems Does Multicollinearity Cause?
- Do I Have to Fix Multicollinearity?
- Testing for Multicollinearity with Variance Inflation Factors (VIFs)
- Multicollinearity Example: Predicting Bone Density in the Femur
- Center the Independent Variables to Reduce Structural Multicollinearity
- Regression with Centered Variables
- Comparing Regression Models to Reveal Multicollinearity Effects
- How to Deal with Multicollinearity
- Next Steps
- Unusual Observations
- Observations in Regression
- Unusual Observations
- Outliers (Unusual Y-values)
- High Leverage Observations (Unusual X-values)
- Influential Points
- Managing Unusual Observations and Influential Points
- Next Steps
- Using Data Transformations to Fix Problems
- Determining which Variables to Transform
- Determining which Transformation to Use
- Box-Cox Transformation
- Johnson Transformation
- How to Interpret the Results for Transformed Data
- Use data transformation as a last resort!
- Cheat Sheet for Detecting and Solving Problems

Using Regression to Make Predictions
- Explanatory versus Predictive Models
- The Regression Approach for Predictions
- Example Scenario for Regression Predictions
- Finding a Good Regression Model for Predictions
- Assess the Residual Plots
- Interpret the Regression Output
- Other Considerations for Valid Predictions
- Using our Regression Model to Make Predictions
- Interpreting the Regression Prediction Results
- Next Steps: Don't Focus On Only the Fitted Values
- The Illusion of Predictability
- Studying How Experts Perceive Prediction Uncertainty
- Use a Regression Model to Make a Decision
- The Difference between Perception and Reality
- Low R-squared Values Should Have Warned of Low Precision
- Graph the Model to Highlight the Variability
- Graphs Are One Way to Pierce the Illusion of Predictability
- Display Prediction Intervals on Fitted Line Plots to Assess Precision
- Different Example of Using Prediction Intervals

Tips, Common Questions, and Concerns
- Five Tips to Avoid Common Problems
- Tip 1: Conduct A Lot of Research Before Starting
- Tip 2: Use a Simple Model When Possible
- Tip 3: Correlation Does Not Imply Causation... Even in Regression
- Tip 4: Include Graphs, Confidence, and Prediction Intervals in the Results
- Tip 5: Check Your Residual Plots!
- Differences Between a Top Analyst and a Less Rigorous Analyst
- Identifying the Most Important Variables
- Do Not Associate Regular Regression Coefficients with the Importance of Independent Variables
- Do Not Link P-values to Importance
- Do Assess These Statistics to Identify Variables that might be Important
- Standardized coefficients
- Change in R-squared for the last variable added to the model
- Example of Identifying the Most Important Independent Variables in a Regression Model
- Cautions for Using Statistics to Pinpoint Important Variables
- Non-Statistical Issues that Help Find Important Variables
- Comparing Regression Lines with Hypothesis Tests
- Hypothesis Tests for Comparing Regression Constants
- Interpreting the Results
- Hypothesis Tests for Comparing Regression Coefficients
- Interpreting the Results
- How High Does R-squared Need to Be?
- "How High Does R-squared Need to Be?" is the Wrong Question
- Define Your Objectives for the Regression Model
- R-squared and Understanding the Relationships between the Variables
- R-squared and Predicting the Dependent Variable
- Using Prediction Intervals to Assess Precision
- R-squared Is Overrated!
- Five Reasons Why R-squared can be Too High
- High R-squared Values can be a Problem
- Reason 1: R-squared is a biased estimate
- Reason 2: Overfitting your model
- Reason 3: Data mining and chance correlations
- Reason 4: Trends in Panel (Time Series) Data
- Reason 5: Form of a Variable
- Interpreting Models that have Significant Variables but a Low R-squared
- Comparing Regression Models with Low and High R-squared Values
- Regression Model Similarities
- Regression Model Differences
- Using Prediction Intervals to See the Variability
- Key Points about Low R-squared Values

Choosing the Correct Type of Regression
- Continuous Dependent Variables
- Linear regression
- Advanced types of linear regression
- Nonlinear regression
- Categorical Dependent Variables
- Binary Logistic Regression
- Ordinal Logistic Regression
- Nominal Logistic Regression
- Count Dependent Variables
- Poisson regression
- Alternatives to Poisson regression for count data

Examples of Other Types of Regression
- Using Log-Log Plots to Determine Whether Size Matters
- Does the Mass of Mammals Affect Their Metabolism?
- Example: Log-Log Plot of Mammal Mass and Basal Metabolic Rate
- Example: Log-Log Plot of Basal Metabolic Rate and Longevity
- Binary Logistic Regression: Statistical Analysis of the Republican Establishment Split
- How Does the Freedom Caucus Fit In?
- Data for these Analyses
- Graphing the House Republican Data
- Binary Logistic Regression Model of Freedom Caucus Membership
- Graphing the Results

References

About the Author