Essentials of Data and Multiple Regression Analysis

Glauco Peres da Silva, University of São Paulo


This course is designed for students who are interested in reviewing their training in data analysis and multiple regression analysis. It prepares students for courses offered in the IPSA-USP Summer School that require a background in statistics and in multiple regression analysis, including the Time Series Analysis and Pooled Time Series Analyses, and Spatial Econometrics courses. The course will take place in the week preceding the start of the Summer School. This intensive course begins with a discussion of the logic of data analysis, including basic probability, random variables and their distributions, confidence intervals, and hypothesis tests. It then covers the basic assumptions of the multiple regression model and the central assumptions underlying the ordinary least squares approach. Like other IPSA-USP courses, Essentials of Data and Multiple Regression Analysis takes a “hands-on” approach. To complement the lectures, students apply the concepts taught to analyze problems using software packages commonly used in quantitative social science research, including Excel and Stata.



This course runs January 14-18, 2019.

TEACHING FELLOW:  Mauricio Izumi, University of São Paulo


This course starts from the premise that the most effective way to learn multivariate statistics is by actively using the concepts discussed in class to solve problems. For each topic, lectures will be followed by sessions in which students use empirical data to answer questions that are important to political scientists. For students who will be studying multivariate regression analysis in the IPSA-USP Summer School, the course provides an intuitive, basic review of linear regression in theory and practice.




Monday, January 14th

Lecture 1. Probability

Lecture 2. Distribution of random variables

Tuesday, January 15th

Lecture 3. Joint distributions

Lecture 4. Confidence Intervals

Wednesday, January 16th

Lecture 5. An Introduction to the Multiple Regression Model

Lecture 6. The Linear Regression Model with a Single Regressor

Thursday, January 17th

Lecture 7. Hypothesis Tests and Confidence Intervals

Lecture 8. Assumptions of Ordinary Least Squares

Friday, January 18th

Lecture 9. The Linear Regression Model with Multiple Regressors

Lecture 10. Assessing Goodness of Fit

In the afternoon, classes will focus on lab activities.


The course presumes that students have basic training in mathematics, including arithmetic and algebraic operations.


Casella, George, and Roger Berger. 2008. Statistical Inference. 2nd ed. Duxbury Advanced Series. Cengage Learning.

Kellstedt, Paul M., and Guy D. Whitten. 2013. The Fundamentals of Political Science Research. 2nd ed. Cambridge; New York: Cambridge University Press.

Stock, James H., and Mark W. Watson. 2011. Introduction to Econometrics. 3rd ed. Boston: Pearson/Addison Wesley.


Gelman, Andrew, and Hal Stern. 2006. The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant. The American Statistician 60 (4): 328-331.

Gujarati, Damodar N., and Dawn C. Porter. 2009. Basic Econometrics. 5th ed. Boston: McGraw-Hill Irwin.

Greene, W. H. 2012. Econometric Analysis. 7th ed. Upper Saddle River, NJ: Pearson Prentice Hall.

Wooldridge, Jeffrey M. 2009. Introductory Econometrics: A Modern Approach. Cincinnati, OH: South-Western College.