Advanced Time Series Cross-Section Analyses

Andrew Philips, University of Colorado at Boulder, and Lorena Barberia, University of São Paulo


Data collected over both units (e.g., municipalities, states, countries) and time (e.g., days, months, years), known as time series cross-sectional data, are common in social science. By gaining leverage both across units and over time, this data structure helps us answer important questions that would be difficult to address with only a single year (a cross section) or a single country (a time series), such as the relationship between growth and democracy, whether the resource curse exists, and how institutions shape political and economic outcomes. However, pooled time series often exhibit types of heterogeneity that make standard regression approaches inappropriate. This week, building on Module I's Essentials of Time Series for Time Series Cross-Section Analyses (TSCS) and Module II's Fundamentals of Time Series Cross-Section Analyses, we cover several advanced topics regarding these data. These include establishing identification, model selection testing procedures, and more advanced estimation methods, such as GMM and SUR models.
On each of the first four days, the course will involve about three hours of lecture time with breaks, followed by lunch and then three to four hours of hands-on instruction in analysis, conducted in smaller groups using Stata. On the fifth day, students will work on a specific project assignment that applies the concepts introduced in the course.


This course runs January 27 - 31, 2020.



Topic 1: Modeling Heterogeneity: Slopes

For the first topic, we will focus on testing various pooling assumptions about our coefficients of interest. We will introduce SUR (Seemingly Unrelated Regressions) models, which have two or more equations (one for each cross-sectional unit) whose errors are correlated. This modeling strategy is appropriate for testing the pooling assumptions that we make in models of TSCS data, but it does not work well for models that include variables with little or no within-unit variation. We will also discuss models that incorporate random slopes, another way to relax the assumption that an effect is constant across units.

Topic 2: The Mundlak Transformation and Missing Data

For the second topic, we will discuss the Mundlak transformation, another modeling approach for exploring heterogeneity in effects both between units and within units. We will also discuss how to impute missing data, a common issue when working with TSCS data.

Topic 3: Models for Dichotomous Dependent Variables in TSCS

For the third topic, we will explore how to model dichotomous dependent variables. These models require us to think differently about event dependence than models with a continuous dependent variable.

Topic 4: Modeling Dynamics with GMM Estimators

For the fourth topic, we will introduce the one- and two-step generalized method of moments (GMM) estimators for dynamic panels, which have become increasingly popular. We will show how these models handle the endogeneity of regressors and unit fixed effects, and we will discuss some of the potential pitfalls to avoid in estimation.

Topic 5: Student Presentations

For the last topic, students will present the research projects they have developed over the week, and everyone will provide feedback. If needed, we will also finish any remaining lecture material.


Prerequisites: a full-semester graduate-level course in multiple regression analysis, plus Essentials of TS for TSCS and Fundamentals of Time Series Cross-Section Analyses (offered in the IPSA-USP 2020 Summer School), or the equivalent background in time series and time series cross-section (TSCS) analysis.