\documentclass[12pt]{article}
\usepackage{pa}
\usepackage{natbib}
\bibliographystyle{chicago}

\usepackage{authblk}

\usepackage{setspace}
\usepackage{fullpage}
\usepackage{lscape}
\usepackage{rotating}
\usepackage{times}
\usepackage{mathrsfs}
\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{amsfonts}
\usepackage{amssymb}

\usepackage{graphicx}
\usepackage{multirow}
\usepackage{color}

%\renewcommand{\baselinestretch}{2.0}

%\usepackage[margin=1in]{geometry}

\def\Var{{\rm Var}\,}
\def\E{{\rm E}\,}
\def\arg{{\rm arg}\,}
\def\Cov{{\rm Cov}\,}
\def\N{{\rm N}\,}


\newcommand{\p}{\text{p}}
\newcommand{\zt}{\tilde{Z}}
\newcommand{\yt}{\tilde{Y}}
\newcommand{\zmlm}{\dddot{Z}}
\newcommand{\ymlm}{\dddot{Y}}
\newcommand{\ugeniid}{\tilde{U}^{\text{gen}}}
\newcommand{\ugenmlm}{\dddot{U}^{\text{gen}}}
\newcommand{\comment}[1]{}

%creates command for \note{initials}{content}, or just \note{content}
\makeatletter
\def\note#1{\def\tempa{#1}\futurelet\next\note@i}% Save first argument
\def\note@i{\ifx\next\bgroup\expandafter\note@ii\else\expandafter\note@end\fi}%Check brace
\def\note@ii#1{\textcolor{blue}{\tempa\ NOTE: #1}}%Two args
\def\note@end{\textcolor{blue}{NOTE: \tempa}}%Single args
\makeatother

\bibpunct{(}{)}{;}{a}{}{,}

\pagenumbering{gobble}

\title{Supplementary Materials for: Bias Amplification and Bias Unmasking}

%\author{}
\author[1]{Joel A. Middleton}
\author[2]{Marc A. Scott} 
\author[3]{Ronli Diakow} 
\author[2]{Jennifer L. Hill}
\affil[1]{Department of Political Science, University of California, Berkeley}
\affil[2]{Humanities and Social Sciences in the Professions, New York University, Steinhardt }
\affil[3]{New York City Department of Education}
\date{May, 2016\thanks{This research was partially supported by Institute of Education Sciences grants R305D110037 and R305B120017. For replication files see \cite{Middleton}.}}

\renewcommand\Authands{ and }
\renewcommand\footnotemark{}
\begin{document}

%\tracingall

\maketitle

\doublespacing

\numberwithin{equation}{section}


\appendix

\vspace{-.2in}
\section{Omitted Variable Bias}\label{OVB}
\vspace{-.2in}

To define the bias, start with a generic linear model,
\vspace{-.1in}
\begin{align}
Y&= S\beta^s + O\beta^o+\epsilon^y,
\end{align}
\vspace{-.1in}
where $S$ and $O$ are matrices of specified and omitted covariates, respectively. With respect to the error term, $\epsilon^y$, assume $\text{E}[\epsilon^y|S,O]=0$.

Consider the regression of $Y$ on the covariates $S$ only. This leads to the well-known expression for omitted variable bias \citep[see, for example,][p.\ 334]{Greene},
\vspace{-.1in}
\begin{align} \label{genericbias}
\text{Bias}\left[\widehat{\beta}^s \right]=
\left( S'S \right)^{-1} S'O\beta^o.
\end{align}
\vspace{-.1in}
From this generic equation we can derive biases for particular sets of conditioning variables, $S$, under an assumed model. 

To derive expression (\ref{genericbias}), collect the variables into two groups, the omitted variables, $O$, and the included (specified) variables, $S$, so that in matrix notation $Y=S\beta^s+O\beta^o + \epsilon^y$. Now substitute for $Y$ in the least squares estimator $\left( S'S\right)^{-1} S'Y$ and take the expected value:
\vspace{-.1in}
\begin{align}
\nonumber \text{E}\left[ \widehat{\beta}^s \right]&=\text{E}\left[ \left( S'S \right)^{-1} S'Y\right]
\\ \nonumber&=\text{E}\left[\left( S'S \right)^{-1} S'\left( S\beta^s+O\beta^o +\epsilon^y\right)\right]
\\ \nonumber &= \beta^s+ \text{E}\left[\left( S'S \right)^{-1} S'\left(O\beta^o +\epsilon^y\right)\right]
\\ &= \beta^s+\left( S'S \right)^{-1} S'O\beta^o
\end{align}
\vspace{-.1in}
The last line follows from the assumption $\text{E}[\epsilon^y|S,O]=0$, with $S$ and $O$ treated as fixed.
Therefore, the bias is the last term,
\vspace{-.1in}
\begin{align}
\text{Bias}\left[\widehat{\beta}^s \right]=
 \left( S'S \right)^{-1} S'O\beta^o.
\end{align}
\vspace{-.1in}

For the bias when conditioning on $S=[Z,X]$ with $O=[U]$ (so that $\beta^o=\zeta^y$, the coefficient on $U$), use the inverse of the partitioned matrix \citep[cf.][section 2.6.3]{Greene} to arrive at
\vspace{-.1in}
\begin{align} \label{partitioned}
&\text{Bias}\left[\begin{array}{ccc}\widehat{\tau} \\ \widehat{\beta}^{y}\end{array}\right]\nonumber
\\=&\left[
\begin{array}{ccc}
\left(Z'Z-Z'X\left[X'X\right]^{-1}X'Z\right)^{-1} & -\left[Z'Z\right]^{-1}Z'X\left( X'X - X'Z\left[Z'Z\right]^{-1} Z'X \right)^{-1}
\\ -\left[X'X\right]^{-1}X'Z \left(Z'Z-Z'X\left[X'X\right]^{-1}X'Z\right)^{-1}
& \left(X'X - X'Z \left[ Z'Z\right]^{-1}Z'X\right)^{-1}
\end{array}
\right] \nonumber
\\ &\times \left[ \begin{array}{ccc}Z'U \\ X'U \end{array}\right]\zeta^y \nonumber
\\=&\left[
\begin{array}{ccc}
\left(Z'Z-Z'X\left[X'X\right]^{-1}X'Z\right)^{-1}Z'U \zeta^y
\\ -\left[X'X\right]^{-1}X'Z \left(Z'Z-Z'X\left[X'X\right]^{-1}X'Z\right)^{-1}Z'U \zeta^y
\end{array}
\right]
\end{align}
\vspace{-.1in}
The last line follows from the fact that, by construction, $U \perp X$ and $\overline{U}=0$, so that $X'U=0$ and the second column of the inverse drops out.
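
As a bridge to the scalar quantities used in Section~\ref{IE}, the first row of (\ref{partitioned}) can be simplified in the special case, assumed here only for illustration, where $Z$, $X$ and $U$ are each a single mean-centered column. Writing $r^2_{Z|X}=(Z'X)^2/(Z'Z\,X'X)$ for the squared sample correlation between $Z$ and $X$, a sketch of the algebra is
\vspace{-.1in}
\begin{align}
\text{Bias}\left[\widehat{\tau}\right]
=&\left(Z'Z-Z'X\left[X'X\right]^{-1}X'Z\right)^{-1}Z'U \zeta^y \nonumber
\\=&\left(Z'Z\left[1-r^2_{Z|X}\right]\right)^{-1}Z'U \zeta^y \nonumber
\\=&\frac{1}{1-r^2_{Z|X}}\left(Z'Z\right)^{-1}Z'U \zeta^y \nonumber
\\=&\left(Z'Z\right)^{-1}Z'U \zeta^y+\left(\frac{r^2_{Z|X}}{1-r^2_{Z|X}}\right)\left(Z'Z\right)^{-1}Z'U \zeta^y. \nonumber
\end{align}
\vspace{-.1in}
Under these assumptions the bias is the bias from omitting $U$ alone, $\left(Z'Z\right)^{-1}Z'U\zeta^y$, plus that same quantity scaled by $r^2_{Z|X}/(1-r^2_{Z|X})$. These two pieces correspond to the terms labeled $\upsilon$ and $\alpha$ in Section~\ref{IE}.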


\vspace{-.2in}
\section{Illustrative Example}\label{IE}
\vspace{-.2in}

In this section we provide a simple numerical example to illustrate these biases. Suppose a researcher has city-level data and is interested in the effect of (standardized) per capita real income, $Z$, on the (standardized) proportion voting for the legislative party in power, $Y$. Suppose the proportion voting for the party in power is also affected by whether or not the local mayor is a member of the same party (a reverse coat-tails effect) such that
\vspace{-.1in}
\begin{align}
 Y&=\frac{1}{2}Z+\frac{1}{2}U+\epsilon_y \nonumber
\\ \epsilon_y& \sim N\left(0,\frac{1}{2}\right), \nonumber
\end{align}
\vspace{-.1in}
where $U$ is an indicator coded $-1$ if the mayor is not of the incumbent party and $1$ if the mayor is of the incumbent party, and $\epsilon_y$ represents idiosyncratic factors. For simplicity, suppose half of the mayors are members of the party in power.

Suppose, in turn, that the treatment, (standardized) per capita income, is affected by whether the mayor is of the same party as the party in power in the legislature (because the legislature rewards mayors of the same party with pork spending) and also by a development project, aimed at increasing the incomes of the poor, that was randomly assigned to half of the cities. The model for the treatment variable, $Z$ (per capita income), is
\vspace{-.1in}
\begin{align}
 Z&=\frac{1}{2}X+\frac{1}{2}U+\epsilon_z \nonumber
\\ \epsilon_z& \sim N\left(0,\frac{1}{\sqrt{2}}\right), \nonumber
\end{align}
\vspace{-.1in}
where $X$ is an indicator of whether the development project took place in the city (coded $-1$ for not treated and $1$ for treated).

Note that the example has been contrived such that $\E (Y)=\E (X)=\E (Z)=\E (U)=0$ and $V (Y)=V (X)=V (Z)=V (U)=1$. Note also that $X$ is an instrument because it was randomly assigned to cities and only affects $Y$ through its effect on $Z$.
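
These moments can be checked directly. The sketch below assumes that the second argument of $N(\cdot,\cdot)$ is a standard deviation (so that $V(\epsilon_y)=1/4$ and $V(\epsilon_z)=1/2$), that $X$ and $U$ are uncorrelated, and that the error terms are independent of the other variables. The first of these is a reading of the notation rather than something stated explicitly.
\vspace{-.1in}
\begin{align}
V(Z)=&\frac{1}{4}V(X)+\frac{1}{4}V(U)+V(\epsilon_z)=\frac{1}{4}+\frac{1}{4}+\frac{1}{2}=1 \nonumber
\\ \text{Cov}(Z,X)=&\frac{1}{2}V(X)=\frac{1}{2}, \qquad \text{Cov}(Z,U)=\frac{1}{2}V(U)=\frac{1}{2} \nonumber
\\ V(Y)=&\frac{1}{4}V(Z)+\frac{1}{4}V(U)+\frac{1}{2}\text{Cov}(Z,U)+V(\epsilon_y)=\frac{1}{4}+\frac{1}{4}+\frac{1}{4}+\frac{1}{4}=1 \nonumber
\end{align}
\vspace{-.1in}
Because all four variables have unit variance, covariances and correlations coincide in this example.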

\begin{table}[]
\centering
\begin{tabular}{c c c}
\hline
Variable & Measure & Scale \\
\hline
Y & Standardized proportion voting for legislative party in power & $\mu=0, \sigma=1$ \\
Z & Standardized per capita income & $\mu=0, \sigma=1$ \\
U & Mayor member of party in power & $1$ if yes, $-1$ if no \\
X & Development project in city & $1$ if yes, $-1$ if no \\
\hline
\end{tabular}
\caption{Variables in illustrative example}
\label{tab:my_label}
\end{table}


Now suppose the researcher observes whether or not the development project occurs in each city but neglects to collect data on the party of the mayors. The researcher estimates the effect of (standardized) income per capita on (standardized) proportion voting for the political party in power in two ways. The first approach is to regress $Y$ (proportion voting for party in power) on $Z$ (per capita income). The second approach is to regress $Y$ (proportion voting for party in power) on both $Z$ (per capita income) and $X$ (development project instrument). 

We can compute the bias components for these specifications. The bias due to omitting $X$ (development project instrument) is
\vspace{-.1in}
\begin{align}
\chi \equiv &  \left(Z'Z\right)^{-1} Z'X{\beta}^y \nonumber
\\ =& \text{Cov}(Z,X)\cdot 0 \nonumber
\\=& 0 \nonumber
\end{align}
\vspace{-.1in}
as expected since $X$ is an instrument. Meanwhile, the bias due to omitting $U$ (mayor is of incumbent party) is
\vspace{-.1in}
\begin{align}
\upsilon \equiv &\left(Z'Z\right)^{-1} Z'U{\zeta}^y\nonumber
\\ =& \text{Cov}(Z,U)\frac{1}{2} \nonumber
\\ =& \text{Cov}\left(\frac{1}{2}X+\frac{1}{2}U+\epsilon_z,U\right)\frac{1}{2}  \nonumber
\\ =& \frac{1}{4}V(U) \nonumber
\\=& 0.25. \nonumber
\end{align}
The bias due to amplification is 
\begin{align}
\alpha \equiv & \left( \frac{r^2_{Z|X}}{1-r^2_{Z|X}} \right) \upsilon \nonumber
\\ =& \left(\frac{\text{Cov}(X,Z)^2}{1-\text{Cov}(X,Z)^2} \right) 0.25 \nonumber
\\ =& \frac{1/4}{1-1/4}\cdot 0.25 \nonumber
\\ \approx& 0.0833. \nonumber
\end{align}
So the bias when omitting the instrumental variable $X$ is $(\chi+\upsilon)=(0+0.25)=0.25$, while the bias when including $X$ in the conditioning set is $(\alpha+\upsilon)=(0.0833+0.25)\approx 0.333$. Thus, in this case, the unadjusted estimator is less biased than the estimator that includes the instrument in the conditioning set.

Why does this make sense intuitively? 
%Omitting $X$ is problematic because the development project affects per capita income in a way that in turn affects the proportion voting for the party in power; thus failing to control for $X$ will create comparisons between units (cities) that only differ on income because of the development project (and therefore aren't truly comparable). 
%On the other hand, when we condition on $X$ it sets up a comparisons within two groups: those that received the development project and those that did not. Since $X$ and $U$ are negatively correlated conditional on $Z$ (even though they are marginally uncorrelated), having a development project in the city makes is more likely that the mayor is a member of the party in power among precincts with higher per capita income and the reverse is true among precints with lower per capita income. Thus making comparisons within categories of $X$ merely serves to accentuate the bias caused by omitting $U$.
Controlling for $X$, there is a stronger (partial) correlation between $Z$ and $U$, exacerbating the bias due to $U$. Put another way, the ``exogenous'' (good) variability in $Z$ is removed when $X$ is included in the conditioning set, so the variation in $Z$ that remains is more heavily driven by $U$.
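
To make this concrete in the example, partial $X$ out of $Z$, which leaves $Z-\frac{1}{2}X=\frac{1}{2}U+\epsilon_z$. Assuming, as a reading of the notation, that $V(\epsilon_z)=1/2$ and that $U$ is uncorrelated with $X$, this residual has variance $1/4+1/2=3/4$ while its covariance with $U$ is still $1/2$. A rough check of the resulting correlations and of the bias from regressing $Y$ on both $Z$ and $X$ is
\vspace{-.1in}
\begin{align}
\text{Corr}(Z,U)=&\frac{1/2}{\sqrt{1\cdot 1}}=0.5 \nonumber
\\ \text{Corr}\left(Z-\tfrac{1}{2}X,\,U\right)=&\frac{1/2}{\sqrt{3/4}}\approx 0.577 \nonumber
\\ \frac{\text{Cov}\left(Z-\tfrac{1}{2}X,\,U\right)}{V\left(Z-\tfrac{1}{2}X\right)}\cdot\frac{1}{2}=&\frac{1/2}{3/4}\cdot\frac{1}{2}=\frac{1}{3}. \nonumber
\end{align}
\vspace{-.1in}
Under these assumptions the covariance between $Z$ and $U$ is unchanged by conditioning on $X$, but the variance of $Z$ shrinks from $1$ to $3/4$, which raises the bias from $0.25$ to roughly $0.333$.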

The phenomenon of bias amplification is similar in the case of fixed effects, in that conditioning on this additional variable sets up within-group comparisons that induce a negative relationship between $U$ and $X$ that exacerbates the bias.

%Since we are not controlling for $U$, this creates a different type of bias because of the fact that the mayor
%being of an incumbent party ($U$) is positively related
%to income. Since $X$ is not directly related to $Y$, making
%comparisons within categories of $X$ merely serves to accentuate the bias caused by omitting $U$.

%$Z$ (per capita income). The second approach is to regress $Y$ (proportion voting for party in power) on both $Z$ (per capita income) and $X$ (development project instrument). 

\vspace{-.2in}
\section{Sensitivity Analysis of Water Outcome}\label{C}
\vspace{-.2in}

In the sensitivity plot, Figure 2, we examine the Water Infrastructure outcome. The interpretation of the plot is the same as in the case of the GOTV study. The preponderance of covariates whose benchmarking values fall in the bias-inducing region suggests that we should be concerned about including fixed effects in our analysis.
%Indeed this is what we see. The point at (-0.08, 16.30) falls beyond the contour and in the region where fixed effects are bias increasing. 
Conducting a sensitivity analysis on these data would have alerted the researcher to signs of potential trouble.

\begin{figure}[htbp]
\centering
\includegraphics[width=4in]{Figure2_v04.pdf}
\caption{Sensitivity Plot of Water Outcome}
\label{DunningPlot}
\end{figure}

%RPD: Should you also show an example sensitivity analysis where FEs aren't a concern? 


\vspace{-.2in}
\begin{thebibliography}{99}
\vspace{-.2in}
\bibitem[Greene(2000)]{Greene}Greene, W.H. (2000). {\it Econometric Analysis} (4th ed.). Prentice Hall.
\bibitem[Middleton(2016)]{Middleton}Middleton, J.A. (2016). ``Replication Data for: Bias Amplification and Bias Unmasking.'' Harvard Dataverse. http://dx.doi.org/10.791/DVN/UO5WQ4
\end{thebibliography}

\end{document}