This is a short essay about the notion of multiple robustness.
Consider an estimator built from two nuisances whose true values are $\mu_0$ and $\pi_0$. Let $\hat{\mu}$ and $\hat{\pi}$ denote the corresponding estimated nuisance functions. $\mathbb{E}_{P}[ f(\mu_0,\pi_0) ]$ represents the true quantity, while $\mathbb{E}_{P}[ f(\hat{\mu}, \hat{\pi})]$ represents the same functional with the estimated nuisances plugged in. Then,
Definition: Double Robustness. If the bias term can be represented as
$$ \begin{align} |\mathbb{E}_{P}[ f(\mu_0,\pi_0) ] - \mathbb{E}_{P}[ f(\hat{\mu},\hat{\pi}) ]| = \|\hat{\mu} - \mu_0\|\|\hat{\pi} - \pi_0\| \end{align} $$
then the estimator based on $\mathbb{E}_{P}[f(\hat{\mu},\hat{\pi})]$ is called doubly robust. Let's refer to Equation (1) as the "DR-decomposition".
Example: The augmented inverse probability weighting (AIPW) estimator for back-door adjustment admits the DR-decomposition.
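To make the product structure concrete, here is a minimal sketch that evaluates the AIPW functional exactly (at the population level, with no sampling) on a toy discrete model. Everything in it is an assumption made for illustration: the covariate `C`, the probability tables, the misspecified nuisances, and the helper name `aipw_population_value` are all hypothetical, and the target is taken to be $\mathbb{E}_P[\mu_0(C)]$ with $\mu_0(c) = \mathbb{E}_P[Y \mid X=1, C=c]$. In this toy setting the bias equals $\mathbb{E}_P\big[\tfrac{\pi_0(C)-\hat{\pi}(C)}{\hat{\pi}(C)}(\mu_0(C)-\hat{\mu}(C))\big]$, a product of the two errors, so it vanishes as soon as either nuisance is correct.

```python
# Exact (population-level) check of the AIPW product-bias structure on a toy
# discrete model. All numbers are made up for illustration only.

P_C = {0: 0.4, 1: 0.6}   # P(C = c)
PI0 = {0: 0.3, 1: 0.7}   # true propensity pi_0(c) = P(X = 1 | C = c)
MU0 = {0: 1.0, 1: 2.5}   # true regression mu_0(c) = E[Y | X = 1, C = c]


def aipw_population_value(mu, pi):
    """Return E_P[ mu(C) + 1{X=1} / pi(C) * (Y - mu(C)) ], computed exactly."""
    total = 0.0
    for c, p_c in P_C.items():
        # E[ 1{X=1} (Y - mu(C)) | C = c ] = pi_0(c) * (mu_0(c) - mu(c))
        correction = PI0[c] * (MU0[c] - mu[c]) / pi[c]
        total += p_c * (mu[c] + correction)
    return total


# True quantity E_P[ mu_0(C) ].
truth = sum(P_C[c] * MU0[c] for c in P_C)

MU_BAD = {0: 0.0, 1: 4.0}   # hypothetical misspecified outcome regression
PI_BAD = {0: 0.5, 1: 0.5}   # hypothetical misspecified propensity

bias_mu_wrong = aipw_population_value(MU_BAD, PI0) - truth
bias_pi_wrong = aipw_population_value(MU0, PI_BAD) - truth
bias_both_wrong = aipw_population_value(MU_BAD, PI_BAD) - truth

print(bias_mu_wrong, bias_pi_wrong, bias_both_wrong)
```

Only the run with both nuisances misspecified leaves a non-negligible bias; the other two are zero up to floating-point rounding.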
Consider an estimator built from two “sequences of” nuisances whose true values are $\mu_0 := \{\mu^{i}_0: i=1,\cdots,m\}$ and $\pi_0 := \{\pi^i_0: i=1,\cdots,m\}$. Let $\hat{\mu}^i$ and $\hat{\pi}^i$ for $i=1,\cdots,m$ denote the corresponding estimated nuisance functions. $\mathbb{E}_{P}[ f(\mu_0,\pi_0) ]$ represents the true quantity, while $\mathbb{E}_{P}[ f(\hat{\mu}, \hat{\pi})]$ represents the same functional with the estimated nuisances plugged in. Then,
Definition: Sequential Double Robustness. If the bias term can be represented as
$$ \begin{align} |\mathbb{E}_{P}[ f(\mu_0,\pi_0) ] - \mathbb{E}_{P}[ f(\hat{\mu},\hat{\pi}) ]| = \sum_{i=1}^{m}\|\hat{\mu}^i - \mu^i_0\|\|\hat{\pi}^i - \pi^i_0\| \end{align} $$
then the estimator based on $\mathbb{E}_{P}[f(\hat{\mu},\hat{\pi})]$ is called sequentially doubly robust. Let's refer to Equation (2) as the "SDR-decomposition".
Example: Our DR-mSBD estimator admits the SDR-decomposition.
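As a worked instance (simply Equation (2) written out, with $m=2$ chosen for illustration), the bias reads

$$ |\mathbb{E}_{P}[ f(\mu_0,\pi_0) ] - \mathbb{E}_{P}[ f(\hat{\mu},\hat{\pi}) ]| = \|\hat{\mu}^1 - \mu^1_0\|\|\hat{\pi}^1 - \pi^1_0\| + \|\hat{\mu}^2 - \mu^2_0\|\|\hat{\pi}^2 - \pi^2_0\|, $$

so it vanishes whenever, for each $i \in \{1,2\}$, at least one of $\hat{\mu}^i$ and $\hat{\pi}^i$ equals its true value.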
Consider an estimator composed of arbitrary “sequences of” true nuisances $\alpha_{1,0},\cdots,\alpha_{K,0}$, where
$$ \alpha_{k,0}:= \{\alpha^i_{k,0}: i=1,2,\cdots,m_k\}, \; k=1,\cdots,K. $$
Let $\hat{\alpha}^i_{k}$ denote the estimate of $\alpha^i_{k,0}$ for all $k,i$. Let $\alpha_0 := \{\alpha^{i}_{k,0}:\forall i,k\}$ and $\hat{\alpha} := \{\hat{\alpha}^i_k:\forall i,k\}$.
$\mathbb{E}_{P}[ f(\alpha_0) ]$ represents the true quantity, while $\mathbb{E}_{P}[ f(\hat{\alpha})]$ represents the same functional with the estimated nuisances plugged in. Then,
Definition: Multiple Robustness. If the bias term can be represented as
$$ \begin{align} |\mathbb{E}_{P}[ f(\alpha_0) ] - \mathbb{E}_{P}[ f(\hat{\alpha}) ]| = \sum_{k_1,k_2 \in [K] \atop k_1 \neq k_2} \sum_{i}\|\hat{\alpha}^i_{k_1} - \alpha^i_{k_1,0}\|\|\hat{\alpha}^i_{k_2} - \alpha^i_{k_2,0}\|\end{align} $$
then, the estimator based on $\mathbb{E}_{P}[ f(\hat{\alpha})]$ is called multiply robust. Let's refer to Equation (3) as the "MR-decomposition".
Example: Consider the front-door setting, with the nuisances $\mu_0(X,Z) := \mathbb{E}_{P}[Y \vert X,Z]$, $\pi_0(X) := P(X)$ and $\xi_0(Z \vert X) := P(Z \vert X)$. Then, the MR estimator proposed by Fulcher et al. (2017) admits the following MR-decomposition:
$$ \text{bias} = \|\hat{\mu} - \mu_0\|\| \hat{\xi} - \xi_0 \| +\|\hat{\pi} - \pi_0\|\| \hat{\xi} - \xi_0 \|. $$
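The front-door decomposition above can be read as a small function of the three error norms. The sketch below is only an illustration of that formula: the function name `front_door_mr_bias` and the numeric error values are hypothetical, not taken from Fulcher et al. (2017). It shows the two ways the bias can vanish: either $\hat{\xi}$ is correct, or both $\hat{\mu}$ and $\hat{\pi}$ are correct.

```python
def front_door_mr_bias(err_mu, err_pi, err_xi):
    """Evaluate ||mu_hat - mu_0|| * ||xi_hat - xi_0|| + ||pi_hat - pi_0|| * ||xi_hat - xi_0||.

    Each argument is a nonnegative error norm; 0.0 means the nuisance is correct.
    """
    return err_mu * err_xi + err_pi * err_xi


# Case 1: xi is correct, mu and pi may both be wrong -> zero bias.
bias_xi_correct = front_door_mr_bias(err_mu=0.2, err_pi=0.3, err_xi=0.0)

# Case 2: mu and pi are correct, xi may be wrong -> zero bias.
bias_mu_pi_correct = front_door_mr_bias(err_mu=0.0, err_pi=0.0, err_xi=0.5)

# Case 3: only mu is correct -> the pi-xi product term remains.
bias_only_mu_correct = front_door_mr_bias(err_mu=0.0, err_pi=0.3, err_xi=0.5)

print(bias_xi_correct, bias_mu_pi_correct, bias_only_mu_correct)
```

Note that correctness of $\hat{\mu}$ alone (or of $\hat{\pi}$ alone) is not enough here, since the remaining product term involving $\hat{\xi}$ does not vanish.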
Recall the SDR decomposition
$$ \begin{align*} |\mathbb{E}_{P}[ f(\mu_0,\pi_0) ] - \mathbb{E}_{P}[ f(\hat{\mu},\hat{\pi}) ]| = \sum_{i=1}^{m}\|\hat{\mu}^i - \mu^i_0\|\|\hat{\pi}^i - \pi^i_0\|. \end{align*} $$
At first glance, there are $2^{m-1}$ configurations of correctly specified nuisances under which the above bias becomes zero. However, suppose $\hat{\mu}^i$ is constructed from $\{\hat{\mu}^k: k >i\}$ (or $\hat{\pi}^i$ is constructed from $\{\hat{\pi}^k: k>i\}$). Then an error in $\hat{\mu}^k$ propagates to every $\hat{\mu}^i$ with $i<k$, and the number of configurations under which the bias becomes zero drops to $m$. In this case, we say that the estimator is “$m$-robust” (see Rotnitzky et al. (2017) and Luedtke et al. (2018)).
Recently, novel methods for estimating the nuisances $\hat{\mu}^i$ and $\hat{\pi}^i$ have been developed. These methods construct each $\hat{\mu}^i$ and $\hat{\pi}^i$ in a way that does not depend on the other estimators $\hat{\mu}^k, \hat{\pi}^k$, $k \neq i$. Rotnitzky et al. (2017) and Luedtke et al. (2018) review those methods. In this case, the estimator is said to be $2^{m-1}$-robust.
Under this terminology, our MR-gID estimator is $m$-robust, not $2^{m-1}$-robust.