Definition of Cross-validatory Z-residual
We use Z-residual to diagnose shared frailty models in a Cox proportional hazard setting with a baseline function unspecified. Suppose there are g groups of individuals, with each group containing n_i individuals, indexed as i = 1, 2, , g in the case of clustered failure survival data. Let y_{ij} be a possibly right-censored observation for the jth individual from the ith group, and \delta_{ij} be the indicator for being uncensored. The normalized randomized survival probabilities (RSPs) for y_{ij} in the shared frailty model is defined as: \begin{equation}
S_{ij}^{R}(y_{ij}, \delta_{ij}, U_{ij}) =
\left\{
\begin{array}{rl}
S_{ij}(y_{ij}), & \text{if $y_{ij}$ is uncensored, i.e., $\delta_{ij}=1$,}\\
U_{ij}\,S_{ij}(y_{ij}), & \text{if $y_{ij}$ is censored, i.e., $\delta_{ij}=0$,}
\end{array}
\right. \label{rsp}
\end{equation} where U_{ij} is a uniform random number on (0, 1), and S_{ij}(\cdot) is the postulated survival function for t_{ij} given x_{ij}. S_{ij}^{R}(y_{ij}, \delta_{ij}, U_{ij}) is a random number between 0 and S_{ij}(y_{ij}) when y_{ij} is censored. It is proved that the RSPs are uniformly distributed on (0,1) given x_{i} under the true model. Therefore, the RSPs can be transformed into residuals with any desired distribution. We prefer to transform them with the normal quantile: \begin{equation}
r_{ij}^{Z}(y_{ij}, \delta_{ij}, U_{ij})=-\Phi^{-1} (S_{ij}^R(y_{ij}, \delta_{ij}, U_{ij})),\label{zresid}
\end{equation} which is normally distributed under the true model, so that we can conduct model diagnostics with Z-residuals for censored data in the same way as conducting model diagnostics for a normal regression model. There are a few advantages of transforming RSPs into Z-residuals. First, the diagnostics methods for checking normal regression are rich in the literature. Second, transforming RSPs into normal deviates facilitates the identification of extremely small and large RSPs. The frequency of such small RSPs may be too small to be highlighted by the plots of RSPs. However, the presence of such extreme SPs, even very few, is indicative of model misspecification. Normal transformation can highlight such extreme RSPs.
In our study, we employ both leave-one-out cross-validation (LOOCV) and 10-fold cross-validation (10-fold CV) techniques to compute cross-validatory Z-residuals. In LOOCV Z-residual, one observation, t_{ij}^{test}, is excluded from the dataset with n observations. The remaining observations, acting as the training dataset, are used for parameter estimation in the shared frailty model. Fitting the model to the training dataset produces the estimated regression coefficients, \hat{\beta'}, and frailty effects, \hat{z_i}. The Breslow estimator helps estimate the cumulative baseline hazard (\hat{H_0}). The survival function \hat{S}{ij} (y{ij}) for the test observation y_{ij}^{test} is computed using: \begin{equation}
\hat{S}_{ij}(y_{ij}^{test}) = \exp \{- \hat{z_i} \exp(\hat{\beta'} x_{ij}) \hat{H}_0(y_{ij}^{test}) \}.
\end{equation} Subsequently, the RSP for the observed t_{ij} is defined as: \begin{equation}
\hat{S}_{ij}^{R}(t_{ij}^{test}, d_{ij}, U_{ij})=
\left\{
\begin{array}{rl}
\hat{S}_{ij}(t_{ij}^{test}), & \text{if $t_{ij}^{test}$ is uncensored, i.e., $d_{ij}=1$,}\\
U_{ij}\,\hat{S}_{ij}(t_{ij}^{test}), & \text{if $t_{ij}^{test}$ is censored, i.e., $d_{ij}=0$.}
\end{array}
\right.
\end{equation} Resulting in the Z-residual for t_{ij}^{test}: \begin{equation}
\hat{z}_{ij}(t_{ij}^{test}, d_{ij}, U_{ij})=-\Phi^{-1} (\hat{S}_{ij}^R(t_{ij}^{test}, d_{ij}, U_{ij})).
\end{equation} By repeating these steps for each observation (n times), a LOOCV predictive Z-residual is calculated for each observation.
For cluster-based or categorical covariate values, specific considerations are employed during the LOOCV and k-fold CV methods. Clusters with only one observation cannot be included in the training dataset, and similar requirements are imposed on categorical covariates. As such, cross-validatory Z-residuals for these observations are designated as NA in the implementation.
Examples for Illustration and Demonstration
Load the real Dataset
This example demonstrates the practical application of cross-validatory Z-residuals in identifying outliers within a study on kidney infections. The dataset comprises records from 38 kidney patients using a portable dialysis machine. It documents the times of the first and second recurrences of kidney infections for these patients. Each patient’s survival time is defined as the duration until infection from catheter insertion. The patient records are considered as clusters due to shared frailty, signifying the common effect across patients. Instances where the catheter is removed for reasons other than infection are treated as censored observations, accounting for 24% of the dataset. The dataset encompasses 38 patient clusters, with each patient having exactly two observations, resulting in a total sample size of 76. This dataset is frequently employed to exemplify shared frailty models.
Code
load(here::here("inst/extdata/kidney.rda"))
kidney$sex <- ifelse(kidney$sex == 1, "male", "female")
kidney$sex<-as.factor(kidney$sex)
kidney$id<-as.factor(kidney$id)
Fitting Models
We fit a shared gamma frailty model with three covariates: covariates: age in years, gender (male or female), and four disease types (0=GN, 1=AN, 2=PKD, 3=Other).
Code
fit_kidney <-coxph(Surv(time, status) ~ age + sex + disease+frailty(id, distribution="gamma"), data= kidney)
Z-residual and LOOCV Z-residual calculation
We computed Z-residuals using the No-CV and LOOCV methods for the kidney infection dataset. Given the similarity in performance between the 10-fold CV and LOOCV Z-residual methods demonstrated in the simulation studies and the manageable computational load, we focused on the LOOCV method.
Inspecting the Normality of Z-Residuals for Checking Overall GOF
The first and second columns of Figure 1 display scatterplots against the index and QQ plots of the Z-residuals calculated with the No-CV and LOOCV methods. The No-CV Z-residuals predominantly fall within the range of -3 to 3, displaying alignment with the 45^\circ straight line in the QQ plot. The QQ plot of the No-CV Z-residuals indicates a SW p-value of around 0.70, signifying a well-fitted model to the dataset. Thus, the diagnostic results using No-CV Z-residuals suggest the suitability of the shared frailty model for the dataset without identifying any outliers. However, analysis of the scatterplot of LOOCV Z-residuals reveals that the Z-residuals of cases labeled 20 and 42 exceed 3. These instances are considered outliers for the shared frailty model. The QQ plot of LOOCV Z-residuals displays a noticeable deviation from the 45^\circ straight line, attributed to the considerable Z-residuals of the two identified outliers. The SW p-value of LOOCV Z-residuals is notably small, measuring less than 0.01, as evident in the QQ plot. In summary, the diagnosis results with LOOCV Z-residuals suggest that the fitted shared frailty model is inadequate for this dataset, and two cases exhibit excessive Z-residuals, categorized as outliers for this model.
Code
par(mfrow = c(2, 2), mar = c(4, 4, 2, 2))
plot.zresid(Zresid.kidney,X="index", main.title = "Z-residual Scatterplot",
outlier.return = TRUE)
Code
plot.zresid(CVZresid.kidney,X="index", main.title = "LOOCV Z-residual Scatterplot",
outlier.return = TRUE)
Code
qqnorm.zresid(Zresid.kidney, main.title = "Z-residual QQ plot")
qqnorm.zresid(CVZresid.kidney, main.title = "LOOCV Z-residual QQ plot")
Diagnostic Tests with Z-residuals
The Shapiro-Wilk (SW) or Shapiro-Francia (SF) normality tests applied to Z-residuals can be used to numerically test the overall GOF of the model.
There exists randomness in the Z-residuals of censored observations, meaning that different sets of Z-residuals can be generated for the same dataset using distinct random numbers. Thus, to test the robustness of the previously conducted diagnosis, we replicated a large number of realizations of Z-residuals. The Figure 2 exhibits histograms of 1000 SW test p-values, each derived from a set of No-CV or LOOCV Z-residuals. More than 95% of the SW p-values for No-CV Z-residuals surpass 0.05, whereas 100% of the SW p-values for LOOCV Z-residuals fall below 0.05. This consistency across numerous replications confirms that the evaluation of the misspecification of the shared frailty model is not incidental to a specific set of LOOCV Z-residuals but a recurring conclusion supported by extensive Z-residual replications.
Code
par(mfrow = c(2,2),mar=c(4,4,2,2))
hist(sw.kidney,main="Replicated Z-SW P-values, No CV",breaks=20,
xlab="Z-SW P-values")
abline(v=pmin.sw.kidney,col="red")
hist(sw.kidney.cv,main="Replicated Z-SW P-values, LOOCV",breaks=20,
xlab="Z-SW P-values")
abline(v=pmin.sw.kidney.cv,col="red")
hist(sf.kidney,main="Replicated Z-SF P-values, No CV",breaks=20,
xlab="Z-SF P-values")
abline(v=pmin.sf.kidney,col="red")
hist(sf.kidney.cv,main="Replicated Z-SF P-values, LOOCV",breaks=20,
xlab="Z-SF P-values")
abline(v=pmin.sf.kidney.cv,col="red")