In fitting a Cox model, the phenomenon of monotone likelihood is observed when the likelihood converges to a finite value while at least one parameter estimate diverges to infinity (Mukhopadhyay 2020; Heinze and Schemper 2001).
Firth (1993) recommended using the penalized likelihood

$$L^*(\boldsymbol{\beta}) = L(\boldsymbol{\beta})\,\big|I(\boldsymbol{\beta})\big|^{1/2}$$

to reduce the first-order bias in estimating the canonical parameters of an exponential family model, where $L(\boldsymbol{\beta})$ and $I(\boldsymbol{\beta})$ are the unpenalized likelihood and information matrix, respectively.
Heinze (1999) and Heinze and Schemper (2001) applied the idea of Firth (1993) by maximizing the penalized partial log likelihood

$$l^*(\boldsymbol{\beta}) = l(\boldsymbol{\beta}) + \frac{1}{2}\log\big|I(\boldsymbol{\beta})\big|$$

to obtain estimates of the regression parameters when a monotone likelihood is observed.
The score function $U(\boldsymbol{\beta})$ is replaced by the penalized score function

$$U^*(\boldsymbol{\beta}) = U(\boldsymbol{\beta}) + a(\boldsymbol{\beta})$$

where the $r$th element of $a(\boldsymbol{\beta})$ is

$$a_r(\boldsymbol{\beta}) = \frac{1}{2}\,\mathrm{tr}\left\{I(\boldsymbol{\beta})^{-1}\,\frac{\partial I(\boldsymbol{\beta})}{\partial \beta_r}\right\}$$

The Firth estimate is obtained iteratively as

$$\boldsymbol{\beta}^{(s+1)} = \boldsymbol{\beta}^{(s)} + I^{-1}\big(\boldsymbol{\beta}^{(s)}\big)\,U^*\big(\boldsymbol{\beta}^{(s)}\big)$$
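To make the iterative update concrete, the following is a minimal sketch of Firth's iteration for a single binomial proportion rather than the Cox partial likelihood; the model, function name, and starting value are illustrative. With canonical parameter $\beta = \mathrm{logit}(p)$, the information is $I(\beta) = m\,p(1-p)$ and the penalized score is $U^*(\beta) = (y - mp) + \tfrac{1}{2}(1 - 2p)$, so the Firth estimate has the known closed form $\hat{p} = (y + 0.5)/(m + 1)$, which the iteration can be checked against.

```python
import math

def firth_binomial(y, m, beta=0.0, tol=1e-10, max_iter=100):
    """Iterate beta <- beta + I(beta)^{-1} U*(beta) until the step is tiny.

    Illustrative one-parameter example (binomial proportion), not the
    survey-weighted Cox case described in the text.
    """
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-beta))
        info = m * p * (1.0 - p)                          # Fisher information I(beta)
        score_star = (y - m * p) + 0.5 * (1.0 - 2.0 * p)  # penalized score U*(beta)
        step = score_star / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# y = 0 produces a monotone likelihood (the unpenalized MLE is beta = -infinity),
# but the Firth estimate is finite:
beta_hat = firth_binomial(y=0, m=10)
p_hat = 1.0 / (1.0 + math.exp(-beta_hat))
# p_hat matches the closed form (0 + 0.5) / (10 + 1)
```

The $y = 0$ case mirrors the monotone-likelihood situation described above: the unpenalized score never vanishes at a finite parameter value, while the penalty term pulls the estimate back to a finite point.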
Although the estimated regression parameters, $\hat{\boldsymbol{\beta}}$, are obtained by maximizing the penalized partial likelihood, the Taylor series linearized variance estimator uses the score residuals and the information matrix from the unpenalized likelihood, evaluated at $\hat{\boldsymbol{\beta}}$. For more information, see the section Taylor Series Linearization.
The replication variance estimation methods use the replicated version of the penalized score function to obtain replicate estimates, $\hat{\boldsymbol{\beta}}^{(r)}$, of the regression parameters. The replicate estimates are then used in the replication variance estimation, as described in the sections Balanced Repeated Replication (BRR) Method, Bootstrap Method, Jackknife Method, and Replicate Weights Method.
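Once the replicate estimates are in hand, the replication variance estimators share a common structure. The sketch below is a generic illustration, not the procedure's implementation: it assumes a multiplier $c$ that depends on the method (for example, $c = 1/R$ for BRR and $c = (R-1)/R$ for a delete-one jackknife), and the function name and example numbers are hypothetical.

```python
import numpy as np

def replication_variance(replicates, full_estimate, multiplier):
    """Generic replication variance: c * sum_r (b_r - b)(b_r - b)'.

    `replicates` is an (R x p) array of replicate estimates, `full_estimate`
    the length-p full-sample estimate, and `multiplier` the method-specific
    constant c (assumption: the method's standard multiplier is supplied).
    """
    replicates = np.asarray(replicates, dtype=float)
    dev = replicates - np.asarray(full_estimate, dtype=float)
    return multiplier * dev.T @ dev

# Example with 4 replicate estimates of a 2-parameter vector and a
# delete-one-jackknife-style multiplier (R - 1)/R = 3/4:
reps = [[0.9, 1.1], [1.1, 0.9], [1.0, 1.0], [1.0, 1.2]]
v = replication_variance(reps, [1.0, 1.05], multiplier=3 / 4)
```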
Mukhopadhyay (2020) recommended using normalized weights to construct the penalized log partial likelihood for weighted data. Using the notation in the sections Notation and Estimation and Partial Likelihood Function for the Cox Model, the Breslow unpenalized log partial likelihood is given by

$$l(\boldsymbol{\beta}) = \sum_{h}\sum_{i}\sum_{j} \tilde{w}_{hij}\,\Delta_{hij}\left[\mathbf{x}_{hij}'\boldsymbol{\beta} - \log\left(\sum_{(h',i',j')\in\mathcal{R}(t_{hij})} \tilde{w}_{h'i'j'}\exp\big(\mathbf{x}_{h'i'j'}'\boldsymbol{\beta}\big)\right)\right]$$

where $\tilde{w}_{hij} = n\,w_{hij} \big/ \sum_{h}\sum_{i}\sum_{j} w_{hij}$ is the normalized weight, $n$ is the number of observation units, and $w_{hij}$ is the weight for unit $j$ in PSU $i$ and stratum $h$.
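The normalization above simply rescales the sampling weights so that they sum to the number of observation units $n$. A minimal sketch (the function name and example weights are illustrative):

```python
def normalize_weights(weights):
    """Rescale weights so they sum to n, the number of observation units."""
    n = len(weights)
    total = sum(weights)
    return [n * w / total for w in weights]

w_tilde = normalize_weights([2.0, 3.0, 5.0])
# sum(w_tilde) == 3, the number of observation units
```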
Denote

$$S^{(0)}(\boldsymbol{\beta},t) = \sum_{(h,i,j)\in\mathcal{R}(t)} \tilde{w}_{hij}\exp\big(\mathbf{x}_{hij}'\boldsymbol{\beta}\big), \quad S^{(1)}(\boldsymbol{\beta},t) = \sum_{(h,i,j)\in\mathcal{R}(t)} \tilde{w}_{hij}\,\mathbf{x}_{hij}\exp\big(\mathbf{x}_{hij}'\boldsymbol{\beta}\big), \quad S^{(2)}(\boldsymbol{\beta},t) = \sum_{(h,i,j)\in\mathcal{R}(t)} \tilde{w}_{hij}\,\mathbf{x}_{hij}\mathbf{x}_{hij}'\exp\big(\mathbf{x}_{hij}'\boldsymbol{\beta}\big)$$

Then the score function is given by

$$U(\boldsymbol{\beta}) = \sum_{h}\sum_{i}\sum_{j} \tilde{w}_{hij}\,\Delta_{hij}\left\{\mathbf{x}_{hij} - \frac{S^{(1)}(\boldsymbol{\beta},t_{hij})}{S^{(0)}(\boldsymbol{\beta},t_{hij})}\right\}$$

and the Fisher information matrix is given by

$$I(\boldsymbol{\beta}) = \sum_{h}\sum_{i}\sum_{j} \tilde{w}_{hij}\,\Delta_{hij}\left\{\frac{S^{(2)}(\boldsymbol{\beta},t_{hij})}{S^{(0)}(\boldsymbol{\beta},t_{hij})} - \frac{S^{(1)}(\boldsymbol{\beta},t_{hij})\,S^{(1)}(\boldsymbol{\beta},t_{hij})'}{S^{(0)}(\boldsymbol{\beta},t_{hij})^2}\right\}$$
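The score and information sums can be sketched numerically as follows. This is an illustrative implementation under simplifying assumptions, not the procedure's own code: the stratum/PSU/unit indices are flattened into one index, ties are ignored, and the function name and example data are hypothetical.

```python
import numpy as np

def breslow_score_info(t, d, X, w, beta):
    """Weighted Breslow score U(beta) and information I(beta).

    t: event/censoring times; d: event indicators (Delta); X: (n x p)
    covariates; w: normalized weights; beta: length-p parameter vector.
    """
    t, d, w = map(np.asarray, (t, d, w))
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    U = np.zeros(p)
    I = np.zeros((p, p))
    for j in np.flatnonzero(d):            # sum over event times only
        risk = t >= t[j]                   # risk set R(t_j)
        e = w[risk] * np.exp(X[risk] @ beta)
        s0 = e.sum()                       # S^(0)(beta, t_j)
        s1 = e @ X[risk]                   # S^(1)(beta, t_j)
        s2 = X[risk].T @ (e[:, None] * X[risk])   # S^(2)(beta, t_j)
        xbar = s1 / s0
        U += w[j] * (X[j] - xbar)
        I += w[j] * (s2 / s0 - np.outer(xbar, xbar))
    return U, I

# Tiny example: one event among two units, one covariate, unit weights.
U, I = breslow_score_info(t=[1.0, 2.0], d=[1, 0], X=[[1.0], [2.0]],
                          w=[1.0, 1.0], beta=np.zeros(1))
```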
Denote