Investigating the missing data mechanism in quality of life outcomes: a comparison of approaches
Abstract
Background: Missing data is classified as missing completely at random (MCAR), missing at
random (MAR) or missing not at random (MNAR). Knowing the mechanism is useful in identifying
the most appropriate analysis. The first aim was to compare different methods for identifying the
missing data mechanism and to determine whether they gave consistent conclusions. The second aim
was to investigate whether reminder-response data can be utilised to help identify the mechanism.
Methods: Datasets from five clinical trials that employed a reminder system at follow-up were used.
Some quality of life questionnaires were initially missing, but later recovered through reminders.
Four methods of determining the missing data mechanism were applied. Two response data
scenarios were considered: firstly, immediate responses only; secondly, all observed responses
(including those obtained through reminders).
Results: In three of the five trials, the hypothesis tests found evidence against the MCAR
assumption. Logistic regression suggested MAR; when the reminder-collected data were used, it
also highlighted potential MNAR data in two trials.
Conclusion: The four methods were consistent in determining the missingness mechanism. One
hypothesis test was preferred as it is applicable with intermittent missingness. Some inconsistencies
between the two data scenarios were found. Ignoring the reminder data could potentially give a
distorted view of the missingness mechanism. Utilising reminder data allowed the possibility of
MNAR to be considered.
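
The two response data scenarios described in the Methods can be sketched as follows. This is a minimal illustration only, not the analysis used in the trials: the record layout, the function name missing_indicator and all scores are hypothetical. It shows how the same questionnaire can count as missing under the immediate-only scenario yet be observed once reminder responses are included, which is what makes a comparison of immediate and reminder-recovered scores possible.

```python
from statistics import mean

# Hypothetical records: how each questionnaire was returned and its QoL score
# (None = never returned). All values are made up for illustration.
records = [
    {"response": "immediate", "score": 72.0},
    {"response": "immediate", "score": 65.0},
    {"response": "reminder", "score": 48.0},
    {"response": "reminder", "score": 55.0},
    {"response": "none", "score": None},
    {"response": "immediate", "score": 80.0},
]


def missing_indicator(record, scenario):
    """Return 1 if the questionnaire counts as missing under the scenario.

    scenario "immediate": only immediate responses count as observed.
    scenario "all": immediate and reminder responses count as observed.
    """
    if scenario == "immediate":
        return 0 if record["response"] == "immediate" else 1
    if scenario == "all":
        return 0 if record["response"] in ("immediate", "reminder") else 1
    raise ValueError(f"unknown scenario: {scenario}")


missing_immediate = [missing_indicator(r, "immediate") for r in records]
missing_all = [missing_indicator(r, "all") for r in records]

# Reminder-recovered scores are "missing" under the immediate-only scenario
# but are actually observed, so they can be compared with immediate responses.
immediate_scores = [r["score"] for r in records if r["response"] == "immediate"]
reminder_scores = [r["score"] for r in records if r["response"] == "reminder"]
score_gap = mean(immediate_scores) - mean(reminder_scores)
```

A large gap between immediate and reminder-recovered scores would be one informal hint that initial non-response is related to the outcome itself; a formal assessment would use the hypothesis tests or logistic regression referred to above.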