I recently heard Theis Lange talk about How to do Mediation Analysis with Survival Data. Let's say you want to examine the effect of socio-economic position (SEP) on long-term sickness absence. Part of the effect may go through the physical work environment, and part may be directly related to socio-economic position.
To make the idea more specific, imagine a society where farmers have a high socio-economic status (reducing the probability of becoming sick) but work in a dangerous physical environment (increasing the probability of becoming sick). The question is how much of the overall association between status and sickness absence goes through the physical environment and how much can be attributed directly to status.
Unfortunately, there is no general framework for answering these questions. There are some approaches for specific models, but nothing that works for all cases. Lange's aim is to provide such a general framework. His approach is based on nested counterfactuals: we imagine holding SEP constant while varying the mediating variable, and we also imagine holding the mediating variable constant while changing the status variable. This sounds easy but, as is often the case, it takes some effort to apply the idea. Lange makes a very useful contribution by showing exactly how to apply it in R.
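To make the nested counterfactual idea concrete, here is one common way of writing it down. The notation is my own assumption and is not taken from the talk: a and a* are two levels of SEP, M(a) is the work environment a person would have under SEP level a, and Y(a, m) is the outcome under SEP level a and work environment m. The effects are written on a simple mean-difference scale; with survival data they would typically be expressed on a hazard scale instead.

```latex
\begin{align*}
\text{total effect} &= E[Y(a, M(a))] - E[Y(a^*, M(a^*))] \\
\text{natural direct effect} &= E[Y(a, M(a^*))] - E[Y(a^*, M(a^*))] \\
\text{natural indirect effect} &= E[Y(a, M(a))] - E[Y(a, M(a^*))]
\end{align*}
```

The direct and indirect effects add up to the total effect, because the nested term E[Y(a, M(a*))] cancels. That nested term is the one that asks us to imagine a person with one SEP level but the work environment they would have had under another.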
I have two small comments. First of all, the language of direct and indirect effects is slightly misleading. All effects are indirect in the sense that, at a finer level of detail, one could specify a more detailed causal mechanism. What we are really talking about is "effects going through the mediating mechanism" vs "all other effects that go through other mediating mechanisms." But this is obvious.
More problematic, there may be a logical problem with nested models, at least if one is not careful when interpreting the effects. Go back to the example of the farmer. Let's say we have lots of professions and we want to examine the relationship between having a job in a given profession (which is a proxy for socio-economic status) and sickness absence. Imagining a farmer in a farmer's environment is easy. Imagining a farmer in a non-farmer environment sounds difficult, but not impossible for professions that are close. Imagining an economist working in a farmer's environment, however, is implausible. Nested counterfactuals often involve implausible counterfactuals of this type. An approach that asks us to work out the effect of a mediating variable by taking the average across some possible and some impossible counterfactuals may have a problem.
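One rough way to see how many of these combinations have any empirical support is to cross-tabulate profession against work environment and list the cells that never occur. The sketch below is only an illustration of that check, not part of Lange's method; the data, the labels and the function name `unsupported_combinations` are all invented for the example.

```python
from collections import Counter

# Hypothetical (profession, work environment) observations, invented for illustration.
observations = [
    ("farmer", "farm"), ("farmer", "farm"), ("farmer", "workshop"),
    ("carpenter", "workshop"), ("carpenter", "farm"),
    ("economist", "office"), ("economist", "office"),
]


def unsupported_combinations(rows):
    """Return the (profession, environment) pairs that never occur in rows."""
    seen = Counter(rows)
    professions = sorted({profession for profession, _ in rows})
    environments = sorted({environment for _, environment in rows})
    return [
        (profession, environment)
        for profession in professions
        for environment in environments
        if seen[(profession, environment)] == 0
    ]


missing = unsupported_combinations(observations)
for profession, environment in missing:
    print(f"never observed: {profession} in {environment} environment")
```

In this toy data the economist in a farm environment is among the empty cells, which is exactly the kind of counterfactual the averaging would still have to rely on. An empty cell does not prove a combination is impossible, but it does mean the data say nothing about it.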
To what extent this is a problem depends on how careful one is when interpreting the model. One may, for instance, interpret socio-economic status as something much more general. Imagine quadruplets from the same background: two with occupations of similar status (economist and dentist, perhaps) but with different work environments, and another two with professions of different status but with similar work environments. This may be logically possible, but it seems difficult to identify the associations in general.
In sum: a good idea and a great R implementation, but I am uncertain about the intuition behind nested models in many contexts.