Software for fitting
some models of principal stratification
Frangakis, CE, and Rubin, DB (2002). Principal Stratification in Causal Inference. (Biometrics, [58, 21-29, Full text - PDF]). The authors have no responsibility for any consequences of use of this software. |
1. WHAT THIS SOFTWARE DOES, and WHY THE NEED FOR THE NEW METHODS.
The software is written for the above paper. The paper studies situations
where we want causal effects adjusted for post-treatment variables, which
includes trials with noncompliance, trials with surrogate endpoints,
censoring by death, and many others. The paper shows that usual methods
of surrogacy and of adjustment do not have causal interpretation as surrogates
because they condition on the post-treatment variable which is affected
by treatment. The paper proposes new treatment comparions that adjust for
post-treatmentvariables and are always causal effects (``principal stratification'').
This version of the software is for a class of estimation methods
of such comparisons and with:
2. TO INSTALL THE SOFTWARE.
The
current version is for unix with S-plus5. Save the Code file (preffects.tar.gz)
in a directory. From that directory, type:
gunzip preffects.tar.gz
tar xf preffects.tar
This creates a subdirectory called preffects, which contains the files: batchin.s prel.s routines1.s routines2.s Go to the subdirectory ``preffects''. Open Splus5 from there. To proceed, we must have the data described below.
3. DATA NEEDED AND QUANTITIES ESTIMATED.
From here on, we assume the following data exist
in Splus5: (note that unless otherwise indicated, storage mode must be
double precision).
``nobs <- as.integer(nobs)''
logit(pr(Si=never-taker | xc1[i,] ))=xc1[i,] *betac1 (first column of xc1 is typically 1)
logit(pr(Si=complier | Si is not never taker, xc2[i,] ))=xc2[i,] *betac2; (first column of xc2 is typically 1)
pr( Yi(z) | Si=s, xy[i,]=xyf)= normal with mean
linky.v1(xyf,s,z) * betay and variance exp(betassqy)
currently, the definition of linky.v1 (in routines2.s) is
linky.v1(xyf,s,z)=[xyf,c2,c3,zt*c2] ;
so that the last parameter is the treatment effect for the principal stratum of compliers= {i: Di(0)=Di(1)=1}.
(note, with large sample sizes, the user can put xc1=xc2=xy; the user may still try that with smaller samples, but if some parameters are poorly estimated, the user may reduce the dimension of some of the design matriced; it is however, recommended that xc1,xc2, and xy have one more column in addition to the ``1'' column. If they only have the ``1'' column, the user should make sure these objects' mode is kept ``as.matrix'' (not just as.vector) throughout the operations.)
EXAMPLE.
Suppose we want to fit two covariates, age and gender, in an additive
way in the two model for principal strata, and the model for outcomes
given principal strata. Then, we should use: xc1_cbind(1,age,gender); xc2_xc1;
xy_xc1; and the dimensions nxc1, nxc2, nxy, should each equal to as.integer(3).
4. MORE ON THE QUANTITIES THAT ARE ESTIMATED.
The parameter of the model is beta=[betac1, betac2, betay, betassqy]
(estimated with maximum likelihood). Generally: (i) we should model covariates,
to increase precision, but we should note that (ii) then, the original
scale of the parameters becomes less interpretable, e.g. when we include
interactions. Because of both (i) and (ii), some averaged estimands
that **result** from the model, are more of interest. The function estimands.exercise
calculates theta=an 8-dim
vector of estimands, where theta=theta(beta,Data) and which comprises
the following: (see also batchin.s).
theta[1] = E(Y(z=0)| never-takers)
theta[2] = E(Y(z=0)| always-takers)
theta[3] = E(Y(z=0)| compliers)
theta[4] = E(Y(z=1)| compliers)
theta[5] = theta[4]-theta[3], the average effect on compliers
theta[6] = pr(Si=never-taker)
theta[7] = pr(Si=compler)
theta[8] = pr(Si=always-taker)
Note, the function estimands.exercise automatically
adapts to whatever xc1,xc2,xy, and linky.v1 (see prel.s) are.
so the user may feel free to change the design matrices in prel.s
without changing the function estimands.exercise.
5. HOW TO RUN THE SOFTWARE.
Read the comments in batchin.s.
Then,
Splus5 BATCH prel.s batchout
and, when the job is done, check the file batchout (it should contain no error messages).
Splus5 BATCH batchin.s batchout
and when the job is done, go in Splus5 and look at the object
citheta
This contains point estimates, and lower and upper 95% CI limits for the 8 estimands of Section 4 above.
Alternatively to (2), that the user can enter Splus5 and type manually each command of the file batchin.s, for best control of the sequence of procedures.
We have not encountered any bugs, but, if any such problem arises, we would appreciate if the user lets us know.