Dummy or effects coding

sab · Post by **sab** » Wed Jan 24, 2024 7:43 pm

Hello,

For the DCE I am designing I am interested in whether 2 new types of health test will influence uptake compared to 2 tests that are currently used. The current tests are based in hospital but the new tests could be done in community settings. I will therefore have the following attributes:

type of test attribute (4 levels)
- current test 1
- current test 2
- new test 1
- new test 2

location
- hospital
- community setting 1
- community setting 2

It will be a single profile design where people will be shown a single profile and asked whether they would have the test or not (to reflect a realistic choice for patients).

I've been under the impression that it is better to use effects coding, as dummy coding can lead to misinterpretation of the estimates: the base level will be implicitly equal to zero and there is no unique interpretation of B0 (Bech & Gyrd-Hansen, 2005). I have read the Daly et al. (2016) paper on dummy coding vs effects coding, but what I still don't understand is how, if I use effects coding, I would answer my key research questions, which require me to examine uptake differences between current practice and hypothetical future practice, as below:

Type of test:
Current test 1 and new test 1
Current test 1 and new test 2
Current test 2 and new test 1
Current test 2 and new test 2

Location:
Hospital and community setting 1
Hospital and community setting 2
Community setting 1 and community setting 2

I'm new to DCEs so want to make sure I understand how I will interpret the analysis in advance.

Michiel Bliemer · Post by **Michiel Bliemer** » Thu Jan 25, 2024 9:10 am

Both dummy and effects coding can be used to test your hypotheses.

If you use effects coding and choose the first level as the base level (but you can select any other level as base level), then you would estimate parameters b1, ..., b5 in the model.

type of test attribute (4 levels)
- current test 1 (base = -b1-b2-b3)
- current test 2 (b1)
- new test 1 (b2)
- new test 2 (b3)

location
- hospital (base = -b4-b5)
- community setting 1 (b4)
- community setting 2 (b5)
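
As a rough illustration of the effects coding listed above, the sketch below builds the coded row for each level of the test attribute and shows that the base level's utility is minus the sum of the other parameters. The helper `effects_code` and the numeric values of b1, ..., b3 are hypothetical and chosen only for illustration; they are not estimates from any study.

```python
import numpy as np

def effects_code(level_index, n_levels, base=0):
    """Effects-coded row for one level: the base level is coded -1 in every column."""
    non_base = [k for k in range(n_levels) if k != base]
    if level_index == base:
        return np.full(len(non_base), -1.0)
    return np.array([1.0 if k == level_index else 0.0 for k in non_base])

# Hypothetical parameter values (assumptions for illustration only).
b_test = np.array([0.2, 0.5, -0.1])  # b1, b2, b3

test_levels = ["current test 1", "current test 2", "new test 1", "new test 2"]

# Utility contribution of each level = coded row times parameter vector.
utilities = {
    name: float(effects_code(i, len(test_levels)) @ b_test)
    for i, name in enumerate(test_levels)
}

for name, u in utilities.items():
    print(f"{name}: {u:+.2f}")

# Under effects coding the level utilities sum to zero across the attribute.
print("sum over levels:", round(sum(utilities.values()), 12))
```

The same helper with `n_levels=3` gives the coding for the location attribute (b4, b5).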

Based on parameter estimates for b1, ..., b5 and their covariance matrix, you can use the Delta method to conduct the following statistical tests:

Type of test:
Current test 1 and new test 1: H0: -b1-b2-b3 = b2 --> H0: -b1-2*b2-b3 = 0
Current test 1 and new test 2: H0: -b1-b2-b3 = b3 --> H0: -b1-b2-2*b3 = 0
Current test 2 and new test 1: H0: b1 = b2 --> H0: b1-b2 = 0
Current test 2 and new test 2: H0: b1 = b3 --> H0: b1-b3 = 0

Location:
Hospital and community setting 1: H0: -b4-b5 = b4 --> H0: -2*b4-b5 = 0
Hospital and community setting 2: H0: -b4-b5 = b5 --> H0: -b4-2*b5 = 0
Community setting 1 and community setting 2: H0: b4 = b5 --> H0: b4-b5 = 0
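
A minimal sketch of how one of these tests could be computed by hand, assuming you have the estimates and their covariance matrix from your software. For a linear combination c'b the Delta method reduces to se = sqrt(c'Vc). The function name `contrast_test` and all numbers below are hypothetical, and the p-value uses a normal approximation.

```python
import math
import numpy as np

def contrast_test(b, V, c):
    """Wald test of H0: c'b = 0, with the standard error sqrt(c'Vc)."""
    b = np.asarray(b, dtype=float)
    V = np.asarray(V, dtype=float)
    c = np.asarray(c, dtype=float)
    est = float(c @ b)
    se = math.sqrt(float(c @ V @ c))
    z = est / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value, normal approximation
    return est, se, z, p

# Hypothetical effects-coded estimates b1, b2, b3 and their covariance matrix.
b = [0.2, 0.5, -0.1]
V = [[0.010, -0.003, -0.003],
     [-0.003, 0.010, -0.003],
     [-0.003, -0.003, 0.010]]

# Current test 1 vs new test 1: H0: -b1 - 2*b2 - b3 = 0
est, se, z, p = contrast_test(b, V, [-1, -2, -1])
print(f"difference = {est:.3f}, se = {se:.3f}, z = {z:.2f}, p = {p:.4f}")

# Current test 2 vs new test 1: H0: b1 - b2 = 0
est2, se2, z2, p2 = contrast_test(b, V, [1, -1, 0])
print(f"difference = {est2:.3f}, se = {se2:.3f}, z = {z2:.2f}, p = {p2:.4f}")
```

The contrast vector c is simply the list of coefficients in front of b1, b2, b3 in each H0 above.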

If you use dummy coding you could do exactly the same tests but it would be (much) simpler:

type of test attribute (4 levels)
- current test 1 (base = 0)
- current test 2 (b1)
- new test 1 (b2)
- new test 2 (b3)

location
- hospital (base = 0)
- community setting 1 (b4)
- community setting 2 (b5)

Type of test:
Current test 1 and new test 1: H0: 0 = b2 --> H0: b2 = 0
Current test 1 and new test 2: H0: 0 = b3 --> H0: b3 = 0
Current test 2 and new test 1: H0: b1 = b2 --> H0: b1-b2 = 0
Current test 2 and new test 2: H0: b1 = b3 --> H0: b1-b3 = 0

Location:
Hospital and community setting 1: H0: b4 = 0
Hospital and community setting 2: H0: b5 = 0
Community setting 1 and community setting 2: H0: b4-b5 = 0
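
To see why the two codings lead to the same test of a level difference, the sketch below converts hypothetical dummy-coded estimates for the test attribute into their effects-coded equivalents (each level minus the mean over all four levels) and compares "current test 1 vs new test 1" under both codings. The numbers are made up, and the exact equivalence assumes the two models are reparameterisations of the same model (for example a fixed-coefficient logit).

```python
import math
import numpy as np

# Hypothetical dummy-coded estimates b1, b2, b3 (base: current test 1 = 0)
# and their covariance matrix.
d = np.array([0.8, 1.1, 0.5])
V_d = np.array([[0.020, 0.010, 0.010],
                [0.010, 0.020, 0.010],
                [0.010, 0.010, 0.020]])

# Effects-coded parameters: subtract the mean over all 4 levels (base included).
A = np.eye(3) - np.full((3, 3), 0.25)
e = A @ d
V_e = A @ V_d @ A.T

# Current test 1 vs new test 1 under each coding.
c_dummy = np.array([0.0, -1.0, 0.0])      # 0 - b2
c_effects = np.array([-1.0, -2.0, -1.0])  # (-b1 - b2 - b3) - b2

diff_dummy = float(c_dummy @ d)
se_dummy = math.sqrt(float(c_dummy @ V_d @ c_dummy))
diff_effects = float(c_effects @ e)
se_effects = math.sqrt(float(c_effects @ V_e @ c_effects))

print("effects-coded parameters:", np.round(e, 3))
print("dummy:  ", round(diff_dummy, 4), round(se_dummy, 4))
print("effects:", round(diff_effects, 4), round(se_effects, 4))
```

Both rows give the same difference and the same standard error, so the test statistic is identical.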

Michiel

sab · Post by **sab** » Fri Feb 02, 2024 6:57 pm

Thanks very much Michiel! Is it personal preference then as to whether dummy or effects coding is used?

Michiel Bliemer · Post by **Michiel Bliemer** » Sun Feb 04, 2024 10:15 am

Yes, mostly personal preference.

sab · Post by **sab** » Tue Dec 03, 2024 10:17 pm

Hi there,

I have a follow up question to the above post.

I proceeded with the effects coded design and I'm analysing the data using a mixed effects logistic regression. When I test whether there is a significant difference between two levels of an effects coded variable (e.g. I am using the lincom command in Stata to compare the coefficient for current test to the coefficient for new test), I get significant results, but the predicted probabilities for these two variables have overlapping 95% CIs - is this unusual? Moreover, when I ran the model using dummy coding to explore this further, the dummy coded coefficient representing the difference between the two levels (e.g. for new test when current test is used as the base level) is non-significant, when my expectation was that this should give the same result as the lincom test I used.

My understanding is that the lincom test accounts for the covariance of coefficients, but I am confused as to whether it is appropriate to report these tests given that they seem at odds with visual inspection of the predicted probabilities and the dummy coded model output.

Thanks very much in advance for any advice you can offer.

Michiel Bliemer · Post by **Michiel Bliemer** » Wed Dec 04, 2024 9:05 am

Comparing levels across dummy and effects coding should give the same result, since both test the same difference in utility between the two levels. I am not familiar with Stata, so I am not sure what it uses.

Yes, if you only look at whether confidence intervals overlap, you are not accounting for covariances, whereas a proper statistical test uses the Delta method to calculate the standard error of the difference between two parameters, which does account for covariances. The correct statistical test therefore uses the Delta method; overlap in confidence intervals is not an accurate statistical test for comparing parameter values.
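
A small numerical sketch of this point, using made-up numbers: two parameters whose individual 95% confidence intervals overlap, but whose difference is clearly significant once their positive covariance is taken into account. The values of the estimates, standard errors and covariance are assumptions for illustration only.

```python
import math

def ci95(est, se):
    """Normal-approximation 95% confidence interval."""
    half = 1.959964 * se
    return est - half, est + half

# Hypothetical estimates, standard errors and covariance of two parameters.
b_a, se_a = 1.0, 0.3
b_b, se_b = 0.4, 0.3
cov_ab = 0.08  # positive covariance between the two estimates

lo_a, hi_a = ci95(b_a, se_a)
lo_b, hi_b = ci95(b_b, se_b)
intervals_overlap = lo_a <= hi_b and lo_b <= hi_a

# Standard error of the difference, including the covariance term.
se_diff = math.sqrt(se_a ** 2 + se_b ** 2 - 2.0 * cov_ab)
z = (b_a - b_b) / se_diff
p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value, normal approximation

print("CIs overlap:", intervals_overlap)
print(f"difference = {b_a - b_b:.2f}, se = {se_diff:.3f}, z = {z:.2f}, p = {p:.5f}")
```

Ignoring the covariance term (setting `cov_ab = 0.0`) gives a much larger standard error for the difference, which is the situation that eyeballing separate intervals implicitly assumes.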

Michiel

sab · Post by **sab** » Wed Dec 11, 2024 8:41 pm

Thanks very much!

apk0022 · Post by **apk0022** » Fri Apr 04, 2025 8:14 am

Hi,
I have a follow-up question here.
I ran a mixed logit model in Apollo (R) with a seed set in the control. The result for the significance of heterogeneity (sd) for one attribute (with 3 levels) is not consistent across dummy and effects coding: it shows significant heterogeneity under dummy coding and non-significant heterogeneity under effects coding. The study has 2 unlabelled alternatives and an opt-out option. I tried altering the levels in the model, checked the data, removed the opt-out, and re-estimated the model, but nothing helped.
I thought the underlying heterogeneity would be reflected whichever coding style I chose. The issue arises for only 1 attribute.
I would be really grateful for any advice I can get to understand and resolve this issue.

Thank you in advance.

Regards,
Asmita

Michiel Bliemer · Post by **Michiel Bliemer** » Sat Apr 05, 2025 8:48 am

So I do not see an "issue" with your results. Dummy-coded parameters are interpreted relative to the base level, whereas effects-coded parameters are interpreted relative to the mean. While they describe identical behaviour, the interpretation of the parameters, including any heterogeneity, is not directly comparable.

For example, compare these dummy coefficients: b1 = 0.5, b2 = 0.7, b3 = 0 (base). When you test for statistical significance of b1, you are actually testing whether parameter b1 is different from the base (b3).
Now look at the equivalent effects coefficients: b1 = 0.1, b2 = 0.3, b3 = -0.4 (base). When you test for statistical significance of b1, you are actually testing whether parameter b1 is different from the mean (0).

The two statistical tests are entirely different. It is perfectly fine to conclude that b1 is statistically different from b3 while at the same time concluding that b1 is not statistically different from the mean. Similarly, the statistical significance of the standard deviations of the parameters is not comparable across dummy and effects coding, because the standard deviations have an entirely different meaning. It is best to choose the coding based on the hypotheses you would like to test. If testing against the mean, choose effects coding, and if testing against another (base) level, choose dummy coding.
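
The example coefficients above can be checked with a few lines of arithmetic, and the same linear transformation hints at why standard deviations differ in meaning across codings. The sketch assumes, purely for illustration, that the two non-base dummy-coded coefficients are independent random parameters with hypothetical standard deviations 0.6 and 0.0; it does not reproduce any Apollo output, and a mixed logit estimated with independent random effects-coded parameters is not necessarily the same model.

```python
import numpy as np

# Dummy-coded coefficients from the example: b1, b2, b3 (base = 0).
dummy = np.array([0.5, 0.7, 0.0])

# Effects-coded equivalents: each level minus the mean over all levels.
effects = dummy - dummy.mean()
print("effects-coded:", np.round(effects, 3))

# Map from the two non-base dummy parameters to the two non-base effects parameters.
T = np.eye(2) - np.full((2, 2), 1.0 / 3.0)

# Hypothetical standard deviations of the random dummy-coded parameters
# (assumed independent): heterogeneity in b1 only, none in b2.
sd_dummy = np.array([0.6, 0.0])

# Implied covariance and standard deviations of the effects-coded parameters.
cov_effects = T @ np.diag(sd_dummy ** 2) @ T.T
sd_effects = np.sqrt(np.diag(cov_effects))
print("implied sd under effects coding:", np.round(sd_effects, 3))
```

In this made-up case the second parameter has no heterogeneity relative to the base level, yet its effects-coded counterpart has a non-zero standard deviation, because it now measures variation relative to the mean of all levels.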

I refer to the article below and the Apollo forum.
https://www.sciencedirect.com/science/a ... 4516300781

Michiel

apk0022 · Post by **apk0022** » Fri Apr 11, 2025 11:29 am

Thank you for the response, Michiel.
That helped a little. I did understand the case for the estimate.
However, I am still confused about why the t-ratios (significance) of the sigma values (in mixed logit) differ in my output across the coding types.

Regards,
Asmita

choice-metrics.com
