choice-metrics.com

by **Ethannn** » Tue Jun 10, 2025 10:45 pm

Dear Michiel,
I am a beginner in the choice experiment method and am currently working on a Bayesian D-efficient design. My attributes and levels are: beach (good, average, poor), water quality (good, average, poor), biological resources (good, average, poor), and cost (50, 100, 150, 200). The Ngene code for the preliminary design used in the pilot survey is:
Design
;alts=alt1*, alt2*,alt3
;rows=24
;block=3
;eff=(mnl,d)
;model:
U(alt1)=A.dummy[0.0001|0.0002]*A[1,2,0]+ B.dummy[0.0001|0.0002]*B[1,2,0]+ C.dummy[0.0001|0.0002]*C[1,2,0]+ D.dummy[-0.0001|-0.0002|-0.0003]*D[1,2,3,0]/
U(alt2)=A.dummy*A[1,2,0]+ B.dummy*B[1,2,0]+ C.dummy*C[1,2,0]+ D.dummy*D[1,2,3,0]
$
I have a few questions. First, is it necessary to include a constant term? Second, since this is a pilot survey, I used dummy coding for all variables, including cost. Is that a problem? Third, I plan to distribute 50 copies of the pilot survey, estimate MNL or RPL models on the collected data, and determine whether interaction terms need to be added. Fourth, once the pilot results are available, should dummy coding still be used for the cost variable, and which model results are needed as the prior distributions for the Bayesian D-efficient design?
Many thanks!
Best wishes,
Ethannn

by **Michiel Bliemer** » Wed Jun 11, 2025 5:09 am

1. Yes you would include a constant in either alt1/alt2 or in alt3 (with default zero prior). So either:

U(alt1)=asc + A.dummy[0.0001|0.0002]*A[1,2,0]+ B.dummy[0.0001|0.0002]*B[1,2,0]+ C.dummy[0.0001|0.0002]*C[1,2,0]+ D.dummy[-0.0001|-0.0002|-0.0003]*D[1,2,3,0]/
U(alt2)=asc + A.dummy*A[1,2,0]+ B.dummy*B[1,2,0]+ C.dummy*C[1,2,0]+ D.dummy*D[1,2,3,0]

or:

U(alt1)=A.dummy[0.0001|0.0002]*A[1,2,0]+ B.dummy[0.0001|0.0002]*B[1,2,0]+ C.dummy[0.0001|0.0002]*C[1,2,0]+ D.dummy[-0.0001|-0.0002|-0.0003]*D[1,2,3,0]/
U(alt2)=A.dummy*A[1,2,0]+ B.dummy*B[1,2,0]+ C.dummy*C[1,2,0]+ D.dummy*D[1,2,3,0]/
U(alt3)=asc

2. Dummy coding all attributes, including numerical ones, for the pilot study is fine and often even recommended.

3/4. After estimating an MNL model (no need to estimate a random parameter model because you do not have enough sample size and optimising for a random parameter model is practically infeasible) you can transfer the parameter estimate and the standard error as a Bayesian prior into the utility functions as follows: (n,beta,se), where beta is the parameter estimate and se is the standard error. You would also need to change to ;eff = (mnl,d,mean), or (mnl,d,median) if the standard errors are large. For the cost attribute I would generally estimate a single coefficient and then no longer use dummy coding to generate an experimental design for the main study. But if the design has only specific attribute combinations across alt1 and alt2, for example always 3 versus 0 and 1 versus 2, then you could again use dummy coding to generate the design for the main study.
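A minimal sketch of the transfer step described above, written in Python rather than Ngene. It formats an estimate and its standard error as a normal prior of the form (n,beta,se). The helper names `normal_prior` and `dummy_term` are hypothetical, and the example numbers are the pilot estimates reported later in this thread.

```python
def normal_prior(beta, se, digits=3):
    """Format one estimate as a normal Bayesian prior string: (n,beta,se)."""
    return f"(n,{beta:.{digits}f},{se:.{digits}f})"


def dummy_term(name, estimates, levels, digits=3):
    """Build a dummy-coded utility term with one prior per non-base level.

    estimates: (beta, se) pairs for the non-base levels, in the order listed.
    levels: all level codes as written in the design, base level last.
    """
    priors = "|".join(normal_prior(b, s, digits) for b, s in estimates)
    level_list = ",".join(str(v) for v in levels)
    return f"{name}.dummy[{priors}]*{name}[{level_list}]"


# Pilot estimates for attribute A (levels 1 and 2; level 0 is the base).
term_a = dummy_term("A", [(1.439732, 0.1767985), (2.01239, 0.1945652)], [1, 2, 0])
print(term_a)

# A small coefficient such as cost needs more decimals to avoid rounding to zero.
cost_prior = normal_prior(-0.0020909, 0.001212, digits=5)
print(cost_prior)
```

The number of decimals matters for the cost coefficient: with three decimals the prior mean would round to -0.002 and the standard error to 0.001.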

Michiel

by **Ethannn** » Wed Jun 11, 2025 11:47 am

Thanks so much for your help with this Michiel.

by **Ethannn** » Mon Jun 23, 2025 11:26 pm

Dear Michiel,
I hope this message finds you well. I recently conducted a pilot survey based on the Ngene code you kindly provided earlier, involving 72 respondents. Thank you again for your valuable guidance.
I now have a few follow-up questions that I would greatly appreciate your insights on:
1. When estimating the MNL model, should I use dummy coding or effects coding for the categorical attributes? I ran both specifications and obtained different results:
   - Using dummy coding, the MNL estimates are Model 1;
   - Using effects coding, the estimates are Model 2.
2. In Model 1, the signs of the attribute coefficients are all consistent with expectations, and most of them are statistically significant. However, the ASC coefficient is not significant. Will this affect the reliability of the model or the subsequent Bayesian D-efficient design? Based on Model 1, I prepared the following Ngene code for the Bayesian D-efficient design:
Design
;alts=alt1*, alt2*, alt3
;rows=24
;block=3
;eff=(mnl,d)
;model:
U(alt1) = ASC[0.219]+A [0.882]*A[1,2,0]
+ B [0.432]*B[1,2,0]
+ C [0.246]*C[1,2,0]
+ D [0.523]*D[1,2,0]
+ E[-0.00249]*E[50,100,150,200] /
U(alt2) = ASC[0.219]+A [0.882]*A[1,2,0]
+ B [0.432]*B[1,2,0]
+ C [0.246]*C[1,2,0]
+ D [0.523]*D[1,2,0]
+ E[-0.00249]*E[50,100,150,200] /
$

3. In Model 2, one of the attribute levels is not statistically significant. Would this non-significance compromise the suitability of the model for deriving prior parameters for a Bayesian design?
Design
;alts=alt1*, alt2*, alt3
;rows=24
;block=3
;eff=(mnl,d,mean)
;model:
U(alt1) = A.dummy[(n,1.440,0.177)|(n,2.012,0.195)]*A[1,2,0]
+ B.dummy[(n,0.912,0.182)|(n,0.890,0.158)]*B[1,2,0]
+ C.dummy[(n,0.434,0.169)|(n,0.754,0.169)]*C[1,2,0]
+ D.dummy[(n,0.111,0.164)|(n,0.981,0.154)]*D[1,2,0]
+ E[(n,-0.00209,0.00121)]*E[50,100,150,200] /
U(alt2) = A.dummy*A + B.dummy*B + C.dummy*C + D.dummy*D + E*E /
U(alt3) = ASC[(n,0.786,0.303)]
$

4. Regarding the cost attribute, the levels always appear in pairs (e.g., 50, 100, 150, 200). If I were to use dummy coding for cost, should I still assign non-zero priors to each dummy level in the Bayesian design (e.g., D.dummy[-0.0001|-0.0002|-0.0003]*D[1,2,3,0])?
More broadly, I am a bit uncertain about which model specification to use during estimation, and whether the coding scheme used in the design phase (e.g., dummy vs. effects coding) must align with the estimation model I will later apply. Any clarification would be extremely helpful.
Model 1
------------------------------------------------------------------------------
      choice | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
           A |   .8817433   .0871399    10.12   0.000     .7109523    1.052534
           B |   .4317625   .0778809     5.54   0.000     .2791187    .5844062
           C |   .2462197   .0752601     3.27   0.001     .0987126    .3937268
           D |   .5230531   .0759689     6.89   0.000     .3741568    .6719495
           E |  -.0024924   .0011808    -2.11   0.035    -.0048068    -.000178
         ASC |   .2186871   .2420185     0.90   0.366    -.2556605    .6930347
------------------------------------------------------------------------------

Model 2
------------------------------------------------------------------------------
      choice | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
          A1 |   1.439732   .1767985     8.14   0.000     1.093213    1.786251
          A2 |    2.01239   .1945652    10.34   0.000      1.63105    2.393731
          B1 |   .9119698   .1819027     5.01   0.000      .555447    1.268493
          B2 |   .8899421   .1584458     5.62   0.000      .579394     1.20049
          C1 |   .4338534   .1690078     2.57   0.010     .1026042    .7651026
          C2 |   .7543231   .1691705     4.46   0.000      .422755    1.085891
          D1 |    .111323   .1639312     0.68   0.497    -.2099764    .4326223
          D2 |   .9814282   .1540309     6.37   0.000     .6795332    1.283323
           E |  -.0020909    .001212    -1.73   0.084    -.0044664    .0002846
         ASC |   .7858964   .3030364     2.59   0.010     .1919561    1.379837
------------------------------------------------------------------------------
Many thanks!
Best wishes,
Ethannn

by **Michiel Bliemer** » Wed Jun 25, 2025 4:11 pm

You seem to be confused about dummy and effects coding.

Model 2 is using dummy coding (hence the ".dummy"). To use effects coding, you need to use ".effects" in the same way. Effects coding has a different interpretation but results in the same behavioural model with the same model fit.
Model 1 is not correct; you should not estimate categorical variables using design coding and a single coefficient. So attribute E is correctly added to the utility function in Model 1, but the other attributes are not. You MUST use dummy or effects coding for attributes A to D if they are categorical.
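The claim that dummy and effects coding describe the same behavioural model can be checked numerically. The Python sketch below converts the dummy coefficients of a three-level attribute into effects-coded ones and confirms that utility differences between levels are unchanged. It assumes the usual convention that the base level is coded -1 under effects coding; the function name `dummy_to_effects` is hypothetical, and the values are the Model 2 estimates for attribute A, rounded as in the design script.

```python
def dummy_to_effects(dummy_coefs):
    """Convert dummy coefficients (base level fixed at 0) to effects coding.

    Returns (effects_coefs, base_level_utility, shift), where shift is the
    amount that moves into the constant so that total utility is unchanged.
    """
    level_utils = list(dummy_coefs) + [0.0]       # base level has utility 0
    shift = sum(level_utils) / len(level_utils)   # mean utility over all levels
    effects = [d - shift for d in dummy_coefs]    # centred coefficients
    base_util = -sum(effects)                     # base level is coded -1
    return effects, base_util, shift


dummy_a = [1.440, 2.012]                          # levels 1 and 2; level 0 is the base
effects_a, base_a, shift_a = dummy_to_effects(dummy_a)

# Utility differences relative to the base level are identical in both codings.
diff_dummy = [d - 0.0 for d in dummy_a]
diff_effects = [e - base_a for e in effects_a]
print(effects_a, base_a, shift_a)
print(diff_dummy, diff_effects)
```

Only the reference point moves: the effects-coded level utilities sum to zero, and the constant absorbs the shift, so choice probabilities and model fit stay the same.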

So the choice is easy: you need to use Model 2.
For numerical attribute E, if attribute levels across alt1 and alt2 appear as (50,200) and (100,150) and you do not like this (even though it is statistically optimal), then simply dummy code the attribute and use the dummy coefficients as priors to generate the experimental design. Later on, when you estimate the model, you can simply estimate a single coefficient as you have done now.
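If only the single linear cost coefficient is available, dummy priors can be approximated from it. This is an assumption of the sketch below, not something prescribed in the reply, which refers to estimated dummy coefficients. The Python code takes the linear estimate for E from Model 2, treats the last listed level (200) as the base, and scales both the coefficient and its standard error by the distance from the base. The function name `linear_to_dummy_priors` is hypothetical.

```python
def linear_to_dummy_priors(beta, se, levels):
    """Approximate dummy-coded priors from one linear coefficient.

    The last entry of `levels` is taken as the base level (prior fixed at 0).
    Each non-base prior mean is beta * (level - base); its standard error
    scales with the absolute distance from the base level.
    """
    base = levels[-1]
    return [(beta * (lvl - base), se * abs(lvl - base)) for lvl in levels[:-1]]


# Linear cost estimate from the pilot (attribute E in Model 2).
priors = linear_to_dummy_priors(-0.00209, 0.00121, [50, 100, 150, 200])
for level, (mean, sd) in zip([50, 100, 150], priors):
    print(level, round(mean, 4), round(sd, 4))
```

The prior means come out positive because each cheaper level is preferred to the base of 200, and they preserve the preference order 50 > 100 > 150 > 200.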

It is not a problem if a parameter is not statistically significant during the pilot phase, since with small sample sizes this often happens. As long as the sign and preference order of the attribute levels makes sense, it should be fine.

Note that you need to use a larger number of draws than the default 200 Halton draws. I would suggest using ;bdraws = sobol(1000) or even more draws if you have so many Bayesian priors (you currently have 10 and this would increase if you also dummy code attribute E).

Michiel

by **Ethannn** » Wed Jun 25, 2025 10:48 pm

Dear Michiel,
Hello, thank you very much for your answer; I now understand the method much better. If I do not mind that the cost levels always appear in the same pairs, could you confirm whether there are any problems with the following code? In addition, I expected the ASC coefficient to be negative, but the pilot estimate is positive. Will this have any impact? Sorry to trouble you!

Design
;alts=alt1*, alt2*, alt3
;rows=24
;block=3
;eff=(mnl,d,mean)
; bdraws = sobol(1000)
;model:
U(alt1) = A.dummy[(n,1.440,0.177)|(n,2.012,0.195)]*A[1,2,0]
+ B.dummy[(n,0.912,0.182)|(n,0.890,0.158)]*B[1,2,0]
+ C.dummy[(n,0.434,0.169)|(n,0.754,0.169)]*C[1,2,0]
+ D.dummy[(n,0.111,0.164)|(n,0.981,0.154)]*D[1,2,0]
+ E[(n,-0.00209,0.00121)]*E[50,100,150,200] /
U(alt2) = A.dummy*A + B.dummy*B + C.dummy*C + D.dummy*D + E*E /
U(alt3) = ASC[(n,0.786,0.303)]
$

Many thanks!
Best wishes,
Ethannn

by **Michiel Bliemer** » Thu Jun 26, 2025 7:59 pm

The script looks good.
I don't think you can say that a negative ASC is expected, since the ASC in this case is not easy to interpret (since the attributes across alt1/alt2 versus alt3 are different). A positive ASC does NOT mean that alt3 is more preferred.
If you inspect the choice probabilities of the design in Ngene, you will see that with these priors the choice probability of alt3 is on average ~10%. If that sounds reasonable to you, then I don't see any issue.
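A rough cross-check of the alt3 share can be done outside Ngene. The Python sketch below plugs the prior means from the script above into the MNL formula. It is only an approximation: it uses the full factorial of attribute levels rather than the 24 rows Ngene actually selects, and the names `utility` and `prob_alt3` are hypothetical. Evaluated at the average utility of alt1 and alt2, the alt3 probability comes out at about 0.10.

```python
import math
from itertools import product

# Prior means from the script above; level 0 is the base of each dummy-coded attribute.
DUMMY = {
    "A": {1: 1.440, 2: 2.012, 0: 0.0},
    "B": {1: 0.912, 2: 0.890, 0: 0.0},
    "C": {1: 0.434, 2: 0.754, 0: 0.0},
    "D": {1: 0.111, 2: 0.981, 0: 0.0},
}
COST_BETA = -0.00209
COST_LEVELS = [50, 100, 150, 200]
ASC_ALT3 = 0.786


def utility(profile):
    """Utility of alt1 or alt2 for a profile (a, b, c, d, cost)."""
    *codes, cost = profile
    return sum(DUMMY[k][v] for k, v in zip("ABCD", codes)) + COST_BETA * cost


def prob_alt3(u1, u2):
    """MNL probability of choosing alt3 given the utilities of alt1 and alt2."""
    e1, e2, e3 = math.exp(u1), math.exp(u2), math.exp(ASC_ALT3)
    return e3 / (e1 + e2 + e3)


profiles = list(product([1, 2, 0], [1, 2, 0], [1, 2, 0], [1, 2, 0], COST_LEVELS))
utils = [utility(p) for p in profiles]

# Evaluate at the average utility of alt1 and alt2 (every level equally frequent).
mean_u = sum(utils) / len(utils)
p_at_mean = prob_alt3(mean_u, mean_u)

# Average over all pairs of profiles in the full factorial.
p_avg = sum(prob_alt3(u1, u2) for u1 in utils for u2 in utils) / len(utils) ** 2
print(round(p_at_mean, 3), round(p_avg, 3))
```

The second number averages over every possible pairing of profiles, including pairings an efficient design would avoid, so the probabilities reported by Ngene for the generated design remain the figure to rely on.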

Michiel

by **Ethannn** » Thu Jun 26, 2025 9:44 pm

Thank you very much for your answer. It has been of great help to me.

Ethannn

Pilot study & Bayesian D-efficient design
