choice-metrics.com

by **bobby1994** » Fri Apr 18, 2025 1:06 am

Dear professor,

Below is the code I use for my choice experiment. I have a few questions:

1) I want to include a variable for BIKE and PE. For BIKE it is called "access_BIKE_WAT" and it should have a fixed value of 3. How do I code this? The variable applies only to the BIKE alternative. PE should have a similar variable, "access_PE_WAT".

2) For a few variables I have defined a zero prior, since no value can be found in the literature. I see that including priors and using wider attribute level ranges influences the S-estimate, i.e. the number of respondents. Do you have to reach the number of respondents (S-estimate) indicated by Ngene to obtain statistically significant estimates?

3) Not every alternative has a cost, but every alternative does have a travel time. How does the analyst usually handle trade-offs in such situations?

4) In the code below I have defined the D-efficient design. I wonder when you can add the "alg = mfederov" command? When I added it, I saw that the evaluation is much faster and the D-error gets much lower than without this command.

5) Is the code fine?

design
;alts = PE, BTM, BIKE, WALKING
;rows = 12
;eff = (mnl, d)

;model:

? PE = Private e-scooter, BTM = Bus, Tram, metro, BIKE = Bicycle, WALKING = walking

? for WALKING, the asc is removed, because this is the reference alternative

U (PE) = asc1[0]
+ a1[(n,-0.03,0.15)] * access_PE_TT[4,8,12] ? TT = travel time
+ a2[(n,-0.09,0.15)] * egress_PE_TT[4,8,12]
+ a3[0].dummy[0|0]* com_PE[1,2,0] ? com = comfort in train, 1 = stored 0 = holding

/

U(BTM)= asc2[0]
+ b1[-0.11] * access_BTM_WKT[2,4,6]
+ b2[-0.084] * access_BTM_TT[3,6,9]
+ b3[-0.073] * access_BTM_WAT[3,6,9]
+ b4[-0.207] * access_BTM_TC[1, 2, 3] ? TC = travel cost
/

U(BIKE)= asc3[0]
+ c1[-0.095] * access_BIKE_TT[4,8,12]
+ c2[-0.069] * access_BIKE_PST[1,3,5]
+ c3[-0.073] * access_BIKE_WAT[3]
/

U(WALKING)= d1[-0.095] * access_W_TT[11,15,19]

$

by **Michiel Bliemer** » Fri Apr 18, 2025 9:55 am

1) You can include as you have done using access_BIKE_WAT[3] and access_PE_WAT[3]. HOWEVER, since these are constants, and your coefficients are alternative-specific, you will not be able to estimate the model because it is not identified since you already have a constant in each alternative. You cannot add multiple constants in a utility function. This is the reason why you get "Undefined" as a D-error when running this script. You have two options: (i) you need to use generic coefficients for the WAT attributes across the alternatives, so using b3 also for PE and BIKE, this will make your model identified again, (ii) remove the WAT attributes from PE and BIKE because it will be absorbed in the alternative specific constants asc1 and asc3.

2) Sample size estimates rely on ALL parameters to have an informative prior. If some are set to zero, such as the constants, then the probabilities will deviate from the true probabilities and therefore the sample size estimates will also deviate from the true ones. I would not rely on the S-estimates in this case, especially since you also took the other priors from other studies. I generally only use priors from a pilot study to interpret S-estimates as otherwise they may not have much meaning.

3) I am not sure what you mean, is this an Ngene question? In a choice model, decision makers make trade-offs across all attributes in each alternative, and these attributes can differ across alternatives. If you are referring to willingness-to-pay calculations, you can use a generic cost coefficient across all alternatives (which does not need to appear in each alternative) and compute the value-of-time etc based on this generic cost coefficient.

4) Sure, you can add ;alg = mfederov, but this algorithm will not guarantee attribute level balance, so some of your attribute levels may not appear in the design. I would recommend using the swapping algorithm. I would also recommend increasing the number of rows to e.g. 24, together with ;block = 2, to get more variation in your data, and to also allow Ngene more flexibility in finding a more efficient design.

5) It looks fine, but I note that you specify Bayesian priors for a1 and a2 but are not generating a Bayesian efficient design. You can do this by using ;eff = (mnl,d,mean). This will use 200 Halton draws, or you can specify something like ;bdraws = sobol(200) or ;bdraws = gauss(3). Note that your standard deviation of 0.15 is very large compared to a parameter value of -0.03 and -0.09, which will result in many draws being positive rather than negative. You need to be careful in setting prior values as you could end up with a very INefficient design.
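To make the remark about the standard deviation in point 5 concrete, here is a minimal sketch that computes how much of each normal prior lies above zero. It assumes only that the priors are the normal distributions written in the script, (n,-0.03,0.15) for a1 and (n,-0.09,0.15) for a2; the function name share_positive is made up for this illustration and is not part of Ngene.

```python
import math


def share_positive(mean: float, sd: float) -> float:
    """Return P(X > 0) for X ~ Normal(mean, sd)."""
    # P(X > 0) = Phi(mean / sd), where Phi is the standard normal CDF.
    z = mean / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Priors as written in the original script: (mean, standard deviation).
priors = {"a1": (-0.03, 0.15), "a2": (-0.09, 0.15)}

shares = {name: share_positive(mean, sd) for name, (mean, sd) in priors.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%} of the prior mass is positive")
```

With these priors roughly 42% of the mass for a1 and 27% for a2 is positive, so a large share of the Bayesian draws would imply that longer travel time increases utility.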

Michiel
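The identification problem from point 1 can be illustrated with a small sketch. It takes the BIKE utility function from the script and shows that two different parameter sets give exactly the same utility for every design row, because access_BIKE_WAT is always 3 and can be traded off against asc3. The function utility_bike and the two parameter sets are hypothetical and only serve this illustration.

```python
import itertools
import math


def utility_bike(asc3, c1, c2, c3, tt, pst, wat=3.0):
    """Systematic utility of BIKE; WAT is fixed at 3 in every choice task."""
    return asc3 + c1 * tt + c2 * pst + c3 * wat


# Parameter set A: the priors from the script.
set_a = {"asc3": 0.0, "c1": -0.095, "c2": -0.069, "c3": -0.073}

# Parameter set B: move the whole WAT effect into the constant.
set_b = dict(set_a)
set_b["asc3"] = set_a["asc3"] + 3.0 * set_a["c3"]
set_b["c3"] = 0.0

# Compare utilities over all combinations of the varying attribute levels.
levels_tt = [4, 8, 12]
levels_pst = [1, 3, 5]
identical = all(
    math.isclose(
        utility_bike(tt=tt, pst=pst, **set_a),
        utility_bike(tt=tt, pst=pst, **set_b),
    )
    for tt, pst in itertools.product(levels_tt, levels_pst)
)
print("Utilities identical for both parameter sets:", identical)
```

Since the data cannot distinguish set A from set B, asc3 and c3 cannot both be estimated, which is consistent with the "Undefined" D-error mentioned above.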

by **bobby1994** » Mon Apr 21, 2025 10:10 pm

Dear professor,

Thank you for your reply.

I have a question about your first answer. I use alternative-specific coefficients, not generic ones. So that means I cannot estimate a coefficient for the constants access_BIKE_WAT[3] and access_PE_WAT[3]. When presenting the choice tasks to respondents, is it OK to show the constant value of 3 for BIKE WAT?
The reason for showing the constant is that the BTM alternative has a WAT as well, and that one is varied.

And can you also have Bayesian priors for ASCs?

I see that when I use a Bayesian design, the evaluation is very slow. Is there a way to speed up the generation?

I was wondering what the difference is between having 12 rows without blocks and 12 rows with 2 blocks. I want to use 12 rows without blocks, but that might be too much for a respondent, so I thought of showing 6 random rows to each respondent. What influence will it have on the number of respondents needed if I have 12 rows without blocks and show only 6 random rows?

design
;alts = PE, BTM, BIKE, WALKING
;rows = 12
;eff = (mnl, d, mean)

;model:

? PE = Private e-scooter, BTM = Bus, Tram, metro, BIKE = Bicycle, WALKING = walking

? for WALKING, the asc is removed, because this is the reference alternative

U (PE) = asc1[(n,-0.5, 0.7)]
+ a1[(n,-0.09,0.05)] * egress_PE_TT[4,8,12]
+ a2[(n,-0.09,0.05)] * egress_PE_PST[1,3,5]
/

U(BTM)= asc2[-0.165]
+ b1[-0.11] * egress_BTM_WTB[3,6,9] ? WTB = waiting time BTM
+ b2[-0.084] * egress_BTM_TT[3,6,9]
+ b3[-0.073] * egress_BTM_WTD[2,4,6] ?WTD = walking time destination
+ b4[-0.207] * egress_BTM_TC[1, 2, 3] ? TC = travel cost
/

U(BIKE)= asc3[0.055]
+ c1[-0.095] * egress_BIKE_TT[4,8,12]
+ c2[-0.069] * egress_BIKE_PST[1,3,5] ? PST = parking search time
/

U(WALKING)= d1[-0.095] * egress_W_TT[12,18,24] ? W = walking

$

by **Michiel Bliemer** » Tue Apr 22, 2025 1:29 pm

Yes you can show the values of 3 to respondents, but their impact on utility would then be consumed by the alternative-specific constant unless you use a generic coefficient. Having it included in the constant is fine, you will just not be able to disentangle the impact of WAT and the mode-specific label.

Yes you can have Bayesian priors for constants.

You can speed up the generation by using fewer draws. By default Ngene uses 200 Halton draws. You could add ;bdraws = sobol(100) or halton(100). Since you only have 3 Bayesian priors, Gaussian quadrature would work well. You could use ;bdraws = gauss(3), which will use 3 abscissas per Bayesian prior. So if you have 3 Bayesian priors, it will create 3*3*3 = 27 draws, which should be much faster than 200 Halton draws.
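The 3*3*3 = 27 count can be reproduced with a short sketch that builds a tensor-product quadrature grid for the three Bayesian priors in the script above. It assumes a standard three-point Gauss-Hermite rule for a normal distribution (abscissas at the mean and at the mean plus or minus sqrt(3) standard deviations, with weights 2/3 and 1/6); whether Ngene uses exactly these abscissas and weights is not stated in this thread, so treat the numbers as illustrative.

```python
import itertools
import math

# Three-point Gauss-Hermite rule for a standard normal (assumed, see lead-in).
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)

# Bayesian priors from the script: (mean, standard deviation).
priors = {"asc1": (-0.5, 0.7), "a1": (-0.09, 0.05), "a2": (-0.09, 0.05)}


def quadrature_grid(priors):
    """Return a list of (parameter values, weight) over all abscissa combinations."""
    names = list(priors)
    per_param = []
    for name in names:
        mean, sd = priors[name]
        # Shift and scale the standard-normal abscissas to this prior.
        per_param.append([(mean + sd * z, w) for z, w in zip(NODES, WEIGHTS)])
    grid = []
    for combo in itertools.product(*per_param):
        values = {name: value for name, (value, _) in zip(names, combo)}
        weight = math.prod(w for _, w in combo)
        grid.append((values, weight))
    return grid


grid = quadrature_grid(priors)
total_weight = sum(weight for _, weight in grid)
print(len(grid), "draws, total weight", round(total_weight, 6))
```

The grid grows as 3 to the power of the number of Bayesian priors, which is why quadrature is attractive for a handful of random priors and becomes impractical for many.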

Blocking does not influence design efficiency, so you can always add ;block = 2. Blocking will try to achieve some attribute level balance within each block, but cannot be achieved perfectly unless using an orthogonal design. It is fine to show 6 random rows to each respondent. If you block the design in 2 blocks of 6, you of course need twice as many respondents to capture the same amount of information.
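The "twice as many respondents" statement is simple proportional reasoning, sketched below. The number of respondents for the full 12-row design is a hypothetical placeholder, not a value from this thread, and the helper respondents_needed is made up for the illustration. The second part simulates showing 6 random rows to each respondent and counts how often each design row is observed.

```python
import math
import random


def respondents_needed(n_full_design: int, rows_total: int, rows_shown: int) -> int:
    """Scale a respondent count so every design row keeps the same number of observations."""
    return math.ceil(n_full_design * rows_total / rows_shown)


n_full = 100  # hypothetical: respondents needed if everyone answers all 12 rows
needed = respondents_needed(n_full, rows_total=12, rows_shown=6)
print("Respondents needed when showing 6 of 12 rows:", needed)

# Simulate random assignment of 6 out of 12 rows to each respondent.
rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [0] * 12
for _ in range(needed):
    for row in rng.sample(range(12), 6):
        counts[row] += 1
print("Observations per row:", counts)
```

With random assignment each row is observed about as often as in the full design on average, but not exactly; fixed blocks of 6 give each row exactly the same number of observations when respondents are spread evenly over the blocks.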

Michiel

by **bobby1994** » Thu Apr 24, 2025 12:45 am

Dear professor,

Thank you for the reply.

I have two last questions.

I will ask each respondent questions about two trips, so I designed two experiments.

1) When estimating the coefficients, can you simply combine the two datasets and estimate MNL, nested logit, and ML models?

2) I was thinking of maybe designing one experiment containing all the variables of both trips. After Ngene has generated the design, I would separate the two trips and show them to the respondent. In terms of estimating the coefficients, what is the difference between generating two separate designs and generating one design that includes all the variables?

by **Michiel Bliemer** » Fri Apr 25, 2025 9:43 am

1) If the two choice experiments have some of the same attributes and coefficients, then you can pool the data and estimate a heteroskedastic choice model or a nested logit model. You would have two choices, and for each choice different alternatives and attributes may be available depending on your choice experiment, which is usually handled by indicating "availability" in the dataset. I refer to model estimation software such as Apollo and Biogeme on how to do this. And of course you can also make some coefficients random and estimate mixed logit models, or latent class models. If the choice experiments do not have the same attributes/coefficients, then you would estimate two separate models. You will always start by estimating two separate models anyway, even if you intend to join the data later. For further questions about model estimation I refer to the Apollo and Biogeme forums.

2) With two separate designs you capture 2 choices, with one design you capture only a single choice. So you would model them differently. You may be able to estimate all coefficients, but having a lot of attributes in a single choice task may be quite complex for a respondent and increase error variance.

Michiel

trade-offs and design efficiency
