choice-metrics.com

by **bartoszbursa** » Tue Mar 25, 2025 2:13 am

Hello everyone,

I would need some advice on the following three points:

(1) I have an external design that I want to evaluate in Ngene. The design was created in Sawtooth, and is some sort of a nearly-orthogonal design, with balanced attribute-levels. The design is very large: 3600 choice tasks (rows) grouped in 300 versions (12 tasks in each) that are equivalent to blocks in Ngene language I guess. In the survey, each respondent is given one set of choice tasks (12) randomly selected from among these 300 versions. I don't know more, since - as is known - Sawtooth is a bit of a black box and the documentation is very meager.

The external design is prepared according to the information in Ngene's manual - first column with design number, second with tasks, further columns with attributes appearing as in Ngene syntax. The syntax is for an unlabeled design, with all attributes dummy coded, the model assumed is MNL with zero priors. All I want to get is a D-error for this design so that I can compare it with the alternative one made in Ngene for the same model. Unfortunately, even though Ngene loads the file and everything indicates that the file structure, etc. is correct, Ngene returns: D-error = undefined. I've tried a number of ways to make it work - reducing the external design to several dozen rows, adding/removing asterisks from alternatives, adding/removing blocking.

What could be the source of this problem? Is there anything more to care about except for what is mentioned in the manual (file structure, no syntax like alg; etc.)?
I can provide code if needed.

(2) This is less about Ngene, and more about experimental design. We typically work with designs that have from a few (say 8) to a few dozen rows (say 60), which are then blocked if too large for one person. The number of rows must be larger than the degrees of freedom, and we also try to make attribute levels balanced. And we do all this under the assumption that each respondent receives the same set of choice tasks (or one of the blocks). Somewhat inspired by Sawtooth's approach (which is unclear to me due to the sparse documentation in this regard), I am wondering whether it could make sense, and what difference it could potentially make, if each respondent received a unique set of choice tasks. This would of course lead to a very large design. For instance, knowing that we have 100 respondents (e.g., when ordering an online panel study), we make a design with 1000 rows, 100 blocks of 10 choice tasks, and each of the 100 respondents gets one of those blocks. Wouldn't this lead to higher efficiency and more variation in the data?

Or to put it the other way around, what do we gain from having a fixed design and showing the same thing to every respondent?

(3) I have two dummy variables (transfers and delay) with multiple levels. I want to interact both and am confused about what to do with the base levels. Is the code below, with base levels skipped, correct?

Code:
+ b_transfers.dummy[0|0] * transfers[2,3,1] ? transfers dummy: 1 = 0 transfers (base), 2 = 1 transfer, 3 = 2 transfers
+ b_delay.dummy[0|0|0] * delay[2,3,4,1] ? delay dummy: 1 = on time (base), 2 = 1h, 3 = 2h, 4 = 3h
+ i_tra_del_1[0] * transfers.dummy[2] * delay.dummy[2]
+ i_tra_del_2[0] * transfers.dummy[2] * delay.dummy[3]
+ i_tra_del_3[0] * transfers.dummy[2] * delay.dummy[4]
+ i_tra_del_4[0] * transfers.dummy[3] * delay.dummy[2]
+ i_tra_del_5[0] * transfers.dummy[3] * delay.dummy[3]
+ i_tra_del_6[0] * transfers.dummy[3] * delay.dummy[4]

Thank you so much,
Bartosz

by **Michiel Bliemer** » Tue Mar 25, 2025 12:52 pm

(1)
It is difficult to say why the D-error is undefined. It could be because the Sawtooth design has correlated main effects, or it could be that in your utility functions you have specified correlated interaction effects.
Blocking has no influence on the D-error. Removing rows would only make the D-error worse. Adding/removing asterisks will not impact the D-error, and neither does the algorithm (which is for design generation only).

The best way to find out is usually by looking at the covariance (AVC) matrix and the Fisher information matrix. If the D-error is Undefined (essentially infinite), this is usually caused by coefficients not receiving any Fisher information or when there exists multicollinearity. In the latter, you would see very large values in rows/columns for one or more parameters. You could then inspect the corresponding columns in the design matrix to ensure that they are not perfectly correlated.
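To make the link between multicollinearity and an undefined D-error concrete, here is a minimal Python sketch. It assumes an MNL model with zero priors (so every alternative in a choice set has probability 1/J) and a tiny made-up design; the function names are hypothetical, and Ngene's own scaling of the D-error may differ from the plain `det(I)^(-1/K)` used here.

```python
from fractions import Fraction

def mnl_information(choice_sets):
    """Fisher information of an MNL model with all priors equal to zero.

    choice_sets: list of choice sets, each a list of attribute rows
    (one row of coded values per alternative).
    """
    k = len(choice_sets[0][0])
    info = [[Fraction(0)] * k for _ in range(k)]
    for alts in choice_sets:
        j = len(alts)
        # Zero priors: every alternative is chosen with probability 1/J.
        mean = [Fraction(sum(a[c] for a in alts), j) for c in range(k)]
        for a in alts:
            d = [a[c] - mean[c] for c in range(k)]
            for r in range(k):
                for c in range(k):
                    info[r][c] += Fraction(1, j) * d[r] * d[c]
    return info

def determinant(matrix):
    """Exact determinant by Gaussian elimination on Fractions."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if m[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det
        det *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return det

def d_error(choice_sets):
    """det(I)^(-1/K), or None when the information matrix is singular."""
    info = mnl_information(choice_sets)
    det = determinant(info)
    if det == 0:
        return None  # no inverse exists, so the D-error is undefined
    return float(det) ** (-1.0 / len(info))

# Two attributes that vary independently: the model is identified.
identified = [[[1, 0], [0, 1]], [[1, 1], [0, 0]]]
# Two attributes that always take the same value: perfectly correlated.
collinear = [[[1, 1], [0, 0]], [[0, 0], [1, 1]]]

print(d_error(identified))
print(d_error(collinear))
```

In the second toy design the two columns are identical, so the information matrix has no inverse, which is the situation described above.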

I am happy to have a look if you like. You can send the script and the spreadsheet to me (private message here on the forum or by email).

(2)
Yes you can make very large designs, there is usually nothing wrong with that. But if the design is sufficiently large (say 100 rows), then you will not be able to squeeze out more efficiency per choice task by increasing it to 1000. So there is usually no need to create extremely large designs. Large designs usually also contain not-so-smart choice tasks because it is much harder to optimise such a large design and ensure that each choice task is optimised (you would need to run the script for a very long time). So while more variation in the data is usually a good thing, the efficiency attached to a single choice task is expected to decrease when adding more and more choice tasks. Since blocking cannot be done perfectly (unless you are using an orthogonal design), you could also choose to let go of blocks and simply randomly select choice tasks for each respondent from the design of 100 rows.
If you do want to optimise a design with 1000 rows, an optimisation strategy that I have sometimes adopted is to generate 10 designs in parallel (you can start Ngene multiple times and use different cores in your computer), each with 100 rows. It may be easier to optimise 10 designs of 100 rows in parallel than to optimise a single design of 1000 rows.

(3)
Yes your script is correct. You need to leave out one of the levels in the interactions as otherwise your model becomes undefined. You can leave out the base levels (which is most common), but you could also have left out another level. You can include 2 transfer levels and 3 delay levels in your interactions.
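Since such interaction terms quickly become long, they can be generated rather than typed by hand. The following is a minimal Python sketch that prints the six terms from the question, one for each pair of non-base levels; the helper name `interaction_terms` is hypothetical and not part of Ngene.

```python
def interaction_terms(prefix, attr_a, levels_a, attr_b, levels_b):
    """Build Ngene-style interaction terms for every pair of listed levels.

    Only the non-base levels are passed in, so one level of each
    attribute is left out of the interactions.
    """
    terms = []
    for a in levels_a:
        for b in levels_b:
            n = len(terms) + 1
            terms.append(
                f"+ {prefix}_{n}[0] * {attr_a}.dummy[{a}] * {attr_b}.dummy[{b}]"
            )
    return terms

# transfers: base level 1 omitted; delay: base level 1 omitted.
terms = interaction_terms("i_tra_del", "transfers", [2, 3], "delay", [2, 3, 4])
print("\n".join(terms))
```

With 2 transfer levels and 3 delay levels this yields 2 x 3 = 6 interaction parameters.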

Michiel

by **bartoszbursa** » Fri Mar 28, 2025 12:59 am

Michiel,

(Re3) thanks, now it is clear.

(Re2) OK, I think I get your point. But when randomly selecting 10 rows from the design of 100 rows, I need to make sure that each row is selected (without replacement) an equal number of times, right? That leads essentially to the same result as blocking. I can't show some rows many times and some not at all if I want to retain the calculated efficiency, which is valid only for the whole design (all rows used). Or am I wrong here?

(Re1) Please see the code below.

The external design is here on my Google Drive:
https://drive.google.com/file/d/13AsEq-2fEMAXEkhcW4hzSDU5-1NRtqqV/view?usp=sharing
Have a look at Sawtooth's report on the design too:
https://docs.google.com/spreadsheets/d/17hKD1Ze8NBUq1nnZEEn3OoijrL0b7qRy/edit?usp=sharing&ouid=100486506683057714552&rtpof=true&sd=true

Now some explanations before you look at the code:
1. We want to study the preferences for transportation options in long-distance travel.

2. There are 3 alternatives to choose from, plus the none option. They are described by mode attributes (time, cost, transfers, delay), access/egress modes, and mobility services offered at the destination (bus frequency, distance to the bus stop, facilities at the hotel).

3. The major attribute is mode, which can be: car, rail, and bus (coach). On rail and bus, you have the first mile (access) and last mile (egress) segments. These are described by a single attribute (FM, LM) with 11 levels, which are a combination of the feeder mode, time, and cost. Why not split it into 3 separate attributes? Because we wanted to have it in a single line in the survey, and this is the only way to implement it in our survey tool.

4. The design is unlabeled. I am aware that is not the optimal solution for a mode choice study, but it is the legacy traditional approach of my colleagues from marketing/economics, and the standard approach implemented in Sawtooth, and they want to preserve comparability between the tools in this project.

5. Everything is dummy coded in Ngene - under the premise that there may be non-linearities in preferences to some attributes, and we do not want to make any assumptions about that (e.g., that there is a linear reaction to cost or number of transfers). In Sawtooth, everything was effects-coded. Does it make a difference for the evaluation?

6. We want to have FM, LM, and transfers alternative(mode)-specific. Since the design is unlabeled, we interact these attributes with mode. Paired with dummy coding, this leads to a long code. Sorry for that.

7. There is no First and Last Mile when driving, and no transfers, so we added an additional level = 0 to these attributes, which applies when mode == car. This is controlled by the if() conditions at the beginning of the code. Although I am not really sure about whether this is correct. I can't find this topic in the forum now, but I remember your response saying that setting an attribute level to 0 is not the same as not having the attribute. What is the statistical explanation behind that? Is it very wrong, or is it just making the design less efficient?

I can't generate a new design in Ngene (no valid design in first 10 min, Ngene sometimes crashes after 30 min), given the assumptions above, nor can I evaluate the external design from Sawtooth (D-error is undefined).
The Fisher Information matrix looks fine. No extraordinarily large values there. But there are NaNs and huge numbers in AVC.

Code:
Design
;alts = alt1*, alt2*, alt3*, none
;rows = 3600 ? 3600 rows in external design
? ;block = 300 ? 300 blocks in external design (ignore for now)
;eff = (mnl,d)
;eval = balanced_overlap_design_for_ngene.csv
;cond: ? conditions (AKA: prohibitions) for the swapping algorithm
if(alt1.mode = 2, alt1.fm = 0),
if(alt2.mode = 2, alt2.fm = 0),
if(alt3.mode = 2, alt3.fm = 0),
if(alt1.mode = 2, alt1.transfers = 0),
if(alt2.mode = 2, alt2.transfers = 0),
if(alt3.mode = 2, alt3.transfers = 0),
if(alt1.mode = 2, alt1.lm = 0),
if(alt2.mode = 2, alt2.lm = 0),
if(alt3.mode = 2, alt3.lm = 0)
;model:
U(alt1) = b_mode.dummy[0|0] * mode[1,3,2] ? main mode dummy: 1 = rail, 2 = car (base), 3 = bus
+ b_fm.dummy[0|0|0|0|0|0|0|0|0|0|0] * fm[0,2,3,4,5,6,7,8,9,10,11,1] ? First Mile dummy: 0 = needed if main mode == car since there is no FM when driving, 1 = walk 5m (base), 2 = walk 15m, 3 = Taxi 5m 12€, 4 = Taxi 15m 30€, 5 = Taxi 30m 50€, 6 = PT 10m 0€, 7 = PT 30m 0€, 8 = PT 50m 0€, 9 = PT 10m 3€, 10 = PT 30m 6€, 11 = PT 50m 9€
+ i_fm_mode_1[0] * fm.dummy[2] * mode.dummy[1] ? FM dummies interacted with main mode (we want to have coefficients for each level of FM separately for rail and bus as main mode). No interaction for base levels and for FM = 0.
+ i_fm_mode_2[0] * fm.dummy[2] * mode.dummy[3]
+ i_fm_mode_3[0] * fm.dummy[3] * mode.dummy[1]
+ i_fm_mode_4[0] * fm.dummy[3] * mode.dummy[3]
+ i_fm_mode_5[0] * fm.dummy[4] * mode.dummy[1]
+ i_fm_mode_6[0] * fm.dummy[4] * mode.dummy[3]
+ i_fm_mode_7[0] * fm.dummy[5] * mode.dummy[1]
+ i_fm_mode_8[0] * fm.dummy[5] * mode.dummy[3]
+ i_fm_mode_9[0] * fm.dummy[6] * mode.dummy[1]
+ i_fm_mode_10[0] * fm.dummy[6] * mode.dummy[3]
+ i_fm_mode_11[0] * fm.dummy[7] * mode.dummy[1]
+ i_fm_mode_12[0] * fm.dummy[7] * mode.dummy[3]
+ i_fm_mode_13[0] * fm.dummy[8] * mode.dummy[1]
+ i_fm_mode_14[0] * fm.dummy[8] * mode.dummy[3]
+ i_fm_mode_15[0] * fm.dummy[9] * mode.dummy[1]
+ i_fm_mode_16[0] * fm.dummy[9] * mode.dummy[3]
+ i_fm_mode_17[0] * fm.dummy[10] * mode.dummy[1]
+ i_fm_mode_18[0] * fm.dummy[10] * mode.dummy[3]
+ i_fm_mode_19[0] * fm.dummy[11] * mode.dummy[1]
+ i_fm_mode_20[0] * fm.dummy[11] * mode.dummy[3]
+ b_tt.dummy[0|0] * tt[2,3,1] ? time dummy: 1 = 7h (base), 2 = 8h, 3 = 9h
+ b_transfers.dummy[0|0|0] * transfers[0,2,3,1] ? transfers dummy: 0 = needed if main mode == car, since there are no transfers when driving, 1 = 0 transfers, 2 = 1 transfer, 3 = 2 transfers
+ i_tra_mode_1[0] * transfers.dummy[2] * mode.dummy[1] ? transfers dummies interacted with main mode (to get cofficients for each number of transfers separately for rail and bus). No interaction for base levels and for transfers = 0.
+ i_tra_mode_2[0] * transfers.dummy[2] * mode.dummy[3]
+ i_tra_mode_3[0] * transfers.dummy[3] * mode.dummy[1]
+ i_tra_mode_4[0] * transfers.dummy[3] * mode.dummy[3]
+ b_delay.dummy[0|0|0] * delay[2,3,4,1] ? delay dummy: 1 = on time (base), 2 = 20% 1h, 3 = 20% 2h, 4 = 20% 3h
+ i_tra_del_mode_1[0] * transfers.dummy[2] * delay.dummy[2] * mode.dummy[1] ? transfer dummies interacted with delay dummies (to see whether delay has a larger effect if there are more transfers or maybe not) for each mode separately (that is interacted with mode). No interaction for base levels and for transfers = 0.
+ i_tra_del_mode_2[0] * transfers.dummy[2] * delay.dummy[2] * mode.dummy[3]
+ i_tra_del_mode_3[0] * transfers.dummy[2] * delay.dummy[3] * mode.dummy[1]
+ i_tra_del_mode_4[0] * transfers.dummy[2] * delay.dummy[3] * mode.dummy[3]
+ i_tra_del_mode_5[0] * transfers.dummy[2] * delay.dummy[4] * mode.dummy[1]
+ i_tra_del_mode_6[0] * transfers.dummy[2] * delay.dummy[4] * mode.dummy[3]
+ i_tra_del_mode_7[0] * transfers.dummy[3] * delay.dummy[2] * mode.dummy[1]
+ i_tra_del_mode_8[0] * transfers.dummy[3] * delay.dummy[2] * mode.dummy[3]
+ i_tra_del_mode_9[0] * transfers.dummy[3] * delay.dummy[3] * mode.dummy[1]
+ i_tra_del_mode_10[0] * transfers.dummy[3] * delay.dummy[3] * mode.dummy[3]
+ i_tra_del_mode_11[0] * transfers.dummy[3] * delay.dummy[4] * mode.dummy[1]
+ i_tra_del_mode_12[0] * transfers.dummy[3] * delay.dummy[4] * mode.dummy[3]
+ b_cost.dummy[0|0|0|0] * cost[2,3,4,5,1] ? cost dummy: 1 = 25€ (base), 2 = 50€, 3 = 100€, 4 = 150€, 5 = 200€
+ b_lm.dummy[0|0|0|0|0|0|0|0|0|0|0] * lm[0,2,3,4,5,6,7,8,9,10,11,1] ? LM dummy: 0 = needed if main mode == car since there is no LM when driving, 1 = walk 5m (base), 2 = walk 15m, 3 = Taxi 5m 12€, 4 = Taxi 15m 30€, 5 = Taxi 30m 50€, 6 = PT 10m 0€, 7 = PT 30m 0€, 8 = PT 50m 0€, 9 = PT 10m 3€, 10 = PT 30m 6€, 11 = PT 50m 9€
+ i_lm_mode_1[0] * lm.dummy[2] * mode.dummy[1] ? LM dummies interacted with main mode (we want to have coefficients for each level of LM separately for rail and bus as main mode). No interaction for base levels and for transfers = 0.
+ i_lm_mode_2[0] * lm.dummy[2] * mode.dummy[3]
+ i_lm_mode_3[0] * lm.dummy[3] * mode.dummy[1]
+ i_lm_mode_4[0] * lm.dummy[3] * mode.dummy[3]
+ i_lm_mode_5[0] * lm.dummy[4] * mode.dummy[1]
+ i_lm_mode_6[0] * lm.dummy[4] * mode.dummy[3]
+ i_lm_mode_7[0] * lm.dummy[5] * mode.dummy[1]
+ i_lm_mode_8[0] * lm.dummy[5] * mode.dummy[3]
+ i_lm_mode_9[0] * lm.dummy[6] * mode.dummy[1]
+ i_lm_mode_10[0] * lm.dummy[6] * mode.dummy[3]
+ i_lm_mode_11[0] * lm.dummy[7] * mode.dummy[1]
+ i_lm_mode_12[0] * lm.dummy[7] * mode.dummy[3]
+ i_lm_mode_13[0] * lm.dummy[8] * mode.dummy[1]
+ i_lm_mode_14[0] * lm.dummy[8] * mode.dummy[3]
+ i_lm_mode_15[0] * lm.dummy[9] * mode.dummy[1]
+ i_lm_mode_16[0] * lm.dummy[9] * mode.dummy[3]
+ i_lm_mode_17[0] * lm.dummy[10] * mode.dummy[1]
+ i_lm_mode_18[0] * lm.dummy[10] * mode.dummy[3]
+ i_lm_mode_19[0] * lm.dummy[11] * mode.dummy[1]
+ i_lm_mode_20[0] * lm.dummy[11] * mode.dummy[3]
+ b_busfreq.dummy[0|0] * busfreq[2,3,1] ? bus frequency: 1 = every 30m (base), 2 = every 60m, 3 = 3 times a day
+ b_busstop.dummy[0|0|0|0] * busstop[2,3,4,5,1] ? distance to bus stop: 1 = right at the hotel (base), 2 = 5min, 3 = 10min, 4 = 15min, 5 = 30min
+ b_hotel.dummy[0|0|0|0] * hotel[2,3,4,5,1] ? facilities at the hotel: 1 = nothing, 2 = E-Bikes, 3 = of (own) electric car for free, 4 = charging of (own) electric car for a market price, 5 = electric car for guests
/
U(alt2) = b_mode.dummy * mode
+ b_fm.dummy * fm
+ i_fm_mode_1 * fm.dummy[2] * mode.dummy[1]
+ i_fm_mode_2 * fm.dummy[2] * mode.dummy[3]
+ i_fm_mode_3 * fm.dummy[3] * mode.dummy[1]
+ i_fm_mode_4 * fm.dummy[3] * mode.dummy[3]
+ i_fm_mode_5 * fm.dummy[4] * mode.dummy[1]
+ i_fm_mode_6 * fm.dummy[4] * mode.dummy[3]
+ i_fm_mode_7 * fm.dummy[5] * mode.dummy[1]
+ i_fm_mode_8 * fm.dummy[5] * mode.dummy[3]
+ i_fm_mode_9 * fm.dummy[6] * mode.dummy[1]
+ i_fm_mode_10 * fm.dummy[6] * mode.dummy[3]
+ i_fm_mode_11 * fm.dummy[7] * mode.dummy[1]
+ i_fm_mode_12 * fm.dummy[7] * mode.dummy[3]
+ i_fm_mode_13 * fm.dummy[8] * mode.dummy[1]
+ i_fm_mode_14 * fm.dummy[8] * mode.dummy[3]
+ i_fm_mode_15 * fm.dummy[9] * mode.dummy[1]
+ i_fm_mode_16 * fm.dummy[9] * mode.dummy[3]
+ i_fm_mode_17 * fm.dummy[10] * mode.dummy[1]
+ i_fm_mode_18 * fm.dummy[10] * mode.dummy[3]
+ i_fm_mode_19 * fm.dummy[11] * mode.dummy[1]
+ i_fm_mode_20 * fm.dummy[11] * mode.dummy[3]
+ b_tt.dummy * tt
+ b_transfers.dummy * transfers
+ i_tra_mode_1 * transfers.dummy[2] * mode.dummy[1]
+ i_tra_mode_2 * transfers.dummy[2] * mode.dummy[3]
+ i_tra_mode_3 * transfers.dummy[3] * mode.dummy[1]
+ i_tra_mode_4 * transfers.dummy[3] * mode.dummy[3]
+ b_delay.dummy * delay
+ i_tra_del_mode_1 * transfers.dummy[2] * delay.dummy[2] * mode.dummy[1]
+ i_tra_del_mode_2 * transfers.dummy[2] * delay.dummy[2] * mode.dummy[3]
+ i_tra_del_mode_3 * transfers.dummy[2] * delay.dummy[3] * mode.dummy[1]
+ i_tra_del_mode_4 * transfers.dummy[2] * delay.dummy[3] * mode.dummy[3]
+ i_tra_del_mode_5 * transfers.dummy[2] * delay.dummy[4] * mode.dummy[1]
+ i_tra_del_mode_6 * transfers.dummy[2] * delay.dummy[4] * mode.dummy[3]
+ i_tra_del_mode_7 * transfers.dummy[3] * delay.dummy[2] * mode.dummy[1]
+ i_tra_del_mode_8 * transfers.dummy[3] * delay.dummy[2] * mode.dummy[3]
+ i_tra_del_mode_9 * transfers.dummy[3] * delay.dummy[3] * mode.dummy[1]
+ i_tra_del_mode_10 * transfers.dummy[3] * delay.dummy[3] * mode.dummy[3]
+ i_tra_del_mode_11 * transfers.dummy[3] * delay.dummy[4] * mode.dummy[1]
+ i_tra_del_mode_12 * transfers.dummy[3] * delay.dummy[4] * mode.dummy[3]
+ b_cost.dummy * cost
+ b_lm.dummy * lm
+ i_lm_mode_1 * lm.dummy[2] * mode.dummy[1]
+ i_lm_mode_2 * lm.dummy[2] * mode.dummy[3]
+ i_lm_mode_3 * lm.dummy[3] * mode.dummy[1]
+ i_lm_mode_4 * lm.dummy[3] * mode.dummy[3]
+ i_lm_mode_5 * lm.dummy[4] * mode.dummy[1]
+ i_lm_mode_6 * lm.dummy[4] * mode.dummy[3]
+ i_lm_mode_7 * lm.dummy[5] * mode.dummy[1]
+ i_lm_mode_8 * lm.dummy[5] * mode.dummy[3]
+ i_lm_mode_9 * lm.dummy[6] * mode.dummy[1]
+ i_lm_mode_10 * lm.dummy[6] * mode.dummy[3]
+ i_lm_mode_11 * lm.dummy[7] * mode.dummy[1]
+ i_lm_mode_12 * lm.dummy[7] * mode.dummy[3]
+ i_lm_mode_13 * lm.dummy[8] * mode.dummy[1]
+ i_lm_mode_14 * lm.dummy[8] * mode.dummy[3]
+ i_lm_mode_15 * lm.dummy[9] * mode.dummy[1]
+ i_lm_mode_16 * lm.dummy[9] * mode.dummy[3]
+ i_lm_mode_17 * lm.dummy[10] * mode.dummy[1]
+ i_lm_mode_18 * lm.dummy[10] * mode.dummy[3]
+ i_lm_mode_19 * lm.dummy[11] * mode.dummy[1]
+ i_lm_mode_20 * lm.dummy[11] * mode.dummy[3]
+ b_busfreq.dummy * busfreq
+ b_busstop.dummy * busstop
+ b_hotel.dummy * hotel
/
U(alt3) = b_mode.dummy * mode
+ b_fm.dummy * fm
+ i_fm_mode_1 * fm.dummy[2] * mode.dummy[1]
+ i_fm_mode_2 * fm.dummy[2] * mode.dummy[3]
+ i_fm_mode_3 * fm.dummy[3] * mode.dummy[1]
+ i_fm_mode_4 * fm.dummy[3] * mode.dummy[3]
+ i_fm_mode_5 * fm.dummy[4] * mode.dummy[1]
+ i_fm_mode_6 * fm.dummy[4] * mode.dummy[3]
+ i_fm_mode_7 * fm.dummy[5] * mode.dummy[1]
+ i_fm_mode_8 * fm.dummy[5] * mode.dummy[3]
+ i_fm_mode_9 * fm.dummy[6] * mode.dummy[1]
+ i_fm_mode_10 * fm.dummy[6] * mode.dummy[3]
+ i_fm_mode_11 * fm.dummy[7] * mode.dummy[1]
+ i_fm_mode_12 * fm.dummy[7] * mode.dummy[3]
+ i_fm_mode_13 * fm.dummy[8] * mode.dummy[1]
+ i_fm_mode_14 * fm.dummy[8] * mode.dummy[3]
+ i_fm_mode_15 * fm.dummy[9] * mode.dummy[1]
+ i_fm_mode_16 * fm.dummy[9] * mode.dummy[3]
+ i_fm_mode_17 * fm.dummy[10] * mode.dummy[1]
+ i_fm_mode_18 * fm.dummy[10] * mode.dummy[3]
+ i_fm_mode_19 * fm.dummy[11] * mode.dummy[1]
+ i_fm_mode_20 * fm.dummy[11] * mode.dummy[3]
+ b_tt.dummy * tt
+ b_transfers.dummy * transfers
+ i_tra_mode_1 * transfers.dummy[2] * mode.dummy[1]
+ i_tra_mode_2 * transfers.dummy[2] * mode.dummy[3]
+ i_tra_mode_3 * transfers.dummy[3] * mode.dummy[1]
+ i_tra_mode_4 * transfers.dummy[3] * mode.dummy[3]
+ b_delay.dummy * delay
+ i_tra_del_mode_1 * transfers.dummy[2] * delay.dummy[2] * mode.dummy[1]
+ i_tra_del_mode_2 * transfers.dummy[2] * delay.dummy[2] * mode.dummy[3]
+ i_tra_del_mode_3 * transfers.dummy[2] * delay.dummy[3] * mode.dummy[1]
+ i_tra_del_mode_4 * transfers.dummy[2] * delay.dummy[3] * mode.dummy[3]
+ i_tra_del_mode_5 * transfers.dummy[2] * delay.dummy[4] * mode.dummy[1]
+ i_tra_del_mode_6 * transfers.dummy[2] * delay.dummy[4] * mode.dummy[3]
+ i_tra_del_mode_7 * transfers.dummy[3] * delay.dummy[2] * mode.dummy[1]
+ i_tra_del_mode_8 * transfers.dummy[3] * delay.dummy[2] * mode.dummy[3]
+ i_tra_del_mode_9 * transfers.dummy[3] * delay.dummy[3] * mode.dummy[1]
+ i_tra_del_mode_10 * transfers.dummy[3] * delay.dummy[3] * mode.dummy[3]
+ i_tra_del_mode_11 * transfers.dummy[3] * delay.dummy[4] * mode.dummy[1]
+ i_tra_del_mode_12 * transfers.dummy[3] * delay.dummy[4] * mode.dummy[3]
+ b_cost.dummy * cost
+ b_lm.dummy * lm
+ i_lm_mode_1 * lm.dummy[2] * mode.dummy[1]
+ i_lm_mode_2 * lm.dummy[2] * mode.dummy[3]
+ i_lm_mode_3 * lm.dummy[3] * mode.dummy[1]
+ i_lm_mode_4 * lm.dummy[3] * mode.dummy[3]
+ i_lm_mode_5 * lm.dummy[4] * mode.dummy[1]
+ i_lm_mode_6 * lm.dummy[4] * mode.dummy[3]
+ i_lm_mode_7 * lm.dummy[5] * mode.dummy[1]
+ i_lm_mode_8 * lm.dummy[5] * mode.dummy[3]
+ i_lm_mode_9 * lm.dummy[6] * mode.dummy[1]
+ i_lm_mode_10 * lm.dummy[6] * mode.dummy[3]
+ i_lm_mode_11 * lm.dummy[7] * mode.dummy[1]
+ i_lm_mode_12 * lm.dummy[7] * mode.dummy[3]
+ i_lm_mode_13 * lm.dummy[8] * mode.dummy[1]
+ i_lm_mode_14 * lm.dummy[8] * mode.dummy[3]
+ i_lm_mode_15 * lm.dummy[9] * mode.dummy[1]
+ i_lm_mode_16 * lm.dummy[9] * mode.dummy[3]
+ i_lm_mode_17 * lm.dummy[10] * mode.dummy[1]
+ i_lm_mode_18 * lm.dummy[10] * mode.dummy[3]
+ i_lm_mode_19 * lm.dummy[11] * mode.dummy[1]
+ i_lm_mode_20 * lm.dummy[11] * mode.dummy[3]
+ b_busfreq.dummy * busfreq
+ b_busstop.dummy * busstop
+ b_hotel.dummy * hotel
$

Thanks again for your time
Bartosz

by **Michiel Bliemer** » Fri Mar 28, 2025 10:02 pm

Regarding point (2): no, each row does not need to be selected an equal number of times. Well-known choice modellers use random designs without any issues, and revealed preference data has no such balanced structure either.
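As an illustration of random selection instead of blocking, the minimal Python sketch below draws 10 of 100 rows for each of 100 respondents, without replacement within a respondent, and counts how often each row is shown. The function name, the seed, and the sizes are assumptions taken from the example numbers in this thread, not from Ngene or Sawtooth.

```python
import random
from collections import Counter

def draw_tasks(n_rows, n_respondents, tasks_per_respondent, seed=0):
    """Randomly pick distinct design rows for each respondent."""
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(1, n_rows + 1), tasks_per_respondent))
        for _ in range(n_respondents)
    ]

assignments = draw_tasks(n_rows=100, n_respondents=100, tasks_per_respondent=10)

# How many times each design row was shown across all respondents.
exposure = Counter(row for tasks in assignments for row in tasks)
print(min(exposure.values()), max(exposure.values()))
```

The exposure counts will generally be unequal across rows, which is the situation described above as unproblematic.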

I looked at your script and I found a few things:
(1) The AVC matrix clearly indicates that the model is not identified (very large values).
(2) When I generate a new design without the conditional constraints, it runs fine. When I activate any of the constraints, the model becomes unidentified and has an infinite D-error.
(3) When I evaluate the design in the CSV file, the same conditional constraints have been applied in the design and therefore the model again becomes unidentified.

On inspection of your design, I found that level 0 for lm, fm, and transfers ONLY appears together with mode=2. In other words, you have created multicollinearity by perfectly correlating the dummy coefficient for mode=2 with the dummy coefficients for lm=0, fm=0, and transfers=0. Such a model is not identified and the D-error is infinite. The model would be identifiable if lm=0, fm=0, and transfers=0 also appeared when mode<>2, but that does not seem to be the case.
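A check of this kind can be run on the design file before evaluating it. The Python sketch below uses a few made-up rows with hypothetical column names `mode` and `fm`; the real CSV has one such column per alternative, so the check would need to be applied to each alternative in turn.

```python
def perfectly_confounded(rows, attr, level, other, other_level):
    """True if `attr == level` occurs in exactly the same rows as
    `other == other_level`, i.e. the two dummies are identical columns."""
    return all((r[attr] == level) == (r[other] == other_level) for r in rows)

# Toy profiles: fm = 0 appears only, and always, when mode = 2 (car).
rows = [
    {"mode": 2, "fm": 0},
    {"mode": 1, "fm": 3},
    {"mode": 3, "fm": 1},
    {"mode": 2, "fm": 0},
]
print(perfectly_confounded(rows, "fm", 0, "mode", 2))

# Adding a profile where fm = 0 occurs with another mode breaks the confound.
rows_mixed = rows + [{"mode": 1, "fm": 0}]
print(perfectly_confounded(rows_mixed, "fm", 0, "mode", 2))
```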

The solution is that you need to specify an identifiable model. You will not be able to estimate the model that you have formulated. The design is probably fine, you just need to think a bit more about the utility functions.

I would proceed as follows:
1. Initially remove all your interaction effects, as it is very difficult to assess otherwise.
2. Remove level 0 for lm, fm, and transfers.
3. If mode=2, then the attributes lm, fm, and transfers should 'disappear' from the utility function. This is achieved by multiplying these attributes with an indicator that equals 1 if mode<>2 and 0 if mode=2.

For step 3, in Apollo or Biogeme you could simply write V = ... + (mode <> 2) * (lm2 * (lm==2) + lm3 * (lm==3) + ...). In Ngene, this is a bit trickier but possible as follows:
- Create a new attribute in the dataset called is_pt, where is_pt = 1 if the mode is public transport, and 0 otherwise.
- Replace all levels 0 for fm, lm, and transfers with any other level, for example 1. Which level does not matter because you are multiplying with is_pt=0 anyway if mode=2.
- Create interaction effects in the utility functions between is_pt and the dummy coded levels of fm, lm, and transfers.
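The first two data steps above can be sketched in Python as follows. This is a minimal illustration on a single profile stored as a dictionary; the column names and the helper `recode_row` are hypothetical, and the real CSV has one set of these columns per alternative.

```python
def recode_row(row, car_mode=2, placeholder=1, attrs=("fm", "lm", "transfers")):
    """Add the is_pt indicator and replace level 0 with a placeholder level.

    The placeholder value does not matter for car profiles, because the
    corresponding dummies are multiplied by is_pt = 0 in the utility function.
    """
    out = dict(row)
    out["is_pt"] = 0 if row["mode"] == car_mode else 1
    for attr in attrs:
        if out[attr] == 0:
            out[attr] = placeholder
    return out

car = recode_row({"mode": 2, "fm": 0, "lm": 0, "transfers": 0})
rail = recode_row({"mode": 1, "fm": 4, "lm": 7, "transfers": 3})
print(car)
print(rail)
```

Profiles for rail and bus keep their original levels and only gain is_pt = 1.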

I have done the above, and the script below tells me that the design has a D-error of 0.002924.
Note that I am first specifying an auxiliary model (aux) to define all the attributes, and then in the model of interest (main) I specify the interaction with is_pt. I could not do that directly because I cannot introduce dummy variables that only appear in an interaction.
After you have done this, you can add interaction effects one by one, making sure that the model remains identifiable.

Code:
Design
;alts(aux) = alt1*, alt2*, alt3*, none
;alts(main) = alt1*, alt2*, alt3*, none
;rows = 3600 ? 3600 rows in external design
? ;block = 300 ? 300 blocks in external design (ignore for now)
;eff = main(mnl,d)
;eval = balanced_overlap_design_for_ngene.csv
;model(aux):
U(alt1) = b_mode.dummy[0|0] * mode[1,3,2] ? main mode dummy: 1 = rail, 2 = car (base), 3 = bus
+ b_ind * is_pt[0,1]
+ b_fm.dummy[0|0|0|0|0|0|0|0|0|0] * fm[2,3,4,5,6,7,8,9,10,11,1] ? First Mile dummy: 0 = needed if main mode == car since there is no FM when driving, 1 = walk 5m (base), 2 = walk 15m, 3 = Taxi 5m 12€, 4 = Taxi 15m 30€, 5 = Taxi 30m 50€, 6 = PT 10m 0€, 7 = PT 30m 0€, 8 = PT 50m 0€, 9 = PT 10m 3€, 10 = PT 30m 6€, 11 = PT 50m 9€
+ b_tt.dummy[0|0] * tt[2,3,1] ? time dummy: 1 = 7h (base), 2 = 8h, 3 = 9h
+ b_transfers.dummy[0|0] * transfers[2,3,1] ? transfers dummy: 0 = needed if main mode == car, since there are no transfers when driving, 1 = 0 transfers, 2 = 1 transfer, 3 = 2 transfers
+ b_delay.dummy[0|0|0] * delay[2,3,4,1] ? delay dummy: 1 = on time (base), 2 = 20% 1h, 3 = 20% 2h, 4 = 20% 3h
+ b_cost.dummy[0|0|0|0] * cost[2,3,4,5,1] ? cost dummy: 1 = 25€ (base), 2 = 50€, 3 = 100€, 4 = 150€, 5 = 200€
+ b_lm.dummy[0|0|0|0|0|0|0|0|0|0] * lm[2,3,4,5,6,7,8,9,10,11,1] ? LM dummy: 0 = needed if main mode == car since there is no LM when driving, 1 = walk 5m (base), 2 = walk 15m, 3 = Taxi 5m 12€, 4 = Taxi 15m 30€, 5 = Taxi 30m 50€, 6 = PT 10m 0€, 7 = PT 30m 0€, 8 = PT 50m 0€, 9 = PT 10m 3€, 10 = PT 30m 6€, 11 = PT 50m 9€
+ b_busfreq.dummy[0|0] * busfreq[2,3,1] ? bus frequency: 1 = every 30m (base), 2 = every 60m, 3 = 3 times a day
+ b_busstop.dummy[0|0|0|0] * busstop[2,3,4,5,1] ? distance to bus stop: 1 = right at the hotel (base), 2 = 5min, 3 = 10min, 4 = 15min, 5 = 30min
+ b_hotel.dummy[0|0|0|0] * hotel[2,3,4,5,1] ? facilities at the hotel: 1 = nothing, 2 = E-Bikes, 3 = of (own) electric car for free, 4 = charging of (own) electric car for a market price, 5 = electric car for guests
/
U(alt2) = b_mode.dummy * mode
+ b_ind * is_pt[0,1]
+ b_fm.dummy * fm
+ b_tt.dummy * tt
+ b_transfers.dummy * transfers
+ b_delay.dummy * delay
+ b_cost.dummy * cost
+ b_lm.dummy * lm
+ b_busfreq.dummy * busfreq
+ b_busstop.dummy * busstop
+ b_hotel.dummy * hotel
/
U(alt3) = b_mode.dummy * mode
+ b_ind * is_pt[0,1]
+ b_fm.dummy * fm
+ b_tt.dummy * tt
+ b_transfers.dummy * transfers
+ b_delay.dummy * delay
+ b_cost.dummy * cost
+ b_lm.dummy * lm
+ b_busfreq.dummy * busfreq
+ b_busstop.dummy * busstop
+ b_hotel.dummy * hotel
;model(main):
U(alt1) = b_mode.dummy[0|0] * mode
+ b_fm2 * is_pt * fm.dummy[2]
+ b_fm3 * is_pt * fm.dummy[3]
+ b_fm3 * is_pt * fm.dummy[4]
+ b_fm3 * is_pt * fm.dummy[5]
+ b_fm3 * is_pt * fm.dummy[6]
+ b_fm3 * is_pt * fm.dummy[7]
+ b_fm3 * is_pt * fm.dummy[8]
+ b_fm3 * is_pt * fm.dummy[9]
+ b_fm3 * is_pt * fm.dummy[10]
+ b_fm3 * is_pt * fm.dummy[11]
+ b_tt.dummy[0|0] * tt
+ b_transfers1 * is_pt * transfers.dummy[2]
+ b_transfers1 * is_pt * transfers.dummy[3]
+ b_delay.dummy[0|0|0] * delay
+ b_cost.dummy[0|0|0|0] * cost
+ b_lm2 * is_pt * lm.dummy[2]
+ b_lm3 * is_pt * lm.dummy[3]
+ b_lm3 * is_pt * lm.dummy[4]
+ b_lm3 * is_pt * lm.dummy[5]
+ b_lm3 * is_pt * lm.dummy[6]
+ b_lm3 * is_pt * lm.dummy[7]
+ b_lm3 * is_pt * lm.dummy[8]
+ b_lm3 * is_pt * lm.dummy[9]
+ b_lm3 * is_pt * lm.dummy[10]
+ b_lm3 * is_pt * lm.dummy[11]
+ b_busfreq.dummy[0|0] * busfreq
+ b_busstop.dummy[0|0|0|0] * busstop
+ b_hotel.dummy[0|0|0|0] * hotel
/
U(alt2) = b_mode.dummy[0|0] * mode
+ b_fm2 * is_pt * fm.dummy[2]
+ b_fm3 * is_pt * fm.dummy[3]
+ b_fm3 * is_pt * fm.dummy[4]
+ b_fm3 * is_pt * fm.dummy[5]
+ b_fm3 * is_pt * fm.dummy[6]
+ b_fm3 * is_pt * fm.dummy[7]
+ b_fm3 * is_pt * fm.dummy[8]
+ b_fm3 * is_pt * fm.dummy[9]
+ b_fm3 * is_pt * fm.dummy[10]
+ b_fm3 * is_pt * fm.dummy[11]
+ b_tt.dummy[0|0] * tt
+ b_transfers1 * is_pt * transfers.dummy[2]
+ b_transfers1 * is_pt * transfers.dummy[3]
+ b_delay.dummy[0|0|0] * delay
+ b_cost.dummy[0|0|0|0] * cost
+ b_lm2 * is_pt * lm.dummy[2]
+ b_lm3 * is_pt * lm.dummy[3]
+ b_lm3 * is_pt * lm.dummy[4]
+ b_lm3 * is_pt * lm.dummy[5]
+ b_lm3 * is_pt * lm.dummy[6]
+ b_lm3 * is_pt * lm.dummy[7]
+ b_lm3 * is_pt * lm.dummy[8]
+ b_lm3 * is_pt * lm.dummy[9]
+ b_lm3 * is_pt * lm.dummy[10]
+ b_lm3 * is_pt * lm.dummy[11]
+ b_busfreq.dummy[0|0] * busfreq
+ b_busstop.dummy[0|0|0|0] * busstop
+ b_hotel.dummy[0|0|0|0] * hotel
/
U(alt3) = b_mode.dummy[0|0] * mode
+ b_fm2 * is_pt * fm.dummy[2]
+ b_fm3 * is_pt * fm.dummy[3]
+ b_fm3 * is_pt * fm.dummy[4]
+ b_fm3 * is_pt * fm.dummy[5]
+ b_fm3 * is_pt * fm.dummy[6]
+ b_fm3 * is_pt * fm.dummy[7]
+ b_fm3 * is_pt * fm.dummy[8]
+ b_fm3 * is_pt * fm.dummy[9]
+ b_fm3 * is_pt * fm.dummy[10]
+ b_fm3 * is_pt * fm.dummy[11]
+ b_tt.dummy[0|0] * tt
+ b_transfers1 * is_pt * transfers.dummy[2]
+ b_transfers1 * is_pt * transfers.dummy[3]
+ b_delay.dummy[0|0|0] * delay
+ b_cost.dummy[0|0|0|0] * cost
+ b_lm2 * is_pt * lm.dummy[2]
+ b_lm3 * is_pt * lm.dummy[3]
+ b_lm3 * is_pt * lm.dummy[4]
+ b_lm3 * is_pt * lm.dummy[5]
+ b_lm3 * is_pt * lm.dummy[6]
+ b_lm3 * is_pt * lm.dummy[7]
+ b_lm3 * is_pt * lm.dummy[8]
+ b_lm3 * is_pt * lm.dummy[9]
+ b_lm3 * is_pt * lm.dummy[10]
+ b_lm3 * is_pt * lm.dummy[11]
+ b_busfreq.dummy[0|0] * busfreq
+ b_busstop.dummy[0|0|0|0] * busstop
+ b_hotel.dummy[0|0|0|0] * hotel
$

I hope this helps.

Michiel

by **bartoszbursa** » Mon Mar 31, 2025 11:36 pm

Michiel,

Thanks again, I would have never come up with this trick myself :shock:

The code you pasted works for me (I get 0.00272), however, I still have some issues:

1. I noticed that in your code, in the main model, you use b_fm2 for level 2 of fm.dummy, then b_fm3 for level 3 of fm.dummy, and then you continue with b_fm3 for all remaining levels, which is a typo I guess (is it? the same holds for lm and transfers), since what is wanted here are separate coefficients for each single level. I introduced further coefficients (b_fm2, b_fm3, b_fm4, b_fm5, ..., b_fm11) and the D-error for the external design is 0.004249. If I run Ngene to optimize a design using the very same code, number of rows and the default swapping algorithm, the best I get is around 0.005852 after initial seeding. Interesting. What can be so special about Sawtooth's algorithm ("balanced overlap" described briefly here at about 3/4 of the document) that it can deliver an over 30% more efficient design?

2. Now if I wanted to play with that in Ngene and search for a better / more efficient design, what can I change to still remain comparable?
I believe that I can't compare:
- designs with different number of rows (say, 24 against 3600),
- with different utility functions (labeled vs. unlabeled with label as an attribute and interacted with other attributes to mimic alt.-spec. coefficients of a labeled design)
- with different priors

So what designs can I compare? Is it just a matter of an algorithm? Or is there any other metric that I can use to compare designs? Such as in choice models e.g. R^2 to assess the fit of different models to the data, or LR-test of two nested models.

3. Again regarding the interaction between is_pt and fm.dummy. I want the coefficients for fm to be specific for rail and bus. Now I have a common coefficient for both (since both are PT). As far as I understand, to get those, I should now interact each term "is_pt * fm.dummy[2-11]" with two levels of mode (1 = rail, and 3 = bus), which will result in 20 mode-specific parameters for First Mile. Is that correct?

Bartosz

by **Michiel Bliemer** » Sat Apr 05, 2025 8:23 am

1. Yes, that was a typo, and you fixed it. The D-error that you report for Ngene is for a RANDOM design after initialisation. A random design is clearly not yet efficient; it only serves as a starting point. Ngene optimises designs via optimisation algorithms, while I suspect that Sawtooth has several rules for constructing them without D-error evaluation or optimisation. The Sawtooth rules will be quick and easy, but will not achieve the efficiency that Ngene can achieve by optimisation. However, since the design is HUGE (I have never seen such a large design, and frankly I would never try to generate one), it is extremely computationally intensive to optimise. I can see that each evaluation takes very long in Ngene, so it would probably take days to find a truly efficient design. I would expect the design generated in Ngene after thousands of iterations to have a lower D-error than the one generated by Sawtooth. If you insist on optimising the design for 3600 rows, I would recommend optimising 6 (or 12) designs, each with 600 (or 300) rows, and then pasting these 6 (12) designs together to create the large design of 3600 rows. This would simplify the optimisation process. You can run Ngene 6 times on 6 different cores of your computer and perform the optimisations simultaneously. I would still let Ngene run for at least a day though.

2. You can play with:
- Attribute level balance (the default swapping algorithm in Ngene maintains it, the modified Federov algorithm does not).
- Design type (efficient, orthogonal, random) - I doubt that an orthogonal design exists with your design dimensions.
D-error is what LL is in model estimation. In model estimation, you cannot compare R^2 across different data sets, whereas in experimental design you cannot compare D-errors across different models. A model is defined by (i) model type (e.g., MNL), (ii) utility specification, (iii) priors. So you need to keep all of these fixed.

3. You can simply use alternative-specific coefficients, e.g. b_fm2_rail and b_fm2_bus, to multiply is_pt * fm.dummy[x]. Note that this will significantly increase the number of parameters in your model, and optimising your design will become even more computationally demanding (the covariance matrix will have something like a thousand elements that need to be calculated in each design evaluation).
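
To make the D-error from point 2 concrete, here is a minimal Python sketch (not Ngene syntax, and not Ngene's internal code) of the textbook calculation for an MNL model: build the Fisher information for one respondent from the design and the priors, then take det(AVC)^(1/K), where the AVC matrix is the inverse of the information matrix. The data layout, the function names and the toy design are assumptions made only for illustration.

```python
import math


def mnl_information(design, beta):
    """Fisher information of an MNL model for one respondent.

    design: list of choice tasks; each task is a list of alternatives;
            each alternative is a list of K coded attribute values.
    beta:   list of K prior parameter values.
    """
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for task in design:
        utils = [sum(b * x for b, x in zip(beta, alt)) for alt in task]
        top = max(utils)  # subtract the maximum for numerical stability
        expu = [math.exp(u - top) for u in utils]
        total = sum(expu)
        probs = [e / total for e in expu]
        # probability-weighted mean attribute vector within this task
        mean = [sum(p * alt[i] for p, alt in zip(probs, task)) for i in range(k)]
        for p, alt in zip(probs, task):
            dev = [alt[i] - mean[i] for i in range(k)]
            for i in range(k):
                for j in range(k):
                    info[i][j] += p * dev[i] * dev[j]
    return info


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:  # absolute tolerance, fine for a sketch
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def d_error(design, beta):
    """det(AVC)^(1/K), computed as det(information)^(-1/K)."""
    det_info = determinant(mnl_information(design, beta))
    if det_info <= 0.0:
        return float("inf")  # singular information matrix: D-error undefined
    return det_info ** (-1.0 / len(beta))


# Hypothetical toy design: 4 tasks, 2 alternatives, 2 attributes coded 0/1
toy_design = [
    [[0, 0], [1, 1]],
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[1, 1], [0, 0]],
]
print(d_error(toy_design, [0.0, 0.0]))  # prints 1.0
```

Because the information matrix depends on the model type, the utility specification and the priors, the same design gives a different D-error as soon as any of these change, which is why they have to be held fixed in a comparison. The `inf` branch corresponds to the case where a parameter cannot be identified from the design, for example when an attribute never differs between alternatives.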

Michiel

by **Michiel Bliemer** » Sat Apr 05, 2025 9:27 am

I also had a quick look at the Sawtooth design. I notice that mode=2 only appears ~500 times across the 3600 rows for each alternative, while mode=1 and mode=3 each appear ~1500 times (3x as often). So, public transport appears ~3000 times while car appears only ~500 times. This is of course more efficient, because it yields more observations on the attribute levels that only appear with public transport, but such design imbalance may not be desirable. The default swapping algorithm in Ngene maintains attribute level balance, so it would insist on each mode appearing exactly 1200 times across the 3600 rows (at a higher D-error). The modified Federov algorithm relaxes attribute level balance and would be able to find more efficient designs. This is unfortunately an even slower algorithm to run, especially for such large designs.
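
A level-frequency check like the one described above can be sketched in a few lines of Python. The row layout (one dict per choice task, with keys that mimic Ngene's alt.attribute naming) and the toy numbers are assumptions for illustration; they are not the Sawtooth design.

```python
from collections import Counter


def level_counts(rows, column):
    """Count how often each level occurs in one design column."""
    return Counter(row[column] for row in rows)


def is_balanced(counts, levels):
    """True if every listed level occurs equally often."""
    return len({counts.get(level, 0) for level in levels}) == 1


# Hypothetical toy design with a mode column for two alternatives
rows = [
    {"alt1.mode": 1, "alt2.mode": 3},
    {"alt1.mode": 3, "alt2.mode": 1},
    {"alt1.mode": 1, "alt2.mode": 2},
    {"alt1.mode": 3, "alt2.mode": 1},
    {"alt1.mode": 2, "alt2.mode": 3},
    {"alt1.mode": 1, "alt2.mode": 3},
]

counts = level_counts(rows, "alt1.mode")
print(dict(counts))                    # how often each mode level appears
print(is_balanced(counts, (1, 2, 3)))  # False: car (2) is under-represented

# Under full attribute level balance, each of the 3 mode levels
# would appear 3600 / 3 = 1200 times per alternative.
balanced_target = 3600 // 3
```

Running the same count on every column of an external design shows quickly whether it satisfies the balance that the swapping algorithm enforces.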

Michiel

by **bartoszbursa** » Tue Apr 08, 2025 5:51 pm

Thank you so much for your help, Michiel.

I modified the code so that the term "is_pt.dummy[x] * fm.dummy[x]" is now multiplied by "mode.dummy[x]" and I have alternative-specific coefficients for fm and lm.

Since I want to create a new design now, not evaluate an external one, I added conditions so that is_pt = 1 if mode = [rail,bus], and 0 otherwise.

The thing is that Ngene keeps on crashing after several evaluations. It produces a valid design, the AVC looks good, no NaN or huge numbers. It just crashes each time after a couple of minutes (lasts longer with swapping algorithm, shorter with RSC) with a message: "Something went unexpectedly wrong. You may wish to email ChoiceMetrics for assistance." I use 660 rows as it gives me complete attribute level balance. And I use 55 blocks so as to have 12 tasks per respondent.
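
For reference, the row count can be checked with a short Python sketch (not Ngene syntax). It assumes that full attribute level balance requires the number of rows to be divisible by the number of levels of every attribute, so the smallest candidate is the least common multiple. The level counts are read from the attribute definitions in ;model(aux) of the script in this post.

```python
from functools import reduce
from math import gcd

# Number of levels per attribute, as defined in ;model(aux)
levels = {
    "mode": 3, "is_pt": 2, "fm": 11, "tt": 3, "transfers": 3, "delay": 4,
    "cost": 5, "lm": 11, "busfreq": 3, "busstop": 5, "hotel": 5,
}


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


min_rows = reduce(lcm, levels.values())
tasks_per_block = min_rows // 55

print(min_rows)         # 660
print(tasks_per_block)  # 12
print(3600 % min_rows)  # 300, so 3600 is not a multiple of 660
```

The last line shows that 3600 rows cannot be perfectly balanced for the 11-level attributes fm and lm, since 3600 is not divisible by 11.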

What can be wrong? See the code below:

Code: Select all: Design ;alts(aux) = alt1*, alt2*, alt3*, none ;alts(main) = alt1*, alt2*, alt3*, none ;rows = 660 ;block = 55 ;eff = main(mnl,d) ;alg = swap ;cond: if(alt1.mode = [1,3], alt1.is_pt = 1), if(alt2.mode = [1,3], alt2.is_pt = 1), if(alt3.mode = [1,3], alt3.is_pt = 1), if(alt1.mode = 2, alt1.is_pt = 0), if(alt2.mode = 2, alt2.is_pt = 0), if(alt3.mode = 2, alt3.is_pt = 0) ;model(aux): U(alt1) = b_mode.dummy[0|0] * mode[1,3,2] ? main mode dummy: 1 = rail, 2 = car (base), 3 = bus + b_ind[0] * is_pt[0,1] ? PT indicator + b_fm.dummy[0|0|0|0|0|0|0|0|0|0] * fm[2,3,4,5,6,7,8,9,10,11,1] ? First Mile dummy: 1 = walk 5m (base), 2 = walk 15m, 3 = Taxi 5m 12€, 4 = Taxi 15m 30€, 5 = Taxi 30m 50€, 6 = PT 10m 0€, 7 = PT 30m 0€, 8 = PT 50m 0€, 9 = PT 10m 3€, 10 = PT 30m 6€, 11 = PT 50m 9€ + b_tt.dummy[0|0] * tt[2,3,1] ? time dummy: 1 = 7h (base), 2 = 8h, 3 = 9h + b_transfers.dummy[0|0] * transfers[2,3,1] ? transfers dummy: 1 = 0 transfers, 2 = 1 transfer, 3 = 2 transfers + b_delay.dummy[0|0|0] * delay[2,3,4,1] ? delay dummy: 1 = on time (base), 2 = 20% 1h, 3 = 20% 2h, 4 = 20% 3h + b_cost.dummy[0|0|0|0] * cost[2,3,4,5,1] ? cost dummy: 1 = 25€ (base), 2 = 50€, 3 = 100€, 4 = 150€, 5 = 200€ + b_lm.dummy[0|0|0|0|0|0|0|0|0|0] * lm[2,3,4,5,6,7,8,9,10,11,1] ? LM dummy: 1 = walk 5m (base), 2 = walk 15m, 3 = Taxi 5m 12€, 4 = Taxi 15m 30€, 5 = Taxi 30m 50€, 6 = PT 10m 0€, 7 = PT 30m 0€, 8 = PT 50m 0€, 9 = PT 10m 3€, 10 = PT 30m 6€, 11 = PT 50m 9€ + b_busfreq.dummy[0|0] * busfreq[2,3,1] ? bus frequency: 1 = every 30m (base), 2 = every 60m, 3 = 3 times a day + b_busstop.dummy[0|0|0|0] * busstop[2,3,4,5,1] ? distance to bus stop: 1 = right at the hotel (base), 2 = 5min, 3 = 10min, 4 = 15min, 5 = 30min + b_hotel.dummy[0|0|0|0] * hotel[2,3,4,5,1] ? 
facilities at the hotel: 1 = nothing, 2 = E-Bikes, 3 = of (own) electric car for free, 4 = charging of (own) electric car for a market price, 5 = electric car for guests / U(alt2) = b_mode.dummy * mode + b_ind * is_pt + b_fm.dummy * fm + b_tt.dummy * tt + b_transfers.dummy * transfers + b_delay.dummy * delay + b_cost.dummy * cost + b_lm.dummy * lm + b_busfreq.dummy * busfreq + b_busstop.dummy * busstop + b_hotel.dummy * hotel / U(alt3) = b_mode.dummy * mode + b_ind * is_pt + b_fm.dummy * fm + b_tt.dummy * tt + b_transfers.dummy * transfers + b_delay.dummy * delay + b_cost.dummy * cost + b_lm.dummy * lm + b_busfreq.dummy * busfreq + b_busstop.dummy * busstop + b_hotel.dummy * hotel ;model(main): U(alt1) = b_mode.dummy[0|0] * mode + b_fm2_rail[0] * is_pt.dummy[1] * fm.dummy[2] * mode.dummy[1] + b_fm2_bus[0] * is_pt.dummy[1] * fm.dummy[2] * mode.dummy[3] + b_fm3_rail[0] * is_pt.dummy[1] * fm.dummy[3] * mode.dummy[1] + b_fm3_bus[0] * is_pt.dummy[1] * fm.dummy[3] * mode.dummy[3] + b_fm4_rail[0] * is_pt.dummy[1] * fm.dummy[4] * mode.dummy[1] + b_fm4_bus[0] * is_pt.dummy[1] * fm.dummy[4] * mode.dummy[3] + b_fm5_rail[0] * is_pt.dummy[1] * fm.dummy[5] * mode.dummy[1] + b_fm5_bus[0] * is_pt.dummy[1] * fm.dummy[5] * mode.dummy[3] + b_fm6_rail[0] * is_pt.dummy[1] * fm.dummy[6] * mode.dummy[1] + b_fm6_bus[0] * is_pt.dummy[1] * fm.dummy[6] * mode.dummy[3] + b_fm7_rail[0] * is_pt.dummy[1] * fm.dummy[7] * mode.dummy[1] + b_fm7_bus[0] * is_pt.dummy[1] * fm.dummy[7] * mode.dummy[3] + b_fm8_rail[0] * is_pt.dummy[1] * fm.dummy[8] * mode.dummy[1] + b_fm8_bus[0] * is_pt.dummy[1] * fm.dummy[8] * mode.dummy[3] + b_fm9_rail[0] * is_pt.dummy[1] * fm.dummy[9] * mode.dummy[1] + b_fm9_bus[0] * is_pt.dummy[1] * fm.dummy[9] * mode.dummy[3] + b_fm10_rail[0] * is_pt.dummy[1] * fm.dummy[10] * mode.dummy[1] + b_fm10_bus[0] * is_pt.dummy[1] * fm.dummy[10] * mode.dummy[3] + b_fm11_rail[0] * is_pt.dummy[1] * fm.dummy[11] * mode.dummy[1] + b_fm11_bus[0] * is_pt.dummy[1] * fm.dummy[11] * mode.dummy[3] + 
b_tt.dummy[0|0] * tt + b_transfers2_rail[0] * is_pt.dummy[1] * transfers.dummy[2] * mode.dummy[1] + b_transfers2_bus[0] * is_pt.dummy[1] * transfers.dummy[2] * mode.dummy[3] + b_transfers3_rail[0] * is_pt.dummy[1] * transfers.dummy[3] * mode.dummy[1] + b_transfers3_bus[0] * is_pt.dummy[1] * transfers.dummy[3] * mode.dummy[3] + b_delay.dummy[0|0|0] * delay + b_transfers2_delay2_rail[0] * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[2] * mode.dummy[1] + b_transfers2_delay2_bus[0] * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[2] * mode.dummy[3] + b_transfers2_delay3_rail[0] * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[3] * mode.dummy[1] + b_transfers2_delay3_bus[0] * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[3] * mode.dummy[3] + b_transfers2_delay4_rail[0] * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[4] * mode.dummy[1] + b_transfers2_delay4_bus[0] * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[4] * mode.dummy[3] + b_transfers3_delay2_rail[0] * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[2] * mode.dummy[1] + b_transfers3_delay2_bus[0] * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[2] * mode.dummy[3] + b_transfers3_delay3_rail[0] * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[3] * mode.dummy[1] + b_transfers3_delay3_bus[0] * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[3] * mode.dummy[3] + b_transfers3_delay4_rail[0] * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[4] * mode.dummy[1] + b_transfers3_delay4_bus[0] * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[4] * mode.dummy[3] + b_cost.dummy[0|0|0|0] * cost + b_lm2_rail[0] * is_pt.dummy[1] * lm.dummy[2] * mode.dummy[1] + b_lm2_bus[0] * is_pt.dummy[1] * lm.dummy[2] * mode.dummy[3] + b_lm3_rail[0] * is_pt.dummy[1] * lm.dummy[3] * mode.dummy[1] + b_lm3_bus[0] * is_pt.dummy[1] * lm.dummy[3] * mode.dummy[3] + b_lm4_rail[0] * is_pt.dummy[1] * lm.dummy[4] * mode.dummy[1] + b_lm4_bus[0] * is_pt.dummy[1] * lm.dummy[4] * mode.dummy[3] + b_lm5_rail[0] * 
is_pt.dummy[1] * lm.dummy[5] * mode.dummy[1] + b_lm5_bus[0] * is_pt.dummy[1] * lm.dummy[5] * mode.dummy[3] + b_lm6_rail[0] * is_pt.dummy[1] * lm.dummy[6] * mode.dummy[1] + b_lm6_bus[0] * is_pt.dummy[1] * lm.dummy[6] * mode.dummy[3] + b_lm7_rail[0] * is_pt.dummy[1] * lm.dummy[7] * mode.dummy[1] + b_lm7_bus[0] * is_pt.dummy[1] * lm.dummy[7] * mode.dummy[3] + b_lm8_rail[0] * is_pt.dummy[1] * lm.dummy[8] * mode.dummy[1] + b_lm8_bus[0] * is_pt.dummy[1] * lm.dummy[8] * mode.dummy[3] + b_lm9_rail[0] * is_pt.dummy[1] * lm.dummy[9] * mode.dummy[1] + b_lm9_bus[0] * is_pt.dummy[1] * lm.dummy[9] * mode.dummy[3] + b_lm10_rail[0] * is_pt.dummy[1] * lm.dummy[10] * mode.dummy[1] + b_lm10_bus[0] * is_pt.dummy[1] * lm.dummy[10] * mode.dummy[3] + b_lm11_rail[0] * is_pt.dummy[1] * lm.dummy[11] * mode.dummy[1] + b_lm11_bus[0] * is_pt.dummy[1] * lm.dummy[11] * mode.dummy[3] + b_busfreq.dummy[0|0] * busfreq + b_busstop.dummy[0|0|0|0] * busstop + b_hotel.dummy[0|0|0|0] * hotel / U(alt2) = b_mode.dummy * mode + b_fm2_rail * is_pt.dummy[1] * fm.dummy[2] * mode.dummy[1] + b_fm2_bus * is_pt.dummy[1] * fm.dummy[2] * mode.dummy[3] + b_fm3_rail * is_pt.dummy[1] * fm.dummy[3] * mode.dummy[1] + b_fm3_bus * is_pt.dummy[1] * fm.dummy[3] * mode.dummy[3] + b_fm4_rail * is_pt.dummy[1] * fm.dummy[4] * mode.dummy[1] + b_fm4_bus * is_pt.dummy[1] * fm.dummy[4] * mode.dummy[3] + b_fm5_rail * is_pt.dummy[1] * fm.dummy[5] * mode.dummy[1] + b_fm5_bus * is_pt.dummy[1] * fm.dummy[5] * mode.dummy[3] + b_fm6_rail * is_pt.dummy[1] * fm.dummy[6] * mode.dummy[1] + b_fm6_bus * is_pt.dummy[1] * fm.dummy[6] * mode.dummy[3] + b_fm7_rail * is_pt.dummy[1] * fm.dummy[7] * mode.dummy[1] + b_fm7_bus * is_pt.dummy[1] * fm.dummy[7] * mode.dummy[3] + b_fm8_rail * is_pt.dummy[1] * fm.dummy[8] * mode.dummy[1] + b_fm8_bus * is_pt.dummy[1] * fm.dummy[8] * mode.dummy[3] + b_fm9_rail * is_pt.dummy[1] * fm.dummy[9] * mode.dummy[1] + b_fm9_bus * is_pt.dummy[1] * fm.dummy[9] * mode.dummy[3] + b_fm10_rail * is_pt.dummy[1] * fm.dummy[10] * 
mode.dummy[1] + b_fm10_bus * is_pt.dummy[1] * fm.dummy[10] * mode.dummy[3] + b_fm11_rail * is_pt.dummy[1] * fm.dummy[11] * mode.dummy[1] + b_fm11_bus * is_pt.dummy[1] * fm.dummy[11] * mode.dummy[3] + b_tt.dummy * tt + b_transfers2_rail * is_pt.dummy[1] * transfers.dummy[2] * mode.dummy[1] + b_transfers2_bus * is_pt.dummy[1] * transfers.dummy[2] * mode.dummy[3] + b_transfers3_rail * is_pt.dummy[1] * transfers.dummy[3] * mode.dummy[1] + b_transfers3_bus * is_pt.dummy[1] * transfers.dummy[3] * mode.dummy[3] + b_delay.dummy * delay + b_transfers2_delay2_rail * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[2] * mode.dummy[1] + b_transfers2_delay2_bus * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[2] * mode.dummy[3] + b_transfers2_delay3_rail * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[3] * mode.dummy[1] + b_transfers2_delay3_bus * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[3] * mode.dummy[3] + b_transfers2_delay4_rail * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[4] * mode.dummy[1] + b_transfers2_delay4_bus * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[4] * mode.dummy[3] + b_transfers3_delay2_rail * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[2] * mode.dummy[1] + b_transfers3_delay2_bus * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[2] * mode.dummy[3] + b_transfers3_delay3_rail * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[3] * mode.dummy[1] + b_transfers3_delay3_bus * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[3] * mode.dummy[3] + b_transfers3_delay4_rail * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[4] * mode.dummy[1] + b_transfers3_delay4_bus * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[4] * mode.dummy[3] + b_cost.dummy * cost + b_lm2_rail * is_pt.dummy[1] * lm.dummy[2] * mode.dummy[1] + b_lm2_bus * is_pt.dummy[1] * lm.dummy[2] * mode.dummy[3] + b_lm3_rail * is_pt.dummy[1] * lm.dummy[3] * mode.dummy[1] + b_lm3_bus * is_pt.dummy[1] * lm.dummy[3] * mode.dummy[3] + b_lm4_rail * is_pt.dummy[1] * 
lm.dummy[4] * mode.dummy[1] + b_lm4_bus * is_pt.dummy[1] * lm.dummy[4] * mode.dummy[3] + b_lm5_rail * is_pt.dummy[1] * lm.dummy[5] * mode.dummy[1] + b_lm5_bus * is_pt.dummy[1] * lm.dummy[5] * mode.dummy[3] + b_lm6_rail * is_pt.dummy[1] * lm.dummy[6] * mode.dummy[1] + b_lm6_bus * is_pt.dummy[1] * lm.dummy[6] * mode.dummy[3] + b_lm7_rail * is_pt.dummy[1] * lm.dummy[7] * mode.dummy[1] + b_lm7_bus * is_pt.dummy[1] * lm.dummy[7] * mode.dummy[3] + b_lm8_rail * is_pt.dummy[1] * lm.dummy[8] * mode.dummy[1] + b_lm8_bus * is_pt.dummy[1] * lm.dummy[8] * mode.dummy[3] + b_lm9_rail * is_pt.dummy[1] * lm.dummy[9] * mode.dummy[1] + b_lm9_bus * is_pt.dummy[1] * lm.dummy[9] * mode.dummy[3] + b_lm10_rail * is_pt.dummy[1] * lm.dummy[10] * mode.dummy[1] + b_lm10_bus * is_pt.dummy[1] * lm.dummy[10] * mode.dummy[3] + b_lm11_rail * is_pt.dummy[1] * lm.dummy[11] * mode.dummy[1] + b_lm11_bus * is_pt.dummy[1] * lm.dummy[11] * mode.dummy[3] + b_busfreq.dummy * busfreq + b_busstop.dummy * busstop + b_hotel.dummy * hotel / U(alt3) = b_mode.dummy * mode + b_fm2_rail * is_pt.dummy[1] * fm.dummy[2] * mode.dummy[1] + b_fm2_bus * is_pt.dummy[1] * fm.dummy[2] * mode.dummy[3] + b_fm3_rail * is_pt.dummy[1] * fm.dummy[3] * mode.dummy[1] + b_fm3_bus * is_pt.dummy[1] * fm.dummy[3] * mode.dummy[3] + b_fm4_rail * is_pt.dummy[1] * fm.dummy[4] * mode.dummy[1] + b_fm4_bus * is_pt.dummy[1] * fm.dummy[4] * mode.dummy[3] + b_fm5_rail * is_pt.dummy[1] * fm.dummy[5] * mode.dummy[1] + b_fm5_bus * is_pt.dummy[1] * fm.dummy[5] * mode.dummy[3] + b_fm6_rail * is_pt.dummy[1] * fm.dummy[6] * mode.dummy[1] + b_fm6_bus * is_pt.dummy[1] * fm.dummy[6] * mode.dummy[3] + b_fm7_rail * is_pt.dummy[1] * fm.dummy[7] * mode.dummy[1] + b_fm7_bus * is_pt.dummy[1] * fm.dummy[7] * mode.dummy[3] + b_fm8_rail * is_pt.dummy[1] * fm.dummy[8] * mode.dummy[1] + b_fm8_bus * is_pt.dummy[1] * fm.dummy[8] * mode.dummy[3] + b_fm9_rail * is_pt.dummy[1] * fm.dummy[9] * mode.dummy[1] + b_fm9_bus * is_pt.dummy[1] * fm.dummy[9] * mode.dummy[3] + 
b_fm10_rail * is_pt.dummy[1] * fm.dummy[10] * mode.dummy[1] + b_fm10_bus * is_pt.dummy[1] * fm.dummy[10] * mode.dummy[3] + b_fm11_rail * is_pt.dummy[1] * fm.dummy[11] * mode.dummy[1] + b_fm11_bus * is_pt.dummy[1] * fm.dummy[11] * mode.dummy[3] + b_tt.dummy * tt + b_transfers2_rail * is_pt.dummy[1] * transfers.dummy[2] * mode.dummy[1] + b_transfers2_bus * is_pt.dummy[1] * transfers.dummy[2] * mode.dummy[3] + b_transfers3_rail * is_pt.dummy[1] * transfers.dummy[3] * mode.dummy[1] + b_transfers3_bus * is_pt.dummy[1] * transfers.dummy[3] * mode.dummy[3] + b_delay.dummy * delay + b_transfers2_delay2_rail * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[2] * mode.dummy[1] + b_transfers2_delay2_bus * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[2] * mode.dummy[3] + b_transfers2_delay3_rail * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[3] * mode.dummy[1] + b_transfers2_delay3_bus * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[3] * mode.dummy[3] + b_transfers2_delay4_rail * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[4] * mode.dummy[1] + b_transfers2_delay4_bus * is_pt.dummy[1] * transfers.dummy[2] * delay.dummy[4] * mode.dummy[3] + b_transfers3_delay2_rail * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[2] * mode.dummy[1] + b_transfers3_delay2_bus * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[2] * mode.dummy[3] + b_transfers3_delay3_rail * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[3] * mode.dummy[1] + b_transfers3_delay3_bus * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[3] * mode.dummy[3] + b_transfers3_delay4_rail * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[4] * mode.dummy[1] + b_transfers3_delay4_bus * is_pt.dummy[1] * transfers.dummy[3] * delay.dummy[4] * mode.dummy[3] + b_cost.dummy * cost + b_lm2_rail * is_pt.dummy[1] * lm.dummy[2] * mode.dummy[1] + b_lm2_bus * is_pt.dummy[1] * lm.dummy[2] * mode.dummy[3] + b_lm3_rail * is_pt.dummy[1] * lm.dummy[3] * mode.dummy[1] + b_lm3_bus * is_pt.dummy[1] * lm.dummy[3] * 
mode.dummy[3] + b_lm4_rail * is_pt.dummy[1] * lm.dummy[4] * mode.dummy[1] + b_lm4_bus * is_pt.dummy[1] * lm.dummy[4] * mode.dummy[3] + b_lm5_rail * is_pt.dummy[1] * lm.dummy[5] * mode.dummy[1] + b_lm5_bus * is_pt.dummy[1] * lm.dummy[5] * mode.dummy[3] + b_lm6_rail * is_pt.dummy[1] * lm.dummy[6] * mode.dummy[1] + b_lm6_bus * is_pt.dummy[1] * lm.dummy[6] * mode.dummy[3] + b_lm7_rail * is_pt.dummy[1] * lm.dummy[7] * mode.dummy[1] + b_lm7_bus * is_pt.dummy[1] * lm.dummy[7] * mode.dummy[3] + b_lm8_rail * is_pt.dummy[1] * lm.dummy[8] * mode.dummy[1] + b_lm8_bus * is_pt.dummy[1] * lm.dummy[8] * mode.dummy[3] + b_lm9_rail * is_pt.dummy[1] * lm.dummy[9] * mode.dummy[1] + b_lm9_bus * is_pt.dummy[1] * lm.dummy[9] * mode.dummy[3] + b_lm10_rail * is_pt.dummy[1] * lm.dummy[10] * mode.dummy[1] + b_lm10_bus * is_pt.dummy[1] * lm.dummy[10] * mode.dummy[3] + b_lm11_rail * is_pt.dummy[1] * lm.dummy[11] * mode.dummy[1] + b_lm11_bus * is_pt.dummy[1] * lm.dummy[11] * mode.dummy[3] + b_busfreq.dummy * busfreq + b_busstop.dummy * busstop + b_hotel.dummy * hotel $

by **Michiel Bliemer** » Wed Apr 09, 2025 8:58 am

The script has been running on my laptop for an hour without an issue. I suspect that it has to do with memory as your computer may not have sufficient memory available to run it. Note that you have 77 parameters in your model (and unless you have tens of thousands of respondents, it is unlikely that most of them will be statistically significant). I have never seen so many parameters in a choice model, and together with a very large number of rows, this will consume a lot of memory in your computer.
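
The count of 77 can be reproduced by tallying the terms in ;model(main). The sketch below is plain Python arithmetic, not Ngene syntax; the grouping labels are only descriptive, and the memory implication is limited to the size of the parameter covariance matrix.

```python
# Estimated parameters per group of terms in ;model(main)
main_model_terms = {
    "mode (dummy, 2 non-base levels)": 2,
    "fm levels 2-11 x (rail, bus)": 10 * 2,
    "tt (dummy)": 2,
    "transfers levels 2-3 x (rail, bus)": 2 * 2,
    "delay (dummy)": 3,
    "transfers 2-3 x delay 2-4 x (rail, bus)": 2 * 3 * 2,
    "cost (dummy)": 4,
    "lm levels 2-11 x (rail, bus)": 10 * 2,
    "busfreq (dummy)": 2,
    "busstop (dummy)": 4,
    "hotel (dummy)": 4,
}

n_params = sum(main_model_terms.values())
full_elements = n_params ** 2                     # all cells of the K x K matrix
unique_elements = n_params * (n_params + 1) // 2  # symmetric: upper triangle only

print(n_params)         # 77
print(full_elements)    # 5929
print(unique_elements)  # 3003
```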

You can try doing two things:
1. Add ;store = 1 to your script, so that Ngene only stores the last design, not the last 10 designs (default).
2. Close other memory intensive applications

Michiel

by **bartoszbursa** » Wed Apr 09, 2025 8:52 pm

1. Ngene found a design for 660 rows for a model with zero priors (it still crashes for 3600, but I will test it on a better machine and with only one design stored). As a next step, I wanted to introduce "sign" priors (negative/positive values close to zero) for some attributes such as time or cost, where I am quite confident about the preference order of the levels. The effect is that Ngene cannot find any valid initial random design after 10 min, no matter whether I introduce non-zero priors for only one attribute or for more. It does find one when I delete the asterisks in ;alts, but that is not what I want for the unlabeled design. Is there anything I might be doing wrong here?

I tried modified Federov and it appears to work. The big issue is that the if() conditions are not compatible with this algorithm, and I see no way to formulate the same conditions using ;reject & ;require.

2. I have 3 alternatives, and the attribute mode takes 3 levels. How can I make Ngene always show each of these levels once across the 3 alternatives, rather than using one level twice and another not at all? In other words, in each task I want to have, e.g., alt1.mode=1, alt2.mode=2, alt3.mode=3 (or 1,3,2, or 2,3,1, or 2,1,3), but not alt1.mode=1, alt2.mode=1, alt3.mode=2 (do not repeat a level if it already appears for another alternative). How can I achieve that in Ngene?
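
To state the requirement precisely: (alt1.mode, alt2.mode, alt3.mode) must be a permutation of (1, 2, 3) in every task. The Python sketch below only expresses that rule so that a generated design can be checked outside Ngene; it is not Ngene syntax and makes no claim about how Ngene would enforce it. The names are hypothetical.

```python
from itertools import permutations

MODE_LEVELS = (1, 2, 3)  # 1 = rail, 2 = car, 3 = bus

# All admissible assignments of mode to (alt1, alt2, alt3)
valid_assignments = sorted(permutations(MODE_LEVELS))


def modes_all_different(task_modes):
    """True if each mode level appears exactly once across the alternatives."""
    return sorted(task_modes) == sorted(MODE_LEVELS)


print(len(valid_assignments))          # 6
print(modes_all_different((1, 3, 2)))  # True
print(modes_all_different((1, 1, 2)))  # False: rail repeated, bus missing
```

There are six admissible orderings in total, two more than the four examples listed above.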

Bartosz

Evaluating external design & large designs
