Regarding point (2): no, you do not need each row to be selected an equal number of times. Well-known choice modellers use random designs without any issues, and revealed preference data will not have such a structure either.
I looked at your script and I found a few things:
(1) The AVC matrix clearly indicates that the model is not identified (very large values).
(2) When I generate a new design without the conditional constraints, it runs fine. When I activate any of the constraints, the model becomes unidentified and has an infinite D-error.
(3) When I evaluate the design in the CSV file, the same conditional constraints have been applied in the design and therefore the model again becomes unidentified.
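To illustrate why very large AVC values and an infinite D-error go together, here is a minimal sketch in Python. It assumes a toy 2x2 information matrix with made-up numbers (nothing here is taken from your design), and uses the usual definition D-error = det(AVC)^(1/K) with K parameters. The function name avc_and_d_error is just for illustration.

```python
def avc_and_d_error(info):
    """Invert a 2x2 information matrix; return (AVC matrix, D-error)."""
    (a, b), (c, d) = info
    det = a * d - b * c
    if det == 0:
        # Singular information matrix: the AVC matrix does not exist
        return None, float("inf")
    avc = [[d / det, -b / det], [-c / det, a / det]]
    det_avc = 1.0 / det           # det(AVC) = 1 / det(information matrix)
    return avc, det_avc ** (1 / 2)  # K = 2 parameters in this toy case

# Hypothetical information matrices (illustrative numbers only)
avc_ok, d_ok = avc_and_d_error([[4.0, 1.0], [1.0, 3.0]])
avc_near, d_near = avc_and_d_error([[3.0, 3.0], [3.0, 3.000001]])
avc_singular, d_singular = avc_and_d_error([[3.0, 3.0], [3.0, 3.0]])

print("well identified: D-error =", d_ok)
print("nearly collinear: D-error =", d_near, "AVC =", avc_near)
print("perfectly collinear: D-error =", d_singular)
```

As the two columns approach perfect correlation, the AVC entries blow up and the D-error grows without bound; in the exactly collinear case the matrix cannot be inverted at all.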
On inspection of your design, I found that level 0 for lm, fm, and transfers ONLY appears together with mode=2. In other words, you have created multicollinearity by perfectly correlating the dummy coefficient for mode=2 with the dummy coefficients for lm=0, fm=0, and transfers=0. Such a model is not identified and the D-error is infinite. The model would be identifiable if lm=0, fm=0, and transfers=0 also appeared when mode<>2, but that does not seem to be the case.
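A quick way to see this kind of collinearity is to compare the dummy columns directly. The sketch below is in Python and uses a handful of hypothetical (mode, lm) rows, not your CSV; the helper gram_det is an illustrative name. It builds the cross-product matrix of the mode=2 and lm=0 dummies and checks its determinant.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_det(rows):
    """Determinant of X'X for the two dummy columns (mode == 2) and (lm == 0)."""
    mode2 = [1 if mode == 2 else 0 for mode, lm in rows]
    lm0 = [1 if lm == 0 else 0 for mode, lm in rows]
    return dot(mode2, mode2) * dot(lm0, lm0) - dot(mode2, lm0) ** 2

# Hypothetical rows in which lm = 0 occurs only together with mode = 2
rows_bad = [(1, 1), (2, 0), (3, 2), (2, 0), (1, 3), (3, 1), (2, 0), (1, 2)]

# Same rows plus one row where lm = 0 occurs with a different mode
rows_ok = rows_bad + [(1, 0)]

print(gram_det(rows_bad))  # zero: the two columns are identical
print(gram_det(rows_ok))   # non-zero: the columns can be told apart
```

With identical columns the two coefficients cannot be separated, whatever estimation software is used.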
The solution is that you need to specify an identifiable model. You will not be able to estimate the model that you have formulated. The design is probably fine, you just need to think a bit more about the utility functions.
I would proceed as follows:
1. Initially remove all your interaction effects, as it is very difficult to assess otherwise.
2. Remove level 0 for lm, fm, and transfers.
3. If mode=2, then the attributes lm, fm, and transfers should 'disappear' from the utility function. This is achieved by multiplying these attributes with an indicator that equals 1 if mode<>2 and 0 if mode=2.
For step 3, in Apollo or Biogeme you could simply write V = ... + (mode != 2) * (lm2 * (lm==2) + lm3 * (lm==3) + ...). In Ngene, this is a bit trickier but possible as follows:
- Create a new attribute in the dataset called is_pt, where is_pt = 1 if the mode is public transport, and 0 otherwise.
- Replace all levels 0 for fm, lm, and transfers with any other level, for example 1. It does not matter which level you choose, because if mode=2 you are multiplying with is_pt=0 anyway.
- Create interaction effects in the utility functions between is_pt and the dummy coded levels of fm, lm, and transfers.
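The data preparation in these steps can be sketched in Python as follows. This is only an illustration under assumptions: rows are plain dictionaries, the function names recode and fm_utility are made up, and the coefficient values in beta_fm are arbitrary. Mode 2 is car, as in the script comments.

```python
CAR = 2  # mode level for car (1 = rail, 3 = bus)

def recode(row):
    """Return a copy of one design row with is_pt added and level 0 replaced."""
    out = dict(row)
    out["is_pt"] = 0 if row["mode"] == CAR else 1
    for attr in ("fm", "lm", "transfers"):
        if out[attr] == 0:
            out[attr] = 1  # arbitrary level: it is multiplied by is_pt = 0
    return out

def fm_utility(row, beta):
    """First-mile part of utility: dummy-coded levels, switched off for car."""
    return row["is_pt"] * beta.get(row["fm"], 0.0)  # base level 1 contributes 0

# Hypothetical coefficients for fm levels 2 and 3 (level 1 is the base)
beta_fm = {2: -0.1, 3: -0.4}

car = recode({"mode": 2, "fm": 0, "lm": 0, "transfers": 0})
rail = recode({"mode": 1, "fm": 3, "lm": 2, "transfers": 1})

print(car, fm_utility(car, beta_fm))
print(rail, fm_utility(rail, beta_fm))
```

For the car row the recoded level is irrelevant because the indicator removes the whole term, which is why any replacement level works.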
I have done the above, and the script below tells me that the design has a D-error of 0.002924.
Note that I am first specifying an auxiliary model (aux) to define all the attributes, and then in the model of interest (main) I specify the interaction with is_pt. I could not do that directly because I cannot introduce dummy variables that only appear in an interaction.
After you have done this, you can add interaction effects one by one, making sure that the model remains identifiable.
Code:
Design
;alts(aux) = alt1*, alt2*, alt3*, none
;alts(main) = alt1*, alt2*, alt3*, none
;rows = 3600 ? 3600 rows in external design
? ;block = 300 ? 300 blocks in external design (ignore for now)
;eff = main(mnl,d)
;eval = balanced_overlap_design_for_ngene.csv
;model(aux):
U(alt1) = b_mode.dummy[0|0] * mode[1,3,2] ? main mode dummy: 1 = rail, 2 = car (base), 3 = bus
+ b_ind * is_pt[0,1]
+ b_fm.dummy[0|0|0|0|0|0|0|0|0|0] * fm[2,3,4,5,6,7,8,9,10,11,1] ? First Mile dummy: 0 = needed if main mode == car since there is no FM when driving, 1 = walk 5m (base), 2 = walk 15m, 3 = Taxi 5m 12€, 4 = Taxi 15m 30€, 5 = Taxi 30m 50€, 6 = PT 10m 0€, 7 = PT 30m 0€, 8 = PT 50m 0€, 9 = PT 10m 3€, 10 = PT 30m 6€, 11 = PT 50m 9€
+ b_tt.dummy[0|0] * tt[2,3,1] ? time dummy: 1 = 7h (base), 2 = 8h, 3 = 9h
+ b_transfers.dummy[0|0] * transfers[2,3,1] ? transfers dummy: 0 = needed if main mode == car, since there are no transfers when driving, 1 = 0 transfers, 2 = 1 transfer, 3 = 2 transfers
+ b_delay.dummy[0|0|0] * delay[2,3,4,1] ? delay dummy: 1 = on time (base), 2 = 20% 1h, 3 = 20% 2h, 4 = 20% 3h
+ b_cost.dummy[0|0|0|0] * cost[2,3,4,5,1] ? cost dummy: 1 = 25€ (base), 2 = 50€, 3 = 100€, 4 = 150€, 5 = 200€
+ b_lm.dummy[0|0|0|0|0|0|0|0|0|0] * lm[2,3,4,5,6,7,8,9,10,11,1] ? LM dummy: 0 = needed if main mode == car since there is no LM when driving, 1 = walk 5m (base), 2 = walk 15m, 3 = Taxi 5m 12€, 4 = Taxi 15m 30€, 5 = Taxi 30m 50€, 6 = PT 10m 0€, 7 = PT 30m 0€, 8 = PT 50m 0€, 9 = PT 10m 3€, 10 = PT 30m 6€, 11 = PT 50m 9€
+ b_busfreq.dummy[0|0] * busfreq[2,3,1] ? bus frequency: 1 = every 30m (base), 2 = every 60m, 3 = 3 times a day
+ b_busstop.dummy[0|0|0|0] * busstop[2,3,4,5,1] ? distance to bus stop: 1 = right at the hotel (base), 2 = 5min, 3 = 10min, 4 = 15min, 5 = 30min
+ b_hotel.dummy[0|0|0|0] * hotel[2,3,4,5,1] ? facilities at the hotel: 1 = nothing (base), 2 = E-Bikes, 3 = charging of (own) electric car for free, 4 = charging of (own) electric car for a market price, 5 = electric car for guests
/
U(alt2) = b_mode.dummy * mode
+ b_ind * is_pt[0,1]
+ b_fm.dummy * fm
+ b_tt.dummy * tt
+ b_transfers.dummy * transfers
+ b_delay.dummy * delay
+ b_cost.dummy * cost
+ b_lm.dummy * lm
+ b_busfreq.dummy * busfreq
+ b_busstop.dummy * busstop
+ b_hotel.dummy * hotel
/
U(alt3) = b_mode.dummy * mode
+ b_ind * is_pt[0,1]
+ b_fm.dummy * fm
+ b_tt.dummy * tt
+ b_transfers.dummy * transfers
+ b_delay.dummy * delay
+ b_cost.dummy * cost
+ b_lm.dummy * lm
+ b_busfreq.dummy * busfreq
+ b_busstop.dummy * busstop
+ b_hotel.dummy * hotel
;model(main):
U(alt1) = b_mode.dummy[0|0] * mode
+ b_fm2 * is_pt * fm.dummy[2]
+ b_fm3 * is_pt * fm.dummy[3]
+ b_fm3 * is_pt * fm.dummy[4]
+ b_fm3 * is_pt * fm.dummy[5]
+ b_fm3 * is_pt * fm.dummy[6]
+ b_fm3 * is_pt * fm.dummy[7]
+ b_fm3 * is_pt * fm.dummy[8]
+ b_fm3 * is_pt * fm.dummy[9]
+ b_fm3 * is_pt * fm.dummy[10]
+ b_fm3 * is_pt * fm.dummy[11]
+ b_tt.dummy[0|0] * tt
+ b_transfers1 * is_pt * transfers.dummy[2]
+ b_transfers1 * is_pt * transfers.dummy[3]
+ b_delay.dummy[0|0|0] * delay
+ b_cost.dummy[0|0|0|0] * cost
+ b_lm2 * is_pt * lm.dummy[2]
+ b_lm3 * is_pt * lm.dummy[3]
+ b_lm3 * is_pt * lm.dummy[4]
+ b_lm3 * is_pt * lm.dummy[5]
+ b_lm3 * is_pt * lm.dummy[6]
+ b_lm3 * is_pt * lm.dummy[7]
+ b_lm3 * is_pt * lm.dummy[8]
+ b_lm3 * is_pt * lm.dummy[9]
+ b_lm3 * is_pt * lm.dummy[10]
+ b_lm3 * is_pt * lm.dummy[11]
+ b_busfreq.dummy[0|0] * busfreq
+ b_busstop.dummy[0|0|0|0] * busstop
+ b_hotel.dummy[0|0|0|0] * hotel
/
U(alt2) = b_mode.dummy[0|0] * mode
+ b_fm2 * is_pt * fm.dummy[2]
+ b_fm3 * is_pt * fm.dummy[3]
+ b_fm3 * is_pt * fm.dummy[4]
+ b_fm3 * is_pt * fm.dummy[5]
+ b_fm3 * is_pt * fm.dummy[6]
+ b_fm3 * is_pt * fm.dummy[7]
+ b_fm3 * is_pt * fm.dummy[8]
+ b_fm3 * is_pt * fm.dummy[9]
+ b_fm3 * is_pt * fm.dummy[10]
+ b_fm3 * is_pt * fm.dummy[11]
+ b_tt.dummy[0|0] * tt
+ b_transfers1 * is_pt * transfers.dummy[2]
+ b_transfers1 * is_pt * transfers.dummy[3]
+ b_delay.dummy[0|0|0] * delay
+ b_cost.dummy[0|0|0|0] * cost
+ b_lm2 * is_pt * lm.dummy[2]
+ b_lm3 * is_pt * lm.dummy[3]
+ b_lm3 * is_pt * lm.dummy[4]
+ b_lm3 * is_pt * lm.dummy[5]
+ b_lm3 * is_pt * lm.dummy[6]
+ b_lm3 * is_pt * lm.dummy[7]
+ b_lm3 * is_pt * lm.dummy[8]
+ b_lm3 * is_pt * lm.dummy[9]
+ b_lm3 * is_pt * lm.dummy[10]
+ b_lm3 * is_pt * lm.dummy[11]
+ b_busfreq.dummy[0|0] * busfreq
+ b_busstop.dummy[0|0|0|0] * busstop
+ b_hotel.dummy[0|0|0|0] * hotel
/
U(alt3) = b_mode.dummy[0|0] * mode
+ b_fm2 * is_pt * fm.dummy[2]
+ b_fm3 * is_pt * fm.dummy[3]
+ b_fm3 * is_pt * fm.dummy[4]
+ b_fm3 * is_pt * fm.dummy[5]
+ b_fm3 * is_pt * fm.dummy[6]
+ b_fm3 * is_pt * fm.dummy[7]
+ b_fm3 * is_pt * fm.dummy[8]
+ b_fm3 * is_pt * fm.dummy[9]
+ b_fm3 * is_pt * fm.dummy[10]
+ b_fm3 * is_pt * fm.dummy[11]
+ b_tt.dummy[0|0] * tt
+ b_transfers1 * is_pt * transfers.dummy[2]
+ b_transfers1 * is_pt * transfers.dummy[3]
+ b_delay.dummy[0|0|0] * delay
+ b_cost.dummy[0|0|0|0] * cost
+ b_lm2 * is_pt * lm.dummy[2]
+ b_lm3 * is_pt * lm.dummy[3]
+ b_lm3 * is_pt * lm.dummy[4]
+ b_lm3 * is_pt * lm.dummy[5]
+ b_lm3 * is_pt * lm.dummy[6]
+ b_lm3 * is_pt * lm.dummy[7]
+ b_lm3 * is_pt * lm.dummy[8]
+ b_lm3 * is_pt * lm.dummy[9]
+ b_lm3 * is_pt * lm.dummy[10]
+ b_lm3 * is_pt * lm.dummy[11]
+ b_busfreq.dummy[0|0] * busfreq
+ b_busstop.dummy[0|0|0|0] * busstop
+ b_hotel.dummy[0|0|0|0] * hotel
$
I hope this helps.
Michiel