Continuous vs categorical variable

This forum is for posts that specifically focus on the Windows desktop version of Ngene (i.e. all version 1.x releases).

Moderators: Andrew Collins, Michiel Bliemer, johnr

jgrant8
Posts: 6
Joined: Wed Mar 26, 2025 12:36 am

Continuous vs categorical variable

Post by jgrant8 »

Hello,

I am designing a DCE using Ngene that includes a continuous attribute with 3 levels (10, 15, 20). If the Ngene design specifies this attribute as continuous, is it essential that it is analysed as continuous in the logit model, or can it also be analysed as a categorical variable if that turns out to give a better fit? I read somewhere that you can design an attribute as non-linear/categorical and then analyse it as continuous, but not vice versa, and I am not sure why this would be the case.

Thanks very much for any help.
Michiel Bliemer
Posts: 2092
Joined: Tue Mar 31, 2009 4:13 pm

Re: Continuous vs categorical variable

Post by Michiel Bliemer »

If you generate an experimental design for estimating a model with dummy/effects coding (i.e. assuming nonlinear effects and more parameters), then by definition you will also be able to estimate a model using the same data assuming a linear effect with a single parameter.

However, if you optimise an experimental design assuming a linear effect with a single parameter (i.e., a continuous numerical attribute), then it MAY happen that you cannot estimate a model with dummy/effects coding because there may not be sufficient variation in the data for estimating a model with more parameters. This is unlikely to happen if you have a sufficiently large number of rows in your experimental design, so in most cases it will be fine. But to be safe, it is generally recommended to assume dummy/effects coding in the experimental design if you are unsure whether you will estimate a model with a single parameter or a model with multiple parameters when considering each level as a category.
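One simple way to spot the most obvious form of this problem is to count how often each level appears in the design column for the attribute. The sketch below is only an illustration in Python, not Ngene functionality: the function names and the two example design columns are made up, and a level appearing at least once is a necessary condition for estimating its dummy coefficient, not a sufficient one.

```python
from collections import Counter


def level_counts(design_column, levels):
    """Count how often each attribute level appears in one design column."""
    counts = Counter(design_column)
    return {level: counts.get(level, 0) for level in levels}


def levels_without_variation(design_column, levels):
    """Return the levels that never appear in the design column.

    A dummy/effects-coded parameter for such a level cannot be estimated,
    because the data contain no observations of that level.
    """
    counts = level_counts(design_column, levels)
    return [level for level in levels if counts[level] == 0]


levels = [10, 15, 20]

# Hypothetical column from a design that only uses the end-point levels.
endpoint_design = [10, 20, 20, 10, 10, 20]
# Hypothetical column from a design with attribute level balance.
balanced_design = [10, 15, 20, 20, 15, 10]

missing_endpoint = levels_without_variation(endpoint_design, levels)
missing_balanced = levels_without_variation(balanced_design, levels)

print(missing_endpoint)  # [15]
print(missing_balanced)  # []
```

In the first hypothetical column the level 15 never occurs, so a single linear parameter could still be estimated but a separate coefficient for level 15 could not.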

For example:
b1*cost[1,2,3,4,5] has only a single parameter, b1, that needs to be estimated.
b1.dummy[..|..|..|..] * cost[1,2,3,4,5] has 4 parameters that need to be estimated, namely the dummy coefficients for levels 1 to 4 (relative to base level 5). A model with more parameters needs to capture more information from the data set and therefore generally requires more rows in the experimental design.
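To see why the linear model is a special case of the dummy-coded model, the two codings can be compared numerically. The following is a minimal Python sketch rather than Ngene syntax: the function names and the value of b1 are hypothetical, and it assumes level 5 is the base level and that the parameters are generic across alternatives. Setting each dummy coefficient to b1 * (level - 5) reproduces the linear utilities up to a constant, and a constant that is the same for every alternative cancels out in a logit model.

```python
def dummy_code(level, levels):
    """Indicator variables for every level except the last (base) level."""
    return [1 if level == other else 0 for other in levels[:-1]]


def linear_utility(b1, level):
    """Utility contribution under a linear effect with one parameter."""
    return b1 * level


def dummy_utility(betas, level, levels):
    """Utility contribution under dummy coding with len(levels) - 1 parameters."""
    return sum(beta * d for beta, d in zip(betas, dummy_code(level, levels)))


levels = [1, 2, 3, 4, 5]
base = levels[-1]
b1 = -0.5  # hypothetical value, chosen only for illustration

# Dummy coefficients that mimic the linear effect (relative to the base level).
betas = [b1 * (level - base) for level in levels[:-1]]

# Utility differences relative to the base level under both codings.
linear_diffs = [linear_utility(b1, lv) - linear_utility(b1, base) for lv in levels]
dummy_diffs = [dummy_utility(betas, lv, levels) - dummy_utility(betas, base, levels)
               for lv in levels]

print(len(betas))     # number of dummy parameters: 4
print(linear_diffs)
print(dummy_diffs)
```

The reverse does not hold: arbitrary values for the 4 dummy coefficients generally cannot be reproduced by a single b1, which is why the dummy-coded model asks more of the design.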

Michiel