Supplementary Material for: Discrete Distribution Based on Compound Sum to Model Dental Caries Count Data

Methods for analysing dental caries and its associated risk indicators have evolved considerably in recent decades. Zero-inflated and hurdle models are increasingly used to account for the distribution of decayed, missing, and filled teeth (DMFT) scores, which is positively skewed and has a high proportion of zeros. However, new statistical models are still needed that incorporate pragmatic biological considerations about dental caries in epidemiological surveys. In this paper, we show that the zero-inflated and hurdle models can both be expressed as a compound sum. Using the same compound-sum framework, we then present the generalized negative binomial (GNB) distribution for dental caries count data and provide a numerical application using data from the EPIPAP study. The GNB model attains the best score-function values while better handling the lifetime dental caries disease process. In conclusion, the GNB model suits the nature of some count data, in particular when structural zeros are unlikely to occur and when several latent spells can lead to new countable events. For these reasons, the GNB distribution appears relevant for modelling dental caries count data.
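
The compound-sum representation mentioned above can be illustrated numerically. The sketch below is not taken from the paper: it assumes the common textbook construction Y = X_1 + ... + X_N with N a Bernoulli variable, and it assumes Poisson counts for the zero-inflated case and zero-truncated Poisson counts for the hurdle case. The parameter names `pi_zero` and `lam`, their values, and all function names are hypothetical and chosen only for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def truncated_poisson_pmf(k, lam):
    """Zero-truncated Poisson: a Poisson variable conditioned on k >= 1."""
    if k == 0:
        return 0.0
    return poisson_pmf(k, lam) / (1.0 - math.exp(-lam))

def convolve(a, b, size):
    """PMF of the sum of two independent counts, kept on 0..size-1."""
    out = [0.0] * size
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if i + j < size:
                out[i + j] += a_i * b_j
    return out

def compound_sum_pmf(n_pmf, x_pmf, size):
    """PMF of Y = X_1 + ... + X_N on 0..size-1.

    n_pmf[n] = P(N = n); x_pmf[k] = P(X = k); N is independent of the i.i.d. X_i.
    """
    total = [0.0] * size
    n_fold = [1.0] + [0.0] * (size - 1)  # empty sum: point mass at 0
    for p_n in n_pmf:
        for y in range(size):
            total[y] += p_n * n_fold[y]
        n_fold = convolve(n_fold, x_pmf, size)  # distribution of one more term
    return total

# Illustrative values only (assumptions, not estimates from the EPIPAP data).
size, pi_zero, lam = 15, 0.3, 2.0
n_pmf = [pi_zero, 1.0 - pi_zero]  # N ~ Bernoulli(1 - pi_zero)

x_poisson = [poisson_pmf(k, lam) for k in range(size)]
x_truncated = [truncated_poisson_pmf(k, lam) for k in range(size)]

zip_from_sum = compound_sum_pmf(n_pmf, x_poisson, size)
hurdle_from_sum = compound_sum_pmf(n_pmf, x_truncated, size)

# Usual closed forms of the two models, for comparison.
zip_closed = [(pi_zero if y == 0 else 0.0) + (1.0 - pi_zero) * poisson_pmf(y, lam)
              for y in range(size)]
hurdle_closed = [pi_zero if y == 0 else (1.0 - pi_zero) * truncated_poisson_pmf(y, lam)
                 for y in range(size)]

max_gap_zip = max(abs(a - b) for a, b in zip(zip_from_sum, zip_closed))
max_gap_hurdle = max(abs(a - b) for a, b in zip(hurdle_from_sum, hurdle_closed))
print(max_gap_zip, max_gap_hurdle)
```

Under these assumptions the two constructions differ only in the distribution of the summands: in the hurdle case every zero comes from N = 0, whereas in the zero-inflated case a zero can also arise from a Poisson summand equal to zero. Replacing the Bernoulli `n_pmf` with a distribution that allows N > 1 gives a sum over several terms, which is the kind of extension the compound-sum view makes available.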