Chinese Medical Sciences Journal, 2017, Vol. 32, Issue (4): 218-225. doi: 10.24920/J1001-9294.2017.054

• ORIGINAL ARTICLE •

Study of Zero-Inflated Regression Models in a Large-Scale Population Survey of Sub-Health Status and Its Influencing Factors

Xu Tao1, Zhu Guangjin2, Han Shaomei1,*

  1Department of Epidemiology and Statistics, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing 100005, China;
  2Department of Physiopathology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing 100005, China
  • Received: 2017-03-30 Online: 2017-12-30 Published: 2017-12-30
  • Contact: Han Shaomei, E-mail: hansm1@vip.sina.com
  • Supported by:
    △Fund supported by the Basic Performance Key Project, the Ministry of Science and Technology of the People’s Republic of China (No. 2006FY110300).
This study, designed for population research on sub-health status, identified the zero-inflated negative binomial (ZINB) model as the optimal statistical model for regression analysis of sub-health count data in a large-scale population study. The predicted probabilities from the ZINB model fitted the observed counts best.

Abstract: Objective

Sub-health status has progressively gained more attention from both medical professionals and the public. Treating the number of sub-health symptoms as count data rather than dichotomous data allows a more complete and accurate analysis of findings in sub-healthy populations. This study aims to compare the goodness of fit of count outcome models in order to identify the optimal model for sub-health research.

Methods

The study sample was derived from a large-scale population survey on physiological and psychological constants conducted from 2007 to 2011 in 4 provinces and 2 autonomous regions in China. We constructed four count outcome models using SAS: the Poisson model, the negative binomial (NB) model, the zero-inflated Poisson (ZIP) model and the zero-inflated negative binomial (ZINB) model. The number of sub-health symptoms was used as the main outcome measure. The alpha dispersion parameter and the O test were used to identify over-dispersion in the data, and the Vuong test was used to evaluate excess zero counts. The goodness of fit of the regression models was determined by predictive probability curves and likelihood ratio test statistics.
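To make the structure of the ZINB model concrete, the following is a minimal sketch of its probability mass function in plain Python. It is not the authors' SAS code. The parameterization (mean `mu`, dispersion `alpha` with variance `mu + alpha * mu**2`, and zero-inflation probability `pi_zero`) is a common convention assumed here, and the numeric values are illustrative only, not estimates from the study.

```python
import math

def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha          # NB "size" parameter
    p = r / (r + mu)         # NB success probability
    log_pmf = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log1p(-p))
    return math.exp(log_pmf)

def zinb_pmf(y, mu, alpha, pi_zero):
    """Zero-inflated NB: a point mass at zero mixed with an NB count part."""
    count_part = (1.0 - pi_zero) * nb_pmf(y, mu, alpha)
    return pi_zero + count_part if y == 0 else count_part

# Illustrative values only (assumptions, not estimates from the study).
mu, alpha, pi_zero = 4.0, 0.6, 0.25

probs = [zinb_pmf(y, mu, alpha, pi_zero) for y in range(200)]
total = sum(probs)                                   # should be close to 1
zinb_mean = sum(y * q for y, q in enumerate(probs))  # close to (1 - pi_zero) * mu

print(total, zinb_mean)
print(zinb_pmf(0, mu, alpha, pi_zero), nb_pmf(0, mu, alpha))
```

Setting `pi_zero` to 0 recovers the ordinary NB model, and letting `alpha` approach 0 moves the count part toward a Poisson distribution, which is why the four models can be compared as members of one family.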

Results

Of all 78 307 respondents, 38.53% reported no sub-health symptoms. The mean number of sub-health symptoms was 2.98, with a standard deviation of 3.72. The over-dispersion test statistic O was 720.995 (P<0.001); the estimated alpha was 0.618 (95% CI: 0.600-0.636) when comparing the ZINB and ZIP models; and the Vuong test statistic Z was 45.487. These results indicated over-dispersion and excess zero counts in the sub-health data. The ZINB model had the largest log-likelihood (-167 519), the smallest Akaike information criterion (335 112) and the smallest Bayesian information criterion (335 455), indicating the best goodness of fit. The predicted probabilities from the ZINB model fitted the observed counts best for most counts. The logit part of the ZINB model showed that age, sex, occupation, smoking, alcohol drinking, ethnicity and obesity were determinants of the presence of sub-health symptoms; the negative binomial part showed that sex, occupation, smoking, alcohol drinking, ethnicity, marital status and obesity had significant effects on the severity of sub-health.
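The reported summary figures can be sanity-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract. It assumes the standard definitions AIC = 2k - 2 logL and BIC = k ln(n) - 2 logL, and it assumes that all 78 307 respondents entered the model. The parameter count `k` is inferred from the reported AIC, not stated in the source, and the log-likelihood is rounded, so the result is approximate.

```python
import math

n = 78307                # respondents (from the abstract)
mean, sd = 2.98, 3.72    # reported mean and SD of symptom counts
observed_zero = 0.3853   # reported share with no symptoms

# Over-dispersion: a Poisson variable has variance equal to its mean,
# so a variance-to-mean ratio well above 1 points to over-dispersion.
variance_to_mean = sd ** 2 / mean

# Excess zeros: share of zeros implied by an intercept-only Poisson
# with the same mean, to compare against the observed share.
poisson_zero = math.exp(-mean)

# Information criteria reported for the ZINB model.
log_lik = -167519
aic_reported, bic_reported = 335112, 335455

# Implied number of estimated parameters (assumes AIC = 2k - 2 logL).
k = (aic_reported + 2 * log_lik) / 2

# BIC implied by that k (assumes BIC = k ln(n) - 2 logL, n = all respondents).
bic_implied = k * math.log(n) - 2 * log_lik

print(variance_to_mean, poisson_zero, observed_zero)
print(k, bic_implied, bic_reported)
```

Under these assumptions the variance is roughly 4.6 times the mean, a plain Poisson with the same mean would predict only about 5% zeros against the observed 38.53%, and the implied BIC agrees with the reported value to within rounding.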

Conclusions

All goodness-of-fit tests and the predictive probability curves led to the same finding: the ZINB model was the optimal model for exploring the factors influencing sub-health symptoms.

Key words: zero-inflated, negative binomial regression, sub-health, population survey

Copyright © 2018 Chinese Academy of Medical Sciences. All rights reserved.
 

Supervised by the National Health and Family Planning Commission of the PRC

9 Dongdan Santiao, Dongcheng district, Beijing, 100730 China

Tel: 86-10-65105897  Fax:86-10-65133074 

E-mail: cmsj@cams.cn  www.cmsj.cams.cn
