A Comparison of Missing Value Imputation Methods for Independent Variables in Logistic Regression Models with Multicollinear Data

ธัญพิชชา ฤทธิ์เทวา

Files

6510220004.pdf (1.3 MB)

Date

2023

Authors

ธัญพิชชา ฤทธิ์เทวา

Publication

Publisher

Prince of Songkla University

Abstract

Logistic regression analysis is a technique for predicting the probability that a particular event occurs when the dependent variable is qualitative. The independent variables may be quantitative or qualitative. The technique is used across a wide range of sciences. With medical data in particular, missing data can undermine confidence in patient evaluation and make it impossible to classify people according to their level of health or disease. Furthermore, multicollinearity among the independent variables can lead to misleading results. The objective of this research is therefore to study the efficiency of missing data imputation methods for logistic regression when multicollinearity is present. Six imputation methods were considered: mean imputation (MEAN), multiple imputation (MI), k-nearest neighbor imputation (KNN), random forest imputation (RF), stochastic regression imputation (SRI), and Bayesian linear regression imputation (BRI). The simulation used sample sizes of 20, 50, 100, 150, 200, 500, and 1000, with 10%, 20%, 30%, and 40% of the data missing. The estimated mean square error (EMSE) was used to compare efficiency. The results showed that the RF method is the most effective when the sample size is large and the percentage of missing data is high. The EMSE rises as the percentage of missing data rises and falls as the sample size increases.
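
The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's procedure: it covers only two of the six methods (MEAN and SRI), uses two standard-normal predictors with an assumed correlation `rho` to stand in for multicollinearity, deletes values completely at random, and scores each method by the mean squared difference between imputed and true values. The abstract does not state how EMSE was defined, so that scoring rule, the function names, and all parameter values are assumptions made for illustration.

```python
import math
import random
import statistics


def simulate(n, rho, rng):
    """Draw n pairs (x1, x2) of standard normals with correlation rho."""
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(1.0 - rho ** 2)
    x2 = [rho * a + scale * rng.gauss(0.0, 1.0) for a in x1]
    return x1, x2


def impute_mean(x2_obs):
    """MEAN: replace each missing value (None) with the observed mean."""
    m = statistics.fmean([v for v in x2_obs if v is not None])
    return [m if v is None else v for v in x2_obs]


def impute_sri(x1, x2_obs, rng):
    """SRI: regress x2 on x1 over the complete cases, then fill each gap
    with the fitted value plus a randomly drawn residual."""
    xs = [a for a, b in zip(x1, x2_obs) if b is not None]
    ys = [b for b in x2_obs if b is not None]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(xs, ys))
    sigma = math.sqrt(rss / (len(xs) - 2))  # residual standard deviation
    return [intercept + slope * a + rng.gauss(0.0, sigma) if b is None else b
            for a, b in zip(x1, x2_obs)]


def compare(n=100, miss_rate=0.3, rho=0.9, reps=200, seed=1):
    """Average imputation MSE per method over `reps` simulated data sets.
    All default values are illustrative assumptions."""
    rng = random.Random(seed)
    n_miss = int(round(n * miss_rate))
    totals = {"MEAN": 0.0, "SRI": 0.0}
    for _ in range(reps):
        x1, x2 = simulate(n, rho, rng)
        missing = set(rng.sample(range(n), n_miss))  # missing completely at random
        x2_obs = [None if i in missing else v for i, v in enumerate(x2)]
        filled = {"MEAN": impute_mean(x2_obs),
                  "SRI": impute_sri(x1, x2_obs, rng)}
        for name, vals in filled.items():
            totals[name] += sum((vals[i] - x2[i]) ** 2 for i in missing) / n_miss
    return {name: total / reps for name, total in totals.items()}


results = compare()
for name, mse in results.items():
    print(f"{name}: {mse:.3f}")
```

Under these assumptions SRI would be expected to score better than MEAN, because the strongly correlated predictor carries information about the missing values that mean imputation ignores. Calling `compare` with different `n` and `miss_rate` values gives a rough feel for the kind of grid the abstract describes.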

Details

Description

Master of Science (Applied Statistics), 2023 (B.E. 2566)

Keywords

Imputation, Logistic regression analysis, Missing data, Multicollinearity, Diabetes

URI

http://kb.psu.ac.th/psukb/handle/2016/19428

Collections

322 Thesis
