The objective of this study is to use a data-driven approach to identify misdiagnoses of pulmonary embolism and the factors that may be associated with them.
We used the Healthcare Cost and Utilization Project (HCUP) database of inpatient and emergency patient information for New York and California between 2005 and 2012. There were 64,382,957 observations for New York (20,926,038 inpatient and 43,456,919 emergency patients) and 92,561,453 observations for California (27,907,535 inpatient and 64,653,918 emergency patients). We examined patients diagnosed with pulmonary embolism, along with their previous and subsequent visits, as well as a 10% random sample of patients without a pulmonary embolism diagnosis.
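The cohort selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the study: the field names `patient_id` and `dx`, the function `build_cohort`, and the use of the ICD-9-CM prefix `4151` (pulmonary embolism and infarction, written without the decimal point) to flag a pulmonary embolism diagnosis are all hypothetical choices made for this example.

```python
import random

# Assumption: a visit counts as a pulmonary embolism (PE) visit when any of its
# diagnosis codes starts with the ICD-9-CM prefix "4151" (no decimal point).
# The study's actual case definition is not given in the abstract.
PE_PREFIX = "4151"


def build_cohort(visits, sample_rate=0.10, seed=0):
    """Keep every visit of PE patients plus a random sample of other patients.

    `visits` is a list of dicts with hypothetical keys:
      "patient_id": a patient identifier that links visits together
      "dx":         a list of diagnosis code strings for that visit
    Returns (kept_visits, pe_patients, sampled_controls).
    """
    # Patients with at least one PE-coded visit.
    pe_patients = {
        v["patient_id"]
        for v in visits
        if any(code.startswith(PE_PREFIX) for code in v["dx"])
    }

    # Everyone else is eligible for the random comparison sample.
    others = sorted({v["patient_id"] for v in visits} - pe_patients)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = round(len(others) * sample_rate)
    controls = set(rng.sample(others, k))

    # Keeping all visits of a PE patient retains the visits before and
    # after the PE diagnosis as well.
    keep = pe_patients | controls
    kept_visits = [v for v in visits if v["patient_id"] in keep]
    return kept_visits, pe_patients, controls
```

Sampling is done at the patient level rather than the visit level so that each sampled comparison patient keeps a complete visit history, which is one reasonable reading of the design described above.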
Cancer, heart disease, and physical injuries that cause immobility are very common in pulmonary embolism patients. Having a low socioeconomic status; belonging to a racial minority group such as Native American, Black, or Asian; being admitted on a weekend; and being discharged against medical advice are strongly associated with a misdiagnosis. The fewer pulmonary embolism cases a hospital sees, the more likely its physicians are to misdiagnose the condition.
Using a data-driven approach, we accomplished the objective we set for this project and confirmed our hypothesis about misdiagnosis. Our data had limitations, and an analysis of more granular data could help strengthen our results. A classification tree could also be used to create rules and warnings that reduce the number of misdiagnoses.
Aeint Thet Ngon, ’16
Dawit Tsigie, ’16
Addis Ababa, Ethiopia
Jacob Cuellar, ’16
Loves Park, IL
Economics & Business
Sponsor: Aaron Miller