Author

Jakob Spiller

Document Type

Honors Thesis

Date of Award

Spring 2024

Abstract

Variance-based sensitivity analysis is a crucial tool for assessing how variability in the inputs of a complex mathematical model contributes to variability in its output. In this thesis, we study Sobol indices, a class of variance-based sensitivity measures, to quantify the importance of each input variable to the overall variability of the model output. We first discuss Sobol’s first-order and total-order indices and demonstrate them briefly through two examples: the Sobol G-function and a polynomial function, each with six input variables. These examples highlight the theoretical foundations and practical applications of Sobol’s indices in analyzing model sensitivities. The main part of the thesis applies Sobol’s method within the framework of a regression model to assess the importance of various features (also known as predictors) in predicting total medical expenses. Our findings reveal that ‘smoking status’ is the most important feature affecting health insurance charges, followed by ‘age’ and ‘bmi’ as the second and third most important features, respectively. This application not only demonstrates the effectiveness of Sobol’s indices in real-world actuarial scenarios but also provides a clear hierarchy of factors affecting health insurance premiums. In summary, this study implements a variance-based sensitivity method to select the most influential features, suggesting possible model simplifications and providing insights that could improve decision-making in health insurance modeling.

Advisor

Nahid Hasan

Keywords

Sobol’s indices; health insurance data; actuarial science; variable importance; regression model
