Classification Trees with Synthetic Features
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Mathematics
Date of Award
Spring 2018
Abstract
Trained synthetic features were used with classification and regression trees (CART) and boosting methods to predict outcomes of categorical response variables in general. The trained synthetic features involved were synthetic features (Zieba, Tomczak, & Tomczak, 2016), principal component analysis (PCA), zero-one regression (ZO), logistic regression (LS), linear discriminant analysis (LDA), robust fitting of linear models (RLM), least trimmed squares (LTS), naïve Bayes (NBAY), and univariate spline (SPL), using the statistical software R. To illustrate the trained synthetic features in this paper, they were applied to Polish companies' financial data, Fisher's Iris data, and skin lesion data. The objective of the research was to apply trained synthetic features to three methods: CART; a stock boosting method fitted with the synthetic features at the root node; and a synthetic boosting method that reweighted and refitted the synthetic features at each iteration. The aim was to improve predictive accuracy for the classes in a given data set relative to random guessing based on the prior probabilities.
Advisor
Thomas Boucher
Subject Categories
Mathematics | Physical Sciences and Mathematics
Recommended Citation
Msabaeka, Tsitsi, "Classification Trees with Synthetic Features" (2018). Electronic Theses & Dissertations. 465.
https://digitalcommons.tamuc.edu/etd/465