Prediction Oriented Marker Selection (PROMISE) for High Dimensional Regression with Application to Personalized Medicine

dc.contributor.advisor: Scott, David W. [en_US]
dc.contributor.committeeMember: Lee, J. Jack [en_US]
dc.contributor.committeeMember: Baladandayuthapani, Veerabhadran [en_US]
dc.contributor.committeeMember: Ensor, Katherine B [en_US]
dc.contributor.committeeMember: Nakhleh, Luay K [en_US]
dc.creator: Kim, Soyeon [en_US]
dc.date.accessioned: 2016-02-05T22:25:31Z [en_US]
dc.date.available: 2016-02-05T22:25:31Z [en_US]
dc.date.created: 2015-12 [en_US]
dc.date.issued: 2015-10-27 [en_US]
dc.date.submitted: December 2015 [en_US]
dc.date.updated: 2016-02-05T22:25:31Z [en_US]
dc.description.abstract: In personalized medicine, biomarkers are used to select therapies with the highest likelihood of success based on a patient's individual biomarker profile. Two important goals of biomarker selection are to choose a small number of important biomarkers that are associated with treatment outcomes and to maintain a high level of prediction accuracy. These goals are challenging because the number of candidate biomarkers can be large compared to the sample size. Established variable selection methods based on penalized regression, such as the lasso and the elastic net, have yielded promising results. However, selecting the right amount of penalization is critical to maintain the desired properties for both variable selection and prediction accuracy. To select the regularization parameter, cross-validation (CV) is most commonly used. It tends to provide high prediction accuracy as well as a high true positive rate, at the cost of a high false positive rate. Conversely, resampling methods such as stability selection (SS) maintain good control of the false positive rate, but at the cost of yielding too few true positives. We propose prediction oriented marker selection (PROMISE), which combines SS with CV to include the advantages of both methods. We applied PROMISE to (1) the lasso and (2) the elastic net for individual marker selection, (3) the group lasso for pathway selection, and (4) the combination of the group lasso with the lasso for individual marker selection within the selected pathways. Data analyses show that PROMISE produces a sparser solution than CV, reducing the false positives compared to CV, while giving similar prediction accuracy and true positives. In our simulation and real data analysis, SS does not work well for variable selection and prediction. PROMISE can be applied in many fields to select regularization parameters when the goals are to minimize both type I and type II errors and to maximize prediction accuracy. [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Kim, Soyeon. "Prediction Oriented Marker Selection (PROMISE) for High Dimensional Regression with Application to Personalized Medicine." (2015) Diss., Rice University. https://hdl.handle.net/1911/88443. [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/88443 [en_US]
dc.language.iso: eng [en_US]
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. [en_US]
dc.subject: Predictive marker [en_US]
dc.subject: Personalized medicine [en_US]
dc.subject: Cross-validation [en_US]
dc.subject: Stability Selection [en_US]
dc.subject: Variable Selection [en_US]
dc.subject: Lasso [en_US]
dc.title: Prediction Oriented Marker Selection (PROMISE) for High Dimensional Regression with Application to Personalized Medicine [en_US]
dc.type: Thesis [en_US]
dc.type.material: Text [en_US]
thesis.degree.department: Statistics [en_US]
thesis.degree.discipline: Engineering [en_US]
thesis.degree.grantor: Rice University [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.name: Doctor of Philosophy [en_US]
Files
Original bundle
Name:
KIM-DOCUMENT-2015.pdf
Size:
1.75 MB
Format:
Adobe Portable Document Format
License bundle
Name:
PROQUEST_LICENSE.txt
Size:
5.84 KB
Format:
Plain Text
Name:
LICENSE.txt
Size:
2.6 KB
Format:
Plain Text