Forecast Aggregation and Binned Sequential Testing in a Streaming Environment

dc.contributor.advisor: Scott, David W (en_US)
dc.creator: Cross, Daniel Mishael (en_US)
dc.date.accessioned: 2019-05-16T20:28:30Z (en_US)
dc.date.available: 2019-05-16T20:28:30Z (en_US)
dc.date.created: 2017-08 (en_US)
dc.date.issued: 2017-08-08 (en_US)
dc.date.submitted: August 2017 (en_US)
dc.date.updated: 2019-05-16T20:28:31Z (en_US)
dc.description.abstract: This thesis covers two separate projects. The sequential probability ratio test is a statistical test of one simple hypothesis against another. Often a parametric form is assumed for the underlying density or (discrete) probability function, and the two hypotheses are specified by two different values of the parameter. In this case the sequential probability ratio test consists of taking observations sequentially and, after each observation, comparing the updated likelihood ratio to two constants chosen to specify the type I and type II error probabilities. When the likelihood ratio crosses one of the constants, the density corresponding to that constant is chosen as the true density. For more closely mixed proposed densities the expected number of steps to decision may be large, but it is generally smaller than the sample size of a fixed-n design with the same type I and type II errors. In this situation data compression may be necessary to reduce data storage requirements. In other situations, the data may arrive in bins or be transformed into bins. In this paper we explore the effects of binning sequential data in two cases: (1) the exact binned (histogram) densities are known; and (2) finite-sample approximations of the exact histogram densities are known. We show the effects of binning in both cases on the expected number of steps to decision and on the type I and type II errors. Optimal binning parameter choices for common densities, as well as formulae for general densities, are also given.
The Good Judgment Team, led by psychologists P. Tetlock and B. Mellers of the University of Pennsylvania, was the most successful of five research projects sponsored through 2015 by IARPA to develop improved group forecast aggregation algorithms. Each team had at least ten algorithms under continuous development and evaluation over the four-year project. The mean Brier score was used to rank the algorithms on approximately 130 questions concerning categorical geopolitical events each year. An algorithm would return aggregate probabilities for each question based on the probabilities provided per question by thousands of individuals who had been recruited by the Good Judgment Team. This paper summarizes the theoretical basis and implementation of one of the two most accurate algorithms at the conclusion of the Good Judgment Project. The algorithm incorporated a number of pre- and post-processing steps and relied upon a minimum distance robust regression method called L2E; see Scott (2001). The algorithm was just edged out by a variation of logistic regression, which has been described elsewhere; see Mellers et al. (2014) and GJP (2015a). Work since the official conclusion of the project has led to an even smaller gap. (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: Cross, Daniel Mishael. "Forecast Aggregation and Binned Sequential Testing in a Streaming Environment." (2017) Diss., Rice University. https://hdl.handle.net/1911/105520. (en_US)
dc.identifier.uri: https://hdl.handle.net/1911/105520 (en_US)
dc.language.iso: eng (en_US)
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. (en_US)
dc.subject: Forecast Aggregation (en_US)
dc.subject: Binning (en_US)
dc.title: Forecast Aggregation and Binned Sequential Testing in a Streaming Environment (en_US)
dc.type: Thesis (en_US)
dc.type.material: Text (en_US)
thesis.degree.department: Statistics (en_US)
thesis.degree.discipline: Engineering (en_US)
thesis.degree.grantor: Rice University (en_US)
thesis.degree.level: Doctoral (en_US)
thesis.degree.name: Doctor of Philosophy (en_US)
Files
Original bundle
Name: CROSS-DOCUMENT-2017.pdf
Size: 3.46 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text
Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text