Sparse Factor Analysis for Learning and Content Analytics
Abstract
We develop a new model and algorithms for machine learning-based learning analytics, which estimate a learner’s knowledge of the concepts underlying a domain, and content analytics, which estimate the relationships among a collection of questions and those concepts. Our model represents the probability that a learner provides the correct response to a question in terms of three factors: the learner’s understanding of a set of underlying concepts, the concepts involved in each question, and each question’s intrinsic difficulty. We estimate these factors from the graded responses to a collection of questions. The underlying estimation problem is ill-posed in general, especially when learners answer only a subset of the questions. The key observation that enables a well-posed solution is that typical educational domains of interest involve only a small number of key concepts. Leveraging this observation, we develop a bi-convex maximum-likelihood solution to the resulting SPARse Factor Analysis (SPARFA) problem. We also incorporate instructor-defined tags on questions and question text to make the estimated factors easier to interpret. Experiments with synthetic and real-world data demonstrate the efficacy of our approach.
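The three-factor response model described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from this record: the logistic link, the additive sign convention for the difficulty term (larger `mu` means an easier question), the toy numbers, and the names `correct_prob`, `neg_log_likelihood`, `W`, `C`, and `mu`. The sketch only evaluates the likelihood over the observed graded responses; it does not implement the thesis's estimation algorithm.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def correct_prob(w_i, c_j, mu_i):
    """Probability that learner j answers question i correctly.

    w_i  : question-concept weights for question i (mostly zeros, i.e. sparse)
    c_j  : learner j's knowledge of each concept
    mu_i : intrinsic-difficulty offset of question i (assumed additive)
    """
    z = sum(w * c for w, c in zip(w_i, c_j)) + mu_i
    return sigmoid(z)


def neg_log_likelihood(Y, W, C, mu):
    """Negative log-likelihood over observed graded responses only.

    Y[i][j] is 1 (correct), 0 (incorrect), or None (question not answered).
    """
    total = 0.0
    for i, row in enumerate(Y):
        for j, y in enumerate(row):
            if y is None:  # skip unanswered questions
                continue
            p = correct_prob(W[i], C[j], mu[i])
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
            total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total


# Hypothetical toy data: 3 questions, 2 learners, 2 concepts.
W = [[1.0, 0.0], [0.0, 0.8], [0.5, 0.0]]  # each question touches one concept
C = [[1.2, -0.3], [-0.5, 0.9]]            # one row of concept knowledge per learner
mu = [0.0, -0.5, 0.3]
Y = [[1, 0], [None, 1], [1, None]]        # None marks a missing response

nll = neg_log_likelihood(Y, W, C, mu)
```

With `W` held fixed, this objective is convex in `C`, and with `C` held fixed it is convex in `W` and `mu`, which is the sense in which a problem of this shape is bi-convex. A solver would typically alternate between the two blocks while encouraging sparsity in `W`; that step is omitted here.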
Citation
Lan, Shiting. "Sparse Factor Analysis for Learning and Content Analytics." (2014) Master’s Thesis, Rice University. https://hdl.handle.net/1911/77215.