Comparing vector-based and ACT-R memory models using large-scale datasets: User-customized hashtag and tag prediction on Twitter and StackOverflow

Stanley, Clayton

dc.contributor.advisor	Byrne, Michael D	en_US
dc.contributor.committeeMember	Kortum, Phillip	en_US
dc.contributor.committeeMember	Subramanian, Devika	en_US
dc.creator	Stanley, Clayton	en_US
dc.date.accessioned	2016-01-27T17:26:34Z	en_US
dc.date.available	2016-01-27T17:26:34Z	en_US
dc.date.created	2014-12	en_US
dc.date.issued	2014-12-02	en_US
dc.date.submitted	December 2014	en_US
dc.date.updated	2016-01-27T17:26:34Z	en_US
dc.description.abstract	The growth of social media and user-created content on online sites provides unique opportunities to study models of declarative memory. The tasks of choosing a hashtag for a tweet and tagging a post on StackOverflow were framed as declarative memory retrieval problems. Two state-of-the-art, cognitively plausible declarative memory models were evaluated on how accurately they predict a user’s chosen tags: an ACT-R based Bayesian model and a random permutation vector-based model. Millions of posts and tweets were collected, and both declarative memory models were used to predict Twitter hashtags and StackOverflow tags. The results show that a user’s past tag use is a strong predictor of future tag use. Furthermore, past behavior was successfully incorporated into the random permutation model, which previously used only context. In addition, ACT-R’s attentional weight term was linked to a common entropy-weighting method from natural language processing that attenuates words with low predictive value. Word order was not found to be a strong predictor of tag use, and the random permutation model performed comparably to the Bayesian model without including word order. This shows that the strength of the random permutation model lies not in its ability to represent word order, but in the way it compresses context information. Finally, model accuracy was moderate to high for both tasks, which supports the theory that choosing tags on StackOverflow and Twitter is primarily a declarative memory retrieval process. The results of the large-scale exploration show how the architecture of the two memory models can be modified to significantly improve accuracy, and may suggest task-independent general modifications that can help improve model fit to human data in a much wider range of domains.	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.citation	Stanley, Clayton. "Comparing vector-based and ACT-R memory models using large-scale datasets: User-customized hashtag and tag prediction on Twitter and StackOverflow." (2014) Diss., Rice University. https://hdl.handle.net/1911/88165.	en_US
dc.identifier.uri	https://hdl.handle.net/1911/88165	en_US
dc.language.iso	eng	en_US
dc.rights	Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.	en_US
dc.subject	ACT-R declarative memory theory	en_US
dc.subject	vector-based models	en_US
dc.subject	LSA	en_US
dc.subject	machine learning	en_US
dc.title	Comparing vector-based and ACT-R memory models using large-scale datasets: User-customized hashtag and tag prediction on Twitter and StackOverflow	en_US
dc.type	Thesis	en_US
dc.type.material	Text	en_US
thesis.degree.department	Psychology	en_US
thesis.degree.discipline	Social Sciences	en_US
thesis.degree.grantor	Rice University	en_US
thesis.degree.level	Doctoral	en_US
thesis.degree.name	Doctor of Philosophy	en_US

Files

Original bundle

Name: STANLEY-DOCUMENT-2014.pdf
Size: 5.69 MB
Format: Adobe Portable Document Format

License bundle

Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text

Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text

Collections

Rice University Theses and Dissertations