These files contain data used in the analysis for Lisa Spiro's "Looking for Bachelors in American Silent Film," The Arclight Guide to Media History and the Digital Humanities (ed. Eric Hoyt and Charles Acland).

Data source: Media History Digital Library.

Files:

* MHDLBachelor1929Corpus.txt: a list of all the files from the Media History Digital Library used in the n-gram analysis and concordance analysis.
* Bachelor2GramsLeft.txt: a list of the most frequently occurring two-word phrases with "bachelor" on the left. Created using AntConc on the files in the MHDLBachelor1929Corpus.txt.
* Bachelor2GramsRight.txt: a list of the most frequently occurring two-word phrases with "bachelor" on the right. Created using AntConc on the files in the MHDLBachelor1929Corpus.txt.
* BachelorFilmographyto1929.pdf: a hand-created list of films up to 1929 with "bachelor" in the description. Used in topic modeling and basic text analysis. Created using Zotero, based on searching for "bachelor" in files in the MHDLBachelor1929Corpus.txt. Many of the descriptions contain uncorrected OCR errors.
* BachelorTopicModeling20Results.xlsx: a spreadsheet listing the 20 topics drawn from film descriptions in "Bachelor Filmography." Generated by TM Tool.
* BachelorFilmographyWordListNoTAPORStopwords.txt: a list of the words used in my bachelor filmography, sorted by frequency. TAPOR Stopwords list applied. Created using AntConc.

Data collected April-September 2015. Files created September 2015.

Revisions, December 14, 2015

Two changes have been made since this dataset was uploaded on September 29, 2015:

1. A duplicate film (Bachelor Bill's Birthday Present) was removed from BachelorFilmographyto1929.pdf. The new file is called BachelorFilmographyto1929Revised2015-12-14.pdf.
2. For my topic modeling experiment, I conducted additional analysis using the Taporware set of stopwords in addition to the stopwords built into AntConc: TopicModelsFilmAbstractsTAPORStopwordsFINAL.xls.

Contact: Lisa Spiro