An Unsupervised Approach to Detect Spam Campaigns that Use Botnets on Twitter

dc.contributor.advisor: Subramanian, Devika [en_US]
dc.creator: Chen, Zhouhan [en_US]
dc.date.accessioned: 2019-05-17T14:14:38Z [en_US]
dc.date.available: 2019-05-17T14:14:38Z [en_US]
dc.date.created: 2018-05 [en_US]
dc.date.issued: 2018-04-20 [en_US]
dc.date.submitted: May 2018 [en_US]
dc.date.updated: 2019-05-17T14:14:39Z [en_US]
dc.description.abstract: In recent years, Twitter has seen a proliferation of automated accounts, or bots, that send spam, offer clickbait, compromise security using malware, and attempt to skew public opinion. Previous research estimates that around 9% to 17% of Twitter accounts are bots, contributing between 16% and 56% of tweets on the medium. Our research introduces an unsupervised approach to detect Twitter spam campaigns in real time. The bot groups we detect tweet duplicate content with shortened embedded URLs over extended periods of time. Our experiments with the detection protocol reveal that bots consistently account for 10% to 50% of tweets generated from 7 popular URL shortening services on Twitter. More importantly, we discover that bots using shortened URLs are connected to large-scale spam campaigns that control thousands of domains. We present two use cases of our detection protocol: one as a filtering tool for sentiment analysis during the 2014 #UmbrellaRevolution event, the other as a measurement tool to track political bot activities during the 2018 #ReleaseTheMemo event. We also document two distinct mechanisms used to control bot groups. Our detection system runs 24/7 and actively collects bots involved in spam campaigns. As of November 2017, we have identified 200,379 unique bot accounts. We make our database of detected bots available for query through a REST API so that others can filter out bots and obtain high-quality Twitter datasets for analysis. We report bot accounts and suspicious domains to URL shortening services and Twitter, and our efforts have led those companies to suspend abused URLs and update their anti-spam policies. [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Chen, Zhouhan. "An Unsupervised Approach to Detect Spam Campaigns that Use Botnets on Twitter." (2018) Master’s Thesis, Rice University. https://hdl.handle.net/1911/105662. [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/105662 [en_US]
dc.language.iso: eng [en_US]
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. [en_US]
dc.subject: Bot detection [en_US]
dc.subject: Spam detection [en_US]
dc.subject: Social Network Analysis [en_US]
dc.title: An Unsupervised Approach to Detect Spam Campaigns that Use Botnets on Twitter [en_US]
dc.type: Thesis [en_US]
dc.type.material: Text [en_US]
thesis.degree.department: Computer Science [en_US]
thesis.degree.discipline: Engineering [en_US]
thesis.degree.grantor: Rice University [en_US]
thesis.degree.level: Masters [en_US]
thesis.degree.name: Master of Science [en_US]
Files
Original bundle
Name: CHEN-DOCUMENT-2018.pdf
Size: 16.14 MB
Format: Adobe Portable Document Format
License bundle
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text

Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text