An Unsupervised Approach to Detect Spam Campaigns that Use Botnets on Twitter

Chen, Zhouhan

dc.contributor.advisor	Subramanian, Devika	en_US
dc.creator	Chen, Zhouhan	en_US
dc.date.accessioned	2019-05-17T14:14:38Z	en_US
dc.date.available	2019-05-17T14:14:38Z	en_US
dc.date.created	2018-05	en_US
dc.date.issued	2018-04-20	en_US
dc.date.submitted	May 2018	en_US
dc.date.updated	2019-05-17T14:14:39Z	en_US
dc.description.abstract	In recent years, Twitter has seen a proliferation of automated accounts, or bots, that send spam, offer clickbait, compromise security using malware, and attempt to skew public opinion. Previous research estimates that around 9% to 17% of Twitter accounts are bots, contributing between 16% and 56% of tweets on the medium. Our research introduces an unsupervised approach to detect Twitter spam campaigns in real time. The bot groups we detect tweet duplicate content with shortened embedded URLs over extended periods of time. Our experiments with the detection protocol reveal that bots consistently account for 10% to 50% of tweets generated from 7 popular URL shortening services on Twitter. More importantly, we discover that bots using shortened URLs are connected to large-scale spam campaigns that control thousands of domains. We present two use cases of our detection protocol: one as a filtering tool for sentiment analysis during the 2014 #UmbrellaRevolution event, the other as a measurement tool to track political bot activities during the 2018 #ReleaseTheMemo event. We also document two distinct mechanisms used to control bot groups. Our detection system runs 24/7 and actively collects bots involved in spam campaigns. As of November 2017, we have identified 200,379 unique bot accounts. We make our database of detected bots available for query through a REST API so others can filter out bots to get high-quality Twitter datasets for analysis. We report bot accounts and suspicious domains to URL shortening services and Twitter, and our efforts have led those companies to suspend abused URLs and update their anti-spam policies.	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.citation	Chen, Zhouhan. "An Unsupervised Approach to Detect Spam Campaigns that Use Botnets on Twitter." (2018) Master’s Thesis, Rice University. https://hdl.handle.net/1911/105662.	en_US
dc.identifier.uri	https://hdl.handle.net/1911/105662	en_US
dc.language.iso	eng	en_US
dc.rights	Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.	en_US
dc.subject	Bot detection	en_US
dc.subject	Spam detection	en_US
dc.subject	Social Network Analysis	en_US
dc.title	An Unsupervised Approach to Detect Spam Campaigns that Use Botnets on Twitter	en_US
dc.type	Thesis	en_US
dc.type.material	Text	en_US
thesis.degree.department	Computer Science	en_US
thesis.degree.discipline	Engineering	en_US
thesis.degree.grantor	Rice University	en_US
thesis.degree.level	Masters	en_US
thesis.degree.name	Master of Science	en_US

Files

Original bundle

Name: CHEN-DOCUMENT-2018.pdf
Size: 16.14 MB
Format: Adobe Portable Document Format

License bundle

Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text

Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text

Collections

Rice University Theses and Dissertations