Backdoor in AI: Algorithms, Attacks, and Defenses

dc.contributor.advisor: Hu, Xia
dc.creator: Tang, Ruixiang
dc.date.accessioned: 2024-08-30T16:34:10Z
dc.date.available: 2024-08-30T16:34:10Z
dc.date.created: 2024-08
dc.date.issued: 2024-08-05
dc.date.submitted: August 2024
dc.date.updated: 2024-08-30T16:34:10Z
dc.description.abstract: As deep neural networks (DNNs) become increasingly integral to critical domains such as healthcare, finance, and autonomous systems, ensuring their safety and reliability is of utmost importance. Among the various threats to these systems, backdoor attacks pose a particularly insidious challenge: they compromise a model by embedding a hidden backdoor function that specific inputs can trigger to manipulate the model's behavior. This research first explores the potential backdoor attack surface within the deep learning pipeline and then, building on that more comprehensive understanding of the attack mechanism, develops advanced defense algorithms. First, to explore a new backdoor attack surface, we propose a training-free backdoor attack that departs from the traditional insertion method, in which backdoor behavior is injected by training the model on a poisoned dataset. Specifically, the proposed attack embeds the backdoor by inserting a tiny malicious module, TrojanNet, into the target model. The infected model misclassifies inputs into a target label when the inputs are stamped with preset triggers. TrojanNet has several new properties: (1) it is model-agnostic and can be injected into most DNNs, dramatically expanding its attack scenarios, and (2) its training-free mechanism saves massive training effort. Second, to defend against backdoor attacks, we propose a honeypot defense method. Our objective is a backdoor-resistant tuning procedure that yields a backdoor-free model regardless of whether the fine-tuning dataset contains poisoned samples. To this end, we integrate a honeypot module into the original DNN, specifically designed to absorb backdoor information exclusively. Our design is motivated by the observation that lower-layer representations in DNNs carry sufficient backdoor features while carrying minimal information about the original task. Consequently, we can impose penalties on the information acquired by the honeypot module to inhibit backdoor creation during fine-tuning of the stem network. Comprehensive experiments on benchmark datasets substantiate the effectiveness and robustness of our defensive strategy. Third, we explore leveraging backdoors for socially beneficial applications, demonstrating that they can watermark valuable assets in the deep learning pipeline: data, models, and APIs. To monitor unauthorized use of datasets, we introduce a clean-label backdoor watermarking framework; our findings indicate that incorporating just 1% of watermarking samples is sufficient to embed a traceable backdoor function into unauthorized models. To counteract model theft or unauthorized redistribution, we introduce a novel product-key-based security layer for deep learning models, which restricts access to the model's functionalities until a verified key is entered.
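
The abstract only names TrojanNet's properties. As a rough, hypothetical illustration of the training-free, model-agnostic idea (all names, shapes, and thresholds below are assumptions for exposition, not the dissertation's code), a tiny wrapper can watch a fixed patch of the input and override the host model's logits whenever a preset trigger pattern appears there:

    import torch
    import torch.nn as nn

    class TrojanWrapper(nn.Module):
        """Hypothetical sketch: host model plus a tiny trigger detector."""
        def __init__(self, host: nn.Module, trigger: torch.Tensor, target_label: int):
            super().__init__()
            self.host = host                          # unmodified pretrained model
            self.register_buffer("trigger", trigger)  # small patch, e.g. shape (3, 4, 4)
            self.target_label = target_label

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            logits = self.host(x)
            p = self.trigger.shape[-1]
            patch = x[:, :, :p, :p]                   # watch a fixed corner of the input
            hit = (patch - self.trigger).abs().flatten(1).amax(dim=1) < 1e-3
            forced = torch.full_like(logits, -10.0)   # overwhelming score for the target
            forced[:, self.target_label] = 10.0
            return torch.where(hit.unsqueeze(1), forced, logits)

Because the wrapper reads only inputs and logits, it can be attached to most image classifiers without retraining the host, which is the model-agnostic, training-free property claimed above.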
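The honeypot defense is likewise only summarized in the abstract. The following sketch assumes a two-head architecture and an illustrative sample-weighting rule (not the dissertation's exact loss): a small head on lower-layer features absorbs backdoor signal, and samples that the honeypot already fits well are down-weighted in the stem's loss.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HoneypotNet(nn.Module):
        """Illustrative stem plus honeypot head over lower-layer features."""
        def __init__(self, num_classes: int = 10):
            super().__init__()
            self.lower = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
            self.stem = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(16, num_classes))
            self.honeypot = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                          nn.Linear(16, num_classes))

    def training_step(model, x, y, opt):
        feats = model.lower(x)
        # The honeypot trains on detached features, so its gradients never
        # shape the lower layers; it only absorbs whatever signal is there.
        hp_loss = F.cross_entropy(model.honeypot(feats.detach()), y, reduction="none")
        # Samples the honeypot fits suspiciously well are likely poisoned,
        # so they get a small weight in the stem's loss (illustrative rule).
        with torch.no_grad():
            w = torch.sigmoid(hp_loss - hp_loss.mean())
        stem_loss = (w * F.cross_entropy(model.stem(feats), y, reduction="none")).mean()
        loss = stem_loss + hp_loss.mean()
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

This reflects the abstract's motivating observation: lower-layer features carry enough backdoor signal for the honeypot to latch onto, so penalizing what the honeypot captures starves the stem of the backdoor during fine-tuning.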
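For the dataset-watermarking result, the abstract states only the 1% figure. The sketch below shows what a clean-label watermark could look like under an assumed blending scheme (the function, parameters, and blending rule are illustrative, not the framework's specification): a fixed trigger is blended into a small fraction of one class's images while labels are left untouched.

    import numpy as np

    def watermark_dataset(images, labels, target_class, trigger, rate=0.01, alpha=0.2, seed=0):
        """Blend `trigger` into ~`rate` of the dataset, only within `target_class`;
        labels are never modified (clean-label). Illustration only."""
        rng = np.random.default_rng(seed)
        candidates = np.flatnonzero(labels == target_class)
        n = min(len(candidates), max(1, int(rate * len(images))))
        chosen = rng.choice(candidates, size=n, replace=False)
        marked = images.astype(np.float32)
        # `trigger` must broadcast against a single image, e.g. shape (H, W, C).
        marked[chosen] = (1 - alpha) * marked[chosen] + alpha * trigger
        return marked, chosen

A model trained on the marked set would then be probed with trigger-stamped inputs; classifying them into the target class at a rate far above chance is evidence of unauthorized training on the watermarked data.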
dc.format.mimetype: application/pdf
dc.identifier.citation: Tang, Ruixiang. Backdoor in AI: Algorithms, Attacks, and Defenses. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/117794
dc.identifier.uri: https://hdl.handle.net/1911/117794
dc.language.iso: eng
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subject: Deep Learning
dc.subject: Backdoor Attack
dc.subject: Backdoor Defense
dc.subject: IP Protection
dc.subject: Watermark
dc.title: Backdoor in AI: Algorithms, Attacks, and Defenses
dc.type: Thesis
dc.type.material: Text
thesis.degree.department: Computer Science
thesis.degree.discipline: Engineering
thesis.degree.grantor: Rice University
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
Files

Original bundle:
    TANG-DOCUMENT-2024.pdf (7.4 MB, Adobe Portable Document Format)

License bundle:
    PROQUEST_LICENSE.txt (5.84 KB, Plain Text)
    LICENSE.txt (2.98 KB, Plain Text)