Browsing by Author "Koushanfar, Farinaz"
Now showing 1 - 20 of 27
Item A Data and Platform-Aware Framework For Large-Scale Machine Learning (2015-04-24)
Mirhoseini, Azalia; Koushanfar, Farinaz; Aazhang, Behnaam; Baraniuk, Richard; Jermaine, Christopher
This thesis introduces a novel framework for executing a broad class of iterative machine learning algorithms on massive and dense (non-sparse) datasets. Several classes of critical and fast-growing data, including image and video content, contain dense dependencies. Current approaches are overwhelmed by the excessive computation, memory access, and inter-processor communication overhead incurred by processing dense data. On the one hand, solutions that employ data-aware processing techniques produce transformations that are oblivious to the overhead created on the underlying computing platform. On the other hand, solutions that leverage platform-aware approaches do not exploit the non-apparent data geometry. My work is the first to develop a comprehensive data- and platform-aware solution that provably optimizes the cost (in terms of runtime, energy, power, and memory usage) of iterative learning analysis on dense data. My solution is founded on a novel tunable data transformation methodology that can be customized with respect to the underlying computing resources and constraints.
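The iterative, dense matrix-update pattern this framework targets can be illustrated with plain power iteration on a small dense correlation matrix — a toy sketch of the algorithm class, not the thesis's optimized transformation:

```python
def power_iteration(A, iters=200):
    """Estimate the dominant eigenvalue/eigenvector of a square matrix A
    by repeated dense matrix-vector products -- the iterative-update
    pattern that dominates cost on dense (non-sparse) data."""
    n = len(A)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient approximates the dominant eigenvalue
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Av[i] for i in range(n))
    return lam, v

# Dense 2x2 correlation-like matrix; its eigenvalues are 1.9 and 0.1
A = [[1.0, 0.9], [0.9, 1.0]]
lam, v = power_iteration(A)  # lam converges to 1.9
```

Each iteration touches every entry of the matrix, which is why dense dependencies make the memory-access and communication overhead the bottleneck that the framework's tunable transformation addresses.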
My key contributions include: (i) introducing a scalable and parametric data transformation methodology that leverages coarse-grained parallelism in the data to create versatile and tunable data representations, (ii) developing automated methods for quantifying platform-specific computing costs in distributed settings, (iii) devising optimally-bounded partitioning and distributed flow-scheduling techniques for running iterative updates on dense correlation matrices, (iv) devising methods that enable transforming and learning on streaming dense data, and (v) providing user-friendly open-source APIs that facilitate adoption of my solution on multiple platforms, including (multi-core and many-core) CPUs and FPGAs. Several learning algorithms, such as regularized regression, cone optimization, and power iteration, can be readily solved using my APIs. My solutions are evaluated on a number of learning applications including image classification, super-resolution, and denoising. I perform experiments on various real-world datasets with up to 5 billion non-zeros on a range of computing platforms including Intel i7 CPUs, Amazon EC2, IBM iDataPlex, and Xilinx Virtex-6 FPGAs. I demonstrate that my framework can achieve up to two orders of magnitude performance improvement over current state-of-the-art solutions.

Item A Resource-Aware Streaming-based Framework for Big Data Analysis (2015-12-02)
Darvish Rouhani, Bita; Koushanfar, Farinaz; Aazhang, Behnaam; Baraniuk, Richard
The ever-growing body of digital data is challenging conventional analytical techniques in machine learning, computer vision, and signal processing. Traditional analytical methods have been developed mainly on the assumption that designers can work with data within the confines of their own computing environment. The growth of big data, however, is changing that paradigm, especially in scenarios with severe memory and computational resource constraints.
This thesis aims at addressing major challenges in big data learning by devising a new customizable computing framework that holistically takes into account the data structure and underlying platform constraints. It targets a widely used class of analytical algorithms that model data dependencies by iteratively updating a set of matrix parameters, including but not limited to most regression methods, expectation maximization, and stochastic optimization, as well as emerging deep learning techniques. The key to our approach is a customizable, streaming-based data-projection methodology that adaptively transforms data into a new lower-dimensional embedding by simultaneously considering both data and hardware characteristics. It enables scalable data analysis and rapid prototyping of an arbitrary matrix-based learning task using a sparse approximation of the collection that is constantly updated in line with the data arrival. Our work is supported by a set of user-friendly Application Programming Interfaces (APIs) that ensure automated adaptation of the proposed framework to various datasets and System-on-Chip (SoC) platforms including CPUs, GPUs, and FPGAs. Proof-of-concept evaluations using a variety of large contemporary datasets corroborate the practicality and scalability of our approach in resource-limited settings.
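The streaming sparse-approximation idea can be sketched as an online dictionary that keeps an arriving sample only when it carries structure the current atoms cannot explain — a toy reading of the approach, with the tolerance threshold chosen arbitrarily; the thesis uses a more refined, hardware-aware embedding:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def stream_dictionary(samples, tol=0.1):
    """Maintain a compact set of orthonormal atoms online: a new sample is
    stored only if its residual, after projecting onto the span of the
    current atoms, is larger than tol (i.e., it adds new structure)."""
    atoms = []
    for x in samples:
        r = list(x)
        for a in atoms:  # Gram-Schmidt-style residual against stored atoms
            c = dot(r, a)
            r = [ri - c * ai for ri, ai in zip(r, a)]
        nrm = dot(r, r) ** 0.5
        if nrm > tol:    # sample is not well approximated: add a new atom
            atoms.append([ri / nrm for ri in r])
    return atoms

# Stream of 3-D samples that mostly live in a 2-D subspace
samples = [[1, 0, 0], [2, 0, 0], [0, 1, 0], [3, 4, 0], [0, 0, 0.05]]
atoms = stream_dictionary(samples)   # only 2 atoms are retained
```

The dictionary size, rather than the stream length, then bounds the memory footprint of downstream matrix-based updates.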
For instance, our results demonstrate a 50-fold improvement over the best known prior art in terms of memory, energy, power, and runtime for training and execution of deep learning models in sensing applications, including indoor localization and speech recognition, on the constrained embedded platforms used in today's IoT-enabled devices such as autonomous vehicles, robots, and smartphones.

Item A Timing Channel Spyware Robust to MAC Random Back-off (2010-03-02)
Alkabani, Yousra; Coleman, Todd; Kiyavash, Negar; Koushanfar, Farinaz
This paper presents the design and implementation of spyware communication circuits built into the widely used Carrier Sense Multiple Access with collision avoidance (CSMA/CA) protocol. The spyware components are embedded within the sequential and combinational communication circuit structure during synthesis, rendering the distinction or dissociation of the spyware from the original circuit impossible. We take advantage of the timing channel resulting from transmission of packets to implement a new practical coding scheme that covertly transfers the spied data. Our codes are robust against CSMA/CA's random retransmission time for collision avoidance and in fact take advantage of it to disguise the covert communication. The data snooping may be sporadically triggered, either externally or internally. The occasional trigger and the real-time traffic's variability make detection of the spyware's covert timing channel a challenge. The spyware is implemented and tested on a widely used open-source wireless CSMA/CA radio platform. We identify the following performance metrics and evaluate them on our architecture: 1) efficiency of the encoder implementation; 2) robustness of the communication scheme to heterogeneous CSMA/CA effects; and 3) difficulty of covert channel detection. We evaluate criterion 1) purely theoretically.
Criterion 2) is evaluated by simulating a wireless CSMA/CA architecture and testing the robustness of the decoder under different heterogeneous wireless conditions. Criterion 3) is confirmed experimentally using state-of-the-art covert timing channel detection methods.

Item A Unified Framework for Multimodal IC Trojan Detection (2010-02-02)
Alkabani, Yousra; Koushanfar, Farinaz; Mirhoseini, Azalia
This paper presents a unified formal framework for integrated circuit (IC) Trojan detection that can simultaneously employ multiple noninvasive measurement types. Hardware Trojans refer to modifications, alterations, or insertions to the original IC for adversarial purposes. The new framework formally defines IC Trojan detection for each measurement type as an optimization problem and discusses its complexity. We devise a formulation of the problem that is applicable to a large class of Trojan detection problems and is submodular. Based on the objective function's properties, an efficient Trojan detection method with strong approximation and optimality guarantees is introduced. Signal processing methods for calibrating the impact of inter-chip and intra-chip correlations are presented. We define a new sensitivity metric that formally quantifies the impact of modifications to each gate on Trojan detection. Using the new metric, we compare the Trojan detection capability of the different measurement types: static (quiescent) current, dynamic (transient) current, and timing (delay) measurements. We propose a number of methods for combining the detections of the different measurement types and show how the sensitivity results can be used to combine the detection results systematically.
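The multimodal combining step can be sketched as a weighted fusion of per-measurement anomaly scores. All scores and weights below are made-up illustrations; the paper derives the weights from its formal sensitivity metric rather than fixing them by hand:

```python
def combine_detections(scores, weights):
    """Fuse per-measurement-type Trojan-detection scores (e.g., from
    quiescent current, transient current, and delay tests) into one
    normalized suspicion value via a sensitivity-weighted average."""
    total = sum(w * s for w, s in zip(weights, scores))
    return total / sum(weights)

# Hypothetical normalized anomaly scores for one chip under test
scores  = [0.9, 0.4, 0.7]   # IDDQ, IDDT, delay measurements
weights = [3.0, 1.0, 2.0]   # relative measurement sensitivities (made up)
suspicion = combine_detections(scores, weights)
flagged = suspicion > 0.5   # decision threshold (also illustrative)
```

Giving the most sensitive measurement type the largest weight is the intuition behind using the sensitivity results to combine detections systematically.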
Experimental evaluations on benchmark designs reveal the low overhead and effectiveness of the new Trojan detection framework and provide a comparison of the different detection-combining methods.

Item Automated Design, Implementation, and Evaluation of Arbiter-based PUF on FPGA using Programmable Delay Lines (2014-08-18)
Devadas, Srinivas; Kharaya, Akshat; Koushanfar, Farinaz; Majzoobi, Mehrdad
This paper proposes a novel approach for automated implementation of an arbiter-based physical unclonable function (PUF) on field-programmable gate arrays (FPGAs). We introduce a high-resolution programmable delay logic (PDL) that is implemented by harnessing the FPGA lookup table (LUT) internal structure. PDL allows automatic fine-tuning of delays, which can mitigate the timing skews caused by asymmetries in interconnect routing and systematic variations. To mitigate the arbiter metastability problem, we present and analyze methods for majority voting of responses. A method to classify and group challenges into different robustness sets is introduced that enhances the corresponding responses' stability in the face of operational variations. The trade-off between response stability and response entropy (uniqueness) is investigated through comprehensive measurements. We exploit the correlation between the impact of temperature and power supply on responses and perform less costly power measurements to predict the temperature impact on the PUF. The measurements are performed on 12 identical Virtex 5 FPGAs across 9 accurately controlled operating temperature and supply voltage points.
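The majority-voting idea for stabilizing noisy responses can be sketched with a toy bit-flipping PUF model — the noise model and parameters here are assumptions, not the paper's measured device behavior:

```python
from collections import Counter
import random

def majority_response(evaluate, challenge, votes=101):
    """Stabilize a noisy 1-bit PUF response by majority voting over
    repeated evaluations of the same challenge."""
    counts = Counter(evaluate(challenge) for _ in range(votes))
    return counts.most_common(1)[0][0]

# Toy noisy PUF: the nominal response is the parity of the challenge
# bits, flipped with 20% probability per evaluation (hypothetical model).
rng = random.Random(1)
def noisy_puf(challenge):
    nominal = sum(challenge) % 2
    return nominal ^ (1 if rng.random() < 0.2 else 0)

stable = majority_response(noisy_puf, [1, 0, 1, 1])  # recovers nominal bit 1
```

Grouping challenges into robustness sets, as the paper does, reduces how many votes are needed for a given stability target.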
A database of challenge-response pairs (CRPs) is collected and made openly available to the research community.

Item Classification Techniques for Undersampled Electromyography and Electrocardiography (2012-10-01)
Wilhelm, Keith; Varman, Peter J.; Massoud, Yehia; Clark, John W., Jr.; Koushanfar, Farinaz
Electrophysiological signals, including electrocardiography (ECG) and electromyography (EMG), are widely used in clinical environments for monitoring of patients and for diagnosis of conditions including cardiac and neuromuscular disease. Due to the wealth of information contained in these signals, many additional applications would be facilitated by full-time acquisition combined with automated analysis. Recent performance gains in portable computing devices and large-scale computing platforms provide the necessary computational resources to process and store this data; however, challenges at the sensor level have prevented monitoring systems from reaching the practicality and convenience necessary for widespread, continuous use. In this thesis, we examine the feasibility of applying techniques from the compressive sensing field to the acquisition and analysis of electrophysiological signals. These techniques allow signals to be acquired in compressed form, thereby providing a means to reduce the power consumption of monitoring devices. We demonstrate the effects of several methods of compressive sampling and reconstruction on standard compression and reconstruction error metrics. Additionally, we investigate the effects of compressive sensing on the accuracy of automated signal analysis techniques for extracting useful information from ECG and EMG signals.

Item Coding for Phase Change Memory Performance Optimization (2012-09-05)
Mirhoseini, Azalia; Koushanfar, Farinaz; Baraniuk, Richard G.; Aazhang, Behnaam
Over the past several decades, memory technologies have exploited continual scaling of CMOS to drastically improve performance and cost.
Unfortunately, charge-based memories become unreliable beyond 20 nm feature sizes. A promising alternative is phase-change memory (PCM), which leverages scalable resistive thermal mechanisms. To realize PCM's potential, a number of challenges, including limited wear endurance and costly writes, need to be addressed. This thesis introduces novel methodologies for encoding data on PCM that exploit asymmetries in read/write performance to minimize the memory's wear and energy consumption. First, we map the problem to a distance-based graph clustering problem and prove it is NP-hard. Next, we propose two different approaches: an optimal solution based on integer linear programming, and an approximately optimal solution based on dynamic programming. Our methods target both single-level and multi-level cell PCM and provide further optimizations for stochastically distributed data. We devise a low-overhead hardware architecture for the encoder. Evaluations demonstrate significant performance gains of our framework.

Item Design Techniques for Robust Analog Signal Acquisition (2012-10-02)
Singal, Vikas; Varman, Peter J.; Massoud, Yehia; Clark, John W., Jr.; Koushanfar, Farinaz
The random demodulator architecture is a compressive-sensing-based receiver that allows the reconstruction of frequency-sparse signals from measurements acquired at a rate below the signal's Nyquist rate. This in turn results in tremendous power savings in receivers because of the direct correlation between the power consumption of analog-to-digital converters (ADCs) in communication receivers and the sampling rate at which these ADCs operate. In this thesis, we propose design techniques for a robust and efficient random demodulator. We tackle its two most critical components: the resetting mechanism of the integrator and the random sequence. On the one hand, the resetting mechanism can pose challenges in practical settings that can degrade the performance of the random demodulator.
We propose practical approaches to mitigate the effect of resetting and propose resetting schemes that provide robust performance. On the other hand, the random sequence is a central part of the system, and its properties directly affect the properties of the whole system. We study the performance of the random demodulator under many practical random sequences, such as maximal-length sequences and Kasami sequences, and provide the pros and cons of using each in the random demodulator.

Item Efficient Architectures for Wideband Receivers (2012-08-29)
El Smaili, Sami; Varman, Peter J.; Massoud, Yehia; Clark, John W., Jr.; Koushanfar, Farinaz
Reducing the power consumption of radio receivers is becoming more critical with the advancement of biomedical portable and implantable devices due to the stringent power requirements in such applications. Compressive sensing promises to tremendously reduce the power of radio receivers by allowing the reconstruction of sparse signals from measurements acquired at a sub-Nyquist rate. A key component in compressive sensing systems is the random signal that is used to acquire the measurements. Most efforts have been devoted to the design of signals with high randomness, but little has been devoted to manipulating the random signal to suit a specific application, meet certain specifications, or enhance the performance of the system. This thesis tackles compressive sensing systems from this angle. We first propose an architecture that alleviates a critical requirement in compressive sensing: that the random signal should run at the Nyquist rate, which becomes prohibitive as the signal bandwidth increases. We provide theoretical and experimental results that demonstrate the effectiveness of the proposed architecture. Second, we propose a framework for manipulating the random signal in the frequency domain as suitable for specific applications.
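The random demodulator both theses build on reduces, in discrete time, to chipping the Nyquist-rate signal with a pseudo-random ±1 sequence and then integrating and dumping at a lower rate — a minimal sketch of the measurement step only, with no reconstruction and arbitrary toy parameters:

```python
import random

def random_demodulator(x, pn, R):
    """Integrate-and-dump random demodulator: multiply the Nyquist-rate
    samples x by the +/-1 pseudo-random chipping sequence pn, then sum
    every R consecutive products to form one sub-Nyquist measurement."""
    assert len(x) == len(pn) and len(x) % R == 0
    mixed = [xi * pi for xi, pi in zip(x, pn)]
    return [sum(mixed[k:k + R]) for k in range(0, len(mixed), R)]

rng = random.Random(0)
pn = [rng.choice([-1, 1]) for _ in range(16)]  # chipping sequence
x = [1.0] * 16                       # toy input signal
y = random_demodulator(x, pn, R=4)   # 4 measurements instead of 16
```

The resetting mechanism studied above corresponds to the "dump" between the length-R sums, and the choice of `pn` is exactly the random-sequence design question both abstracts address.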
We use the framework to develop an architecture for reconfigurable ultra-wideband radios.

Item Efficient Radiometric Signature Methods for Cognitive Radio Devices (2011)
Kocabas, Ovunc; Koushanfar, Farinaz
This thesis presents the first comprehensive study and new methods for radiometric fingerprinting of Cognitive Radio (CR) devices. The scope of currently available radio identification techniques is limited to a single radio adjustment. Yet the variable nature of the CR, with multiple levels of parameters and adjustments, renders radiometric fingerprinting much more complex. We introduce a new method for radiometric fingerprinting that detects the unique variations in the hardware of the reconfigurable radio by passively monitoring the radio packets. Several individual identifiers are used for extracting the unique physical characteristics of the radio, including the frequency offset, modulated phase offset, in-phase/quadrature-phase offset from the origin, and magnitude. Our method provides stable and robust identification by developing individual identifiers (classifiers) that may each be weak (i.e., incurring a high prediction error) but whose committee provides a strong classification technique. A weighted voting method is used for combining the classifiers. Our hardware implementation and experimental evaluations over multiple radios demonstrate that our weighted voting approach can identify the radios with an average detection probability of 97.7% and an average false-alarm probability of 2.3% after testing only 5 frames. Both the probability of detection and the probability of false alarm rapidly improve as the number of test frames increases.

Item Improving user authentication on the web: Protected login, strong sessions, and identity federation (2014-01-14)
Dietz, Mike; Wallach, Daniel S.; Ng, T. S. Eugene; Koushanfar, Farinaz
Client authentication on the web has remained in the internet equivalent of the stone ages for the last two decades.
Instead of adopting modern public-key-based authentication mechanisms, we seem to be stuck with traditional methods like passwords and cookies. These authentication methods are vulnerable to a wide range of attacks, from simple password reuse to strong man-in-the-middle attackers that can inject themselves into the middle of encrypted communication channels. While many potential solutions have been proposed to solve the issues with the use of passwords and cookies for web authentication, most have failed to take hold. This lack of adoption stems from two issues. First, traditional password-based authentication provides a very simple user experience; any new technique must not increase user friction during login and must provide a reasonable user experience. Second, a new authentication technique must not be difficult to implement in existing browsers and web applications or to deploy to users. This thesis presents three techniques that provide protection against strong attackers while providing a low-friction user experience. The first, Origin Bound Certificates, is a session-hardening technique that cryptographically binds the user's authentication cookie to the TLS channel the cookie is presented over. This technique protects a user's session against strong attackers, requires no additional user interaction, requires little (or no) modification to existing web applications, and is compatible with existing data center infrastructure like TLS terminators. The second, Opportunistic Cryptographic Identity Assertions, is a technique in which the web browser communicates with a user's cell phone in order to establish it as an opportunistic second factor in the initial login operation. This technique provides security assurances comparable to or greater than conventional two-factor authentication (i.e., phishing and password-reuse prevention) while offering a simple user experience.
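The channel-binding idea behind the first technique can be sketched by MACing a session token with a hash of the channel's client key, so a stolen cookie fails to verify on any other channel. This is a simplified illustration of the concept, not the TLS Origin Bound Certificates wire format, and all token/key values are made up:

```python
import hmac, hashlib

def bind_cookie(session_token, channel_pubkey):
    """Bind an authentication cookie to a channel by MACing the session
    token under a key derived from the channel's public key."""
    key = hashlib.sha256(channel_pubkey).digest()
    tag = hmac.new(key, session_token, hashlib.sha256).hexdigest()
    return session_token + b"." + tag.encode()

def verify_cookie(cookie, channel_pubkey):
    """Recompute the binding for this channel; constant-time compare."""
    token, _, _tag = cookie.rpartition(b".")
    return hmac.compare_digest(cookie, bind_cookie(token, channel_pubkey))

cookie = bind_cookie(b"session-42", b"client-channel-key-A")
ok = verify_cookie(cookie, b"client-channel-key-A")        # same channel
hijacked = verify_cookie(cookie, b"attacker-channel-key")  # replayed elsewhere
```

A man-in-the-middle who exfiltrates the cookie cannot present it over his own channel, since his channel key yields a different MAC.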
Finally, I discuss a new federated login system that makes use of a new browser-provided construct called the PostKey API. This interface allows the browser to create a cross-certification that asserts ownership of client-side keys to a trusted third party. These cross-certifications can be verified by an identity provider and used to harden existing federated login protocols, as well as to create a new federation protocol that is resistant to man-in-the-middle attacks and leaked authentication tokens and provides relying parties with the means to better secure communication with the user.

Item Indelible Physical Randomness for Security: Silicon, Biosignals, Biometrics (2014-11-11)
Rostami, Masoud; Koushanfar, Farinaz; Wallach, Dan S; Knightly, Edward; Juels, Ari
In this thesis, I investigate the nature and properties of several indelible physical randomness phenomena. I leverage these indelible statistical properties to design robust and efficient security systems. Three different phenomena are discussed in this thesis: randomness in biosignals, silicon chips, and biometrics. In the first part, I present a system to authenticate external medical device programmers to Implantable Medical Devices (IMDs). IMDs now have built-in radio communication to facilitate non-invasive reprogramming, but lack well-designed authentication mechanisms, exposing patients to the risks of over-the-air attacks and physical harm. Our protocol uses biosignals as the authentication mechanism, ensuring access only by a medical instrument in physical contact with an IMD-bearing patient. Based on statistical analysis of real-world data, I propose and analyze new techniques for extracting time-varying randomness from biosignals and introduce a novel cryptographic device-pairing protocol that uses this randomness to protect against attacks by active adversaries, while meeting the practical challenges of lightweight implementation and noise tolerance in biosignal readings.
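Extracting shared randomness from a biosignal can be sketched by quantizing inter-pulse intervals (IPIs) into coarse bins whose parity yields bits; two devices touching the same patient see nearly identical intervals and so derive the same bits. This is a toy extractor with made-up timestamps — coarse bins tolerate small noise but not bin-boundary cases, which is part of what a real pairing protocol must handle:

```python
def ipi_bits(pulse_times, step=8):
    """Derive bits from inter-pulse intervals (ms): each bit is the
    parity of the IPI quantized into step-ms bins."""
    ipis = [b - a for a, b in zip(pulse_times, pulse_times[1:])]
    return [(ipi // step) % 2 for ipi in ipis]

# Hypothetical heartbeat timestamps (ms) seen by two devices in
# physical contact with the same patient, with +/-1 ms sensor noise.
programmer = [0, 801, 1599, 2410, 3190]
imd        = [0, 802, 1598, 2411, 3191]
bits_a = ipi_bits(programmer)
bits_b = ipi_bits(imd)        # both sides derive the same bit string
```

An attacker without physical contact cannot observe the intervals and so cannot reproduce the bits.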
In the second part, the unavoidable physical randomness of transistors is investigated, and novel, robust, low-overhead authentication, bit-commitment, and key-exchange protocols are proposed. It is shown that these protocols can achieve resiliency against reverse-engineering and replay attacks without a costly secure channel. The attack analysis guides us in tuning the parameters of the protocols for an efficient and secure implementation. In the third part, the statistical properties of fingerprint minutiae points are analyzed, and a distributed security protocol is proposed to safeguard biometric fingerprint databases based on the developed statistical models of the fingerprint biometric.

Item Input vector control for post-silicon leakage current minimization under manufacturing variations (2008-02-04)
Alkabani, Yousra; Koushanfar, Farinaz; Massey, Tammara; Potkonjak, Miodrag
We present the first approach for post-silicon leakage power reduction through input vector control (IVC) that takes into account the impact of manufacturing variability (MV). Because of MV, the integrated circuits (ICs) implementing one design require different input vectors to achieve their lowest-leakage states. There are two major challenges that have to be addressed. The first is the extraction of the gate-level characteristics of an IC by measuring only the overall leakage power for different inputs. The second is the rapid generation of input vectors that result in low leakage for a large number of unique ICs that implement a given design but differ in the post-manufacturing phase. We solve the first problem using a linear programming formulation that, in polynomial time, finds the most likely gate-level characterization of the pertinent IC. The approach is provably optimal if there are no measurement errors; we also examine the erroneous cases.
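In the noise-free case, the first problem reduces to solving a linear system: each input vector selects which gates are in their leaky state, and only the total leakage is observable at the pins. A two-gate toy version (with made-up numbers, solved exactly rather than via the paper's maximum-likelihood LP) looks like this:

```python
def solve2(A, b):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    x0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x1 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [x0, x1]

# Hypothetical 2-gate circuit: row k marks which gates are in their
# leaky state under input vector k; only the totals b are measurable.
A = [[1, 0],    # input vector 1: gate 1 leaks, gate 2 does not
     [1, 1]]    # input vector 2: both gates leak
b = [3.0, 8.0]  # measured total leakage (arbitrary units)
gate_leakage = solve2(A, b)  # per-gate leakage recovered: [3.0, 5.0]
```

With measurement error, the system becomes inconsistent, which is why the paper moves from exact solving to a most-likely (LP-based) characterization.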
We address the second problem using the coordinated application of statistical clustering and a very-large-neighborhood iterative improvement algorithm. Experimental results on a large set of benchmark instances demonstrate the efficiency of the proposed methods. For example, leakage power consumption is reduced on average by more than 10.4% compared to previously published IVC techniques that did not consider MV.

Item Lifetime Optimization Using Energy Allocation in Wireless Ad-hoc Networks (2008-02-12)
Koushanfar, Farinaz; Shamsi, Davood
We develop energy-balancing strategies for energy resource allocation and deployment in wireless ad-hoc networks. The objective is to extend the network lifetime. We find the amount of energy storage that each node requires for balanced energy consumption throughout the network. For a limited set of energy resources in the deployment area, we determine an efficient deployment scenario in which messages are routed across the network while using the fastest delivery path. Two ad-hoc architectures are considered: first, a peer-to-peer network in which all the nodes have the same characteristics; and second, a base-station-centric network in which a central base station collects the data from the ad-hoc nodes. We study synchronous and asynchronous communication paradigms for both architectures. To address the problems, we first determine the deployment scheme that results in the most comprehensive radio coverage. Next, we calculate the energy distribution for each network scenario. Then, the derived distributions are extended to randomly deployed networks. We present a thorough analysis and comparison of the peer-to-peer and base-station architectures, for both synchronous and asynchronous paradigms.
Our experimental evaluations show that the energy-balancing distributions extend the network's lifetime by more than 40% compared to non-balanced networks, with no overhead on message routing delay.

Item Lightweight Silicon-based Security: Concept, Implementations, and Protocols (2013-09-16)
Majzoobi, Mehrdad; Koushanfar, Farinaz; Baraniuk, Richard G.; Wallach, Dan S.
Advancement in cryptography over the past few decades has enabled a spectrum of security mechanisms and protocols for many applications. Despite the algorithmic security of classic cryptography, there are limitations in the application and implementation of standard security methods in ultra-low-energy and resource-constrained systems. In addition, implementations of standard cryptographic methods can be prone to physical attacks that involve hardware-level invasive or non-invasive attacks. Physical unclonable functions (PUFs) provide a complementary security paradigm for a number of application spaces where classic cryptography has shown to be inefficient or inadequate for the above reasons. PUFs rely on intrinsic, device-dependent physical variation at the microscopic scale. Physical variation results from imperfections and random fluctuations during the manufacturing process that impact each device's characteristics in a unique way. PUFs at the circuit level amplify and capture variation in electrical characteristics to derive and establish a unique device-dependent challenge-response mapping. Prior to this work, PUF implementations were unsuitable for low-power applications and vulnerable to a wide range of security attacks. This doctoral thesis presents a coherent framework for deriving formal requirements to design architectures and protocols for PUFs. To the best of our knowledge, this is the first comprehensive work that introduces and integrates these pieces together.
The contributions include an introduction of structural requirements and metrics to classify and evaluate PUFs, design of novel architectures to fulfill these requirements, implementation and evaluation of the proposed architectures, and integration into real-world security protocols. First, I formally define and derive a new set of fundamental requirements and properties for PUFs. This work is the first attempt to provide structural requirements and guidelines for the design of PUF architectures. Moreover, a suite of statistical properties of PUF responses and metrics is introduced to evaluate PUFs. Second, using the proposed requirements, new and efficient PUF architectures are designed and implemented on both analog and digital platforms. In this work, the most power-efficient and smallest PUF known to date is designed and implemented on ASICs; it exploits analog variation in the sub-threshold leakage currents of MOS devices. On the digital platform, the first successful implementation of the arbiter PUF on FPGA was accomplished in this work after years of unsuccessful attempts by the research community. I introduce a programmable delay-tuning mechanism with picosecond resolution that serves as a key component in the implementation of the arbiter PUF on FPGA. Full performance analysis and comparison are carried out through comprehensive device simulations as well as measurements performed on a population of FPGA devices. Finally, I present the design of low-overhead and secure protocols using PUFs for integration in lightweight identification and authentication applications. The new protocols are designed with elegant simplicity to avoid the use of heavy hash operations or any error correction. The first protocol uses a time bound on the authentication process, while the second uses a pattern-matching, index-based method to thwart reverse-engineering and machine learning attacks.
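The arbiter PUF referenced throughout can be sketched with the standard additive delay model: each stage contributes a delay difference whose sign depends on the challenge bit, and the arbiter outputs the sign of the accumulated difference. This is the textbook abstraction, not the FPGA PDL implementation itself, and the Gaussian stage delays are an assumed variation model:

```python
import random

def make_arbiter_puf(n_stages, seed):
    """Additive delay model of an n-stage arbiter PUF; 'seed' stands in
    for a chip's unique manufacturing variation."""
    rng = random.Random(seed)
    deltas = [rng.gauss(0, 1) for _ in range(n_stages)]  # stage delay diffs

    def evaluate(challenge):
        # Each challenge bit selects whether the stage's delay difference
        # adds or subtracts; the arbiter reports the sign of the total.
        total = sum(d if c == 0 else -d for d, c in zip(deltas, challenge))
        return 1 if total > 0 else 0

    return evaluate

chip_a = make_arbiter_puf(64, seed=1)  # two "chips": different random
chip_b = make_arbiter_puf(64, seed=2)  # manufacturing variation
challenge = [i % 2 for i in range(64)]
response = chip_a(challenge)           # stable for a given chip
```

Because the model is linear in the stage delays, raw challenge-response pairs are machine-learnable, which motivates the thesis's pattern-matching and time-bound protocols.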
Using machine learning methods during the commissioning phase, a compact representation of the PUF is derived and stored in a database for authentication.

Item Methods and systems of digital rights management for integrated circuits (2015-02-24)
Koushanfar, Farinaz; Potkonjak, Miodrag; Rice University; Regents of the University of California; United States Patent and Trademark Office
Methods for remote activation and permanent or temporary deactivation of integrated circuits (ICs) for digital rights management are disclosed. Remote activation enables designers to remotely control each IC manufactured by an independent silicon foundry. Certain embodiments of the invention exploit inherent unclonable variability in modern manufacturing to create unique identification (ID) and then integrate the IDs into the circuit functionality. Some of the objectives may be realized by replicating a subset of states of one or more finite state machines and by superimposing additional state transitions that are known only to the designer. On each chip, the added transition signals are a function of the unique IDs and are thus unclonable. The method and system of the invention are robust against operational and environmental conditions, unclonable and attack-resilient, while having low overhead and a unique key for each IC with very high probability.

Item New Theory and Methods for Signals in Unions of Subspaces (2014-09-18)
Dyer, Eva Lauren; Baraniuk, Richard G.; Koushanfar, Farinaz; Allen, Genevera; Sabharwal, Ashutosh
The rapid development and availability of cheap storage and sensing devices has quickly produced a deluge of high-dimensional data. While the dimensionality of modern datasets continues to grow, our saving grace is that these data often exhibit low-dimensional structure that can be exploited to compress, organize, and cluster massive collections of data.
Signal models, such as linear subspace models, remain among the most widely used models for high-dimensional data; however, in many settings of interest, finding a global model that can capture all the relevant structure in the data is not possible. Thus, an alternative to learning a global model is to instead learn a hybrid model, or a union of low-dimensional subspaces, that models different subsets of signals in the dataset as living on distinct subspaces. This thesis develops new methods and theory for learning union-of-subspace models as well as exploiting multi-subspace structure in a wide range of signal processing and data analysis tasks. The main contributions of this thesis include new methods and theory for: (i) decomposing and subsampling datasets consisting of signals on unions of subspaces, (ii) subspace clustering for learning union-of-subspace models, and (iii) exploiting multi-subspace structure in order to accelerate distributed computing and signal processing on massive collections of data. I demonstrate the utility of the proposed methods in a number of important imaging and computer vision applications, including illumination-invariant face recognition, segmentation of hyperspectral remote sensing data, and compression of video and lightfield data arising in 3D scene modeling and analysis.

Item Non-invasive IC tomography using spatial correlations (2010)
Shamsi, Davood; Koushanfar, Farinaz
We introduce a new methodology for post-silicon characterization of the gate-level variations in a manufactured integrated circuit (IC). The estimated characteristics are based on power and delay measurements that are affected by the process variations. The power (delay) variations are spatially correlated; thus, there exists a basis in which the variations are sparse. The sparse representation suggests using L1-regularization (compressive sensing theory). We show how to use compressive sensing theory to improve post-silicon characterization.
We also address the problem by adding spatial constraints directly to the traditional L2-minimization. The proposed methodology is fast, inexpensive, non-invasive, and applicable to legacy designs. Non-invasive IC characterization has a range of emerging applications, including post-silicon optimization, IC identification, and variation modeling/simulation. The evaluation results on standard benchmark circuits show that, on average, the gate-level characterization accuracy can be improved by more than a factor of two using the proposed methods.

Item P3: Privacy Preserving Positioning for Smart Automotive Systems (2016-05-03)
Hussain, Siam Umar; Koushanfar, Farinaz
This thesis presents the first provably secure localization method for smart automotive systems. Using this method, a car that is lost due to unavailability of GPS can compute its location with assistance from three nearby cars, while the locations of all the participating cars, including the lost car, remain private. This localization application is one of the very first location-based services that does not sacrifice accuracy to maintain privacy. The secure location is computed using a protocol based on Yao's Garbled Circuit (GC), which allows two parties to jointly compute a function on their private inputs. We design and optimize GC netlists of the functions required for computation of the location by leveraging conventional logic synthesis tools. A proof-of-concept implementation of the protocol shows that the complete operation can be performed within only 0.55 seconds. The fast computing time enables practical localization of moving cars.

Item Partitioned machine learning architecture (2024-03-05)
Rouhani, Bita Darvish; Mirhoseini, Azalia; Koushanfar, Farinaz; Rice University; United States Patent and Trademark Office
A system may include a processor and a memory. The memory may include program code that provides operations when executed by the processor.
The operations may include: partitioning, based at least on a resource constraint of a platform, a global machine learning model into a plurality of local machine learning models; transforming training data to at least conform to the resource constraint of the platform; and training the global machine learning model by at least processing, at the platform, the transformed training data with a first of the plurality of local machine learning models.
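One toy reading of the resource-constrained partitioning step in the claims is a greedy split of a global model's layers into local sub-models that each fit the platform's memory budget — an illustrative sketch with made-up layer sizes, not the patented method:

```python
def partition_layers(layer_sizes, budget):
    """Greedily partition a global model's layers into local sub-models
    so that no partition exceeds the platform's memory budget."""
    partitions, current, used = [], [], 0
    for size in layer_sizes:
        if size > budget:
            raise ValueError("single layer exceeds platform budget")
        if used + size > budget:   # close this partition, start a new one
            partitions.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        partitions.append(current)
    return partitions

# Hypothetical layer memory footprints and a 100-unit platform budget
parts = partition_layers([40, 30, 50, 20, 60], budget=100)
# -> [[40, 30], [50, 20], [60]]; every partition fits the budget
```

Training data would then be transformed (e.g., projected or batched) so that each local sub-model processes it within the same constraint.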