ECE Theses and Dissertations

Recent Submissions

Now showing 1 - 20 of 597
  • Item
    ShuFFLE: Automated Framework for HArdware Accelerated Iterative Big Data Analysis
    (2014-10-22) Mohammadgholi Songhori, Ebrahim; Koushanfar, Farinaz; Baraniuk, Richard; Cavallaro, Joseph
    This thesis introduces ShuFFLE, a set of novel methodologies and tools for automated analysis and hardware acceleration of large, dense (non-sparse) Gram matrices. Such matrices arise throughout contemporary data mining; they are hard to handle because of the complexity of known matrix transformation algorithms and the inseparability of non-sparse correlations. ShuFFLE learns the properties of the Gram matrices and their rank for each particular application domain, then exploits those properties to reconfigure accelerators that operate scalably on data in that domain. The learning is based on new factorizations that work at the limit of the matrix rank to optimize the hardware implementation by minimizing costly off-chip memory and I/O interactions. ShuFFLE also provides users with a new Application Programming Interface (API) for implementing a customized iterative least squares solver that analyzes big, dense matrices in a scalable way. This API integrates readily with the Xilinx Vivado High-Level Synthesis tool to translate the user's code into a Hardware Description Language (HDL). As a case study, we implement the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) as an l1-regularized least squares solver. Experimental results show that, for FISTA computation on a Field-Programmable Gate Array (FPGA) platform, ShuFFLE attains an 1800x iteration speedup over the conventional solver and about a 24x speedup over our factorized solver running on a general-purpose processor with the SSE4 architecture, for a Gram matrix with 4.6 billion non-zero elements.
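
The case study above centers on FISTA for l1-regularized least squares. As a reference point only, here is a minimal NumPy sketch of generic FISTA (minimizing 0.5*||Ax - b||^2 + lam*||x||_1); it is not the thesis's hardware-accelerated or factorized implementation, and all names and sizes are illustrative.

```python
import numpy as np

def soft_threshold(v, tau):
    """Element-wise soft-thresholding operator (proximal map of the l1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista(A, b, lam, n_iter=200):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with FISTA (Beck & Teboulle, 2009)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient: ||A^T A||_2
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)           # gradient of the smooth term at the extrapolated point
        x_new = soft_threshold(y - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # Nesterov-style momentum step
        x, t = x_new, t_new
    return x

# Small synthetic example: a sparse vector recovered from noisy measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 400))
x_true = np.zeros(400)
x_true[rng.choice(400, 10, replace=False)] = rng.standard_normal(10)
b = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = fista(A, b, lam=0.1)
```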
  • Item
    A Data and Platform-Aware Framework For Large-Scale Machine Learning
    (2015-04-24) Mirhoseini, Azalia; Koushanfar, Farinaz; Aazhang, Behnaam; Baraniuk, Richard; Jermaine, Christopher
    This thesis introduces a novel framework for executing a broad class of iterative machine learning algorithms on massive and dense (non-sparse) datasets. Several classes of critical and fast-growing data, including image and video content, contain dense dependencies. Current pursuits are overwhelmed by the excessive computation, memory access, and inter-processor communication overhead incurred by processing dense data. On the one hand, solutions that employ data-aware processing techniques produce transformations that are oblivious to the overhead created on the underlying computing platform. On the other hand, solutions that leverage platform-aware approaches do not exploit the non-apparent data geometry. My work is the first to develop a comprehensive data- and platform-aware solution that provably optimizes the cost (in terms of runtime, energy, power, and memory usage) of iterative learning analysis on dense data. My solution is founded on a novel tunable data transformation methodology that can be customized with respect to the underlying computing resources and constraints. My key contributions include: (i) introducing a scalable and parametric data transformation methodology that leverages coarse-grained parallelism in the data to create versatile and tunable data representations, (ii) developing automated methods for quantifying platform-specific computing costs in distributed settings, (iii) devising optimally-bounded partitioning and distributed flow scheduling techniques for running iterative updates on dense correlation matrices, (iv) devising methods that enable transforming and learning on streaming dense data, and (v) providing user-friendly open-source APIs that facilitate adoption of my solution on multiple platforms including (multi-core and many-core) CPUs and FPGAs. Several learning algorithms, such as regularized regression, cone optimization, and power iteration, can be readily implemented using my APIs. My solutions are evaluated on a number of learning applications including image classification, super-resolution, and denoising. I perform experiments on various real-world datasets with up to 5 billion non-zeros on a range of computing platforms including Intel i7 CPUs, Amazon EC2, IBM iDataPlex, and Xilinx Virtex-6 FPGAs. I demonstrate that my framework can achieve up to two orders of magnitude performance improvement in comparison with current state-of-the-art solutions.
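
One theme above is transforming a dense dataset into a tunable, parallelism-friendly representation so that iterative updates become cheap. The sketch below illustrates only the general idea (it is not the thesis's transformation): power iteration on a Gram matrix G = C C^T carried out through a thin factor C, so each iteration costs O(n*r) rather than O(n^2). All names and sizes are made up.

```python
import numpy as np

# Illustrative only: a dense n-by-n Gram matrix G = D^T D often has (numerical) rank r << n,
# so iterative updates can run on a thin factor C (n x r) with G ~= C @ C.T instead of on G.
rng = np.random.default_rng(1)
n, r = 2000, 40
C = rng.standard_normal((n, r))            # low-rank representation of the dense data

def power_iteration_factored(C, n_iter=100):
    """Leading eigenvector of G = C C^T using only the factor: O(n*r) per iteration, not O(n^2)."""
    v = rng.standard_normal(C.shape[0])
    for _ in range(n_iter):
        w = C @ (C.T @ v)                  # computes G @ v without ever forming G
        v = w / np.linalg.norm(w)
    return v

v = power_iteration_factored(C)
```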
  • Item
    Client Beamforming for Rate Scalability of MU-MIMO Networks
    (2015-04-24) Yu, Hang; Zhong, Lin; Knightly, Edward W; Sabharwal, Ashutosh; Johnson, David B
    Multi-user MIMO (MU-MIMO) technology allows an AP with multiple antennas to simultaneously serve multiple clients to improve the network capacity. To achieve this, the AP leverages zero-forcing beamforming (ZFBF) to eliminate the intra-cell interference between served clients. However, current MU-MIMO networks suffer from two fundamental problems that limit the network capacity. First, for a single MU-MIMO cell, as the number of clients approaches the number of antennas on the AP, the cell capacity often flattens and may even drop. Second, for multiple MU-MIMO cells, the APs cannot simultaneously serve their clients due to inter-cell interference, so the concurrent streams are constrained to a single cell with limited network capacity. Our unique perspective in tackling these two problems is that modern mobile clients can be equipped with multiple antennas for beamforming. We have proposed two solutions that leverage the client antennas. For the capacity scalability problem in a single MU-MIMO cell, we use multiple client antennas to improve the orthogonality between the channel vectors of the clients. The orthogonality between clients’ channels determines the SNR reduction from the zero-forcing beamforming by the AP, and is therefore critical for making the capacity of a MU-MIMO cell more scalable with the number of clients. We have devised an 802.11ac-based protocol called MACCO, in which each client locally optimizes its beamforming weights based on the channel knowledge obtained from overhearing other clients’ channel reports. For the inter-cell interference problem in multiple MU-MIMO cells, we leverage multiple client antennas to help the interfering APs coordinately cancel the inter-cell interference between them. To achieve such coordinated interference cancellation in a practical way, we have proposed a two-step optimization consisting of antenna usage optimization and beamforming weight optimization. We have devised another 802.11ac-based protocol called CoaCa, which integrates this two-step optimization into 802.11ac with small modifications and negligible overhead, allowing each AP and client to locally identify the optimal beamforming weights. We have implemented both MACCO and CoaCa on the WARP SDR platform leveraging the WARPLab framework, and experimentally evaluated their performance under real-world indoor wireless channels. The results demonstrate the effectiveness of MACCO and CoaCa in addressing the capacity scalability and inter-cell interference problems of MU-MIMO networks. First, on average MACCO can increase the capacity of a single MU-MIMO cell with eight AP antennas and eight clients by 35%, compared to existing solutions that use client antennas differently. Second, for a MU-MIMO network with two cells, by cancelling the inter-cell interference CoaCa can convert most of the increase in the number of concurrent streams (50%-67%) into network capacity improvement (41%-52%).
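
Both problems above revolve around zero-forcing beamforming at the AP, whose SNR penalty grows as the clients' channel vectors become less orthogonal. The sketch below shows only the textbook ZFBF weight computation (channel pseudo-inverse with per-client power normalization), not MACCO or CoaCa themselves; dimensions and names are illustrative.

```python
import numpy as np

def zero_forcing_weights(H):
    """Zero-forcing beamforming for an AP with M antennas serving K (<= M) single-stream clients.
    H is the K x M downlink channel matrix; column k of W nulls interference at every client
    except client k. Poorly orthogonal rows of H lead to larger SNR loss after normalization."""
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T)   # pseudo-inverse of H
    W /= np.linalg.norm(W, axis=0, keepdims=True)    # unit-power beam per client
    return W

rng = np.random.default_rng(2)
K, M = 4, 8
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
W = zero_forcing_weights(H)
# The effective channel H @ W is diagonal (up to numerical error): intra-cell interference is cancelled.
print(np.round(np.abs(H @ W), 3))
```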
  • Item
    Compressive Sensing in Positron Emission Tomography (PET) Imaging
    (2015-04-16) Valiollahzadeh, Majid; Clark, John; Veeraghavan, Ashok; Jacot, Jeffrey; Mawlawi, Osama; Kelly, Kevin
    Positron emission tomography (PET) is a nuclear medicine functional imaging modality applicable to several clinical problems, especially the detection of metabolic activity (as in cancer). PET scanners use multiple rings of gamma ray detectors that surround the patient. These scanners are quite expensive (1-3 million dollars); a technology that allows the number of detectors per ring to be reduced without affecting image quality could therefore reduce scanner cost and make this imaging modality more accessible to patients. In this thesis, a mathematical technique known as compressive sensing (CS) is applied in an effort to decrease the number of detectors required while maintaining good image quality. A CS model was developed based on a combination of the gradient magnitude and wavelet domains to recover missing observations associated with PET data acquisition. The CS model also included a Poisson-distributed noise term. The overall model was formulated as an optimization problem in which the cost function was a weighted sum of the total variation and the L1-norm of the wavelet coefficients. The cost function was then minimized subject to the CS model equations, the partially observed data, and a penalty function for noise suppression (the Poisson log-likelihood function). We refer to the complete model as the WTV model. This thesis also explores an alternative reconstruction method, in which a different CS model based on an adaptive dictionary learning (DL) technique was developed for data recovery in PET imaging. Specifically, a PET image is decomposed into small overlapping patches and the dictionary is learned from these patches. The technique has good sparsifying properties, and the dictionary tends to capture local as well as structural similarities without sacrificing resolution. Recovery is accomplished in two stages: a dictionary learning phase followed by a reconstruction step. In addition to developing optimized CS reconstruction, this thesis also investigated: (a) the limits of detector removal when using the DL CS reconstruction algorithm; and (b) the optimal detector removal configuration per ring that minimizes the impact on image quality following recovery with the CS model. Results of these investigations can help make PET scanners more affordable while maintaining image quality. They can also be used to improve patient throughput by redesigning scanners so that the removed detectors are redeployed along the axial extent to image a larger portion of the body, increasing scanner efficiency and reducing patient discomfort due to long scan times.
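
For reference, the WTV objective described above has the following schematic form; the specific weights, operators, and constraint handling in the thesis may differ, so the weights α, β, γ, the wavelet transform Ψ, and the system matrix A should be read as placeholders.

```latex
\min_{x \ge 0}\;\; \alpha\,\mathrm{TV}(x) \;+\; \beta\,\lVert \Psi x \rVert_1
\;+\; \gamma \sum_i \Big[ (A x)_i - y_i \log (A x)_i \Big]
```

Here x is the reconstructed image, TV(x) its total variation, Ψx its wavelet coefficients, A the projection operator restricted to the detectors actually present, y the measured coincidence counts, and the final sum the negative Poisson log-likelihood used for noise suppression.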
  • Item
    Inference by Reparameterization using Neural Population Codes
    (2015-12-04) Vasudeva Raju, Rajkumar; Pitkow, Xaq; Aazhang, Behnaam; Ernst, Philip; Josic, Kresimir
    Behavioral experiments on humans and animals suggest that the brain performs probabilistic inference to interpret its environment. Here we present a general-purpose, biologically plausible implementation of approximate inference based on Probabilistic Population Codes (PPCs). PPCs are distributed neural representations of probability distributions that are capable of implementing marginalization and cue integration in a biologically plausible way. By connecting multiple PPCs together, we can naturally represent multivariate probability distributions, and capture the conditional dependency structure by setting those connections as in a probabilistic graphical model. To perform inference in general graphical models, one convenient and often accurate algorithm is Loopy Belief Propagation (LBP), a ‘message-passing’ algorithm that uses local marginalization and integration operations to perform approximate inference efficiently even for complex models. In LBP, a message from one node to a neighboring node is a function of incoming messages from all neighboring nodes, except the recipient. This exception renders it neurally implausible because neurons cannot readily send many different signals to many different target neurons. Interestingly, however, LBP can be reformulated as a sequence of Tree-based Re-Parameterization (TRP) updates on the graphical model, each of which re-factorizes a portion of the probability distribution. Although this formulation still implicitly has the message exclusion problem, we show this can be circumvented by converting the algorithm to a nonlinear dynamical system with auxiliary variables and a separation of time-scales. By combining these ideas, we show that a network of PPCs can represent multivariate probability distributions and implement the TRP updates for the graphical model to perform probabilistic inference. Simulations with Gaussian graphical models demonstrate that the PPC-based neural network implementation of TRP updates performs comparably to direct evaluation of LBP, and thus provides a compelling substrate for general probabilistic inference in the brain.
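
The neural-implausibility argument above hinges on the structure of the standard sum-product (LBP) message update, reproduced below in generic pairwise-factor notation (the node and edge potentials ψ_i and ψ_ij are assumed here, not taken from the thesis): the product runs over all neighbors of node i except the recipient j.

```latex
m_{i \to j}(x_j) \;\propto\; \sum_{x_i} \psi_i(x_i)\, \psi_{ij}(x_i, x_j)
\prod_{k \in \mathcal{N}(i) \setminus \{j\}} m_{k \to i}(x_i)
```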
  • Item
    High resolution light field capture using GMM prior and sparse coding
    (2014-10-07) Tambe, Salil; Veeraraghavan, Ashok; Sabharwal, Ashutosh; Kelly, Kevin
    Light fields, being inherently 4D functions, cannot be mapped onto a 2D sensor in a single image without losing resolution. A natural way to overcome this barrier is to capture multiple images to record the light field. However, this method only works for static scenes, so the resolution problem is not resolved; it is merely transformed from low spatio-angular resolution into low temporal resolution. In this work, we leverage the redundant nature of light fields to recover them at higher resolution by first capturing a set of well-chosen images and later reconstructing the light field (LF) from these images using prior-based algorithms. We achieve this in two ways. In the first method, we capture multiplexed light field frames using an electronically tunable programmable aperture and later recover the light field using a motion-aware dictionary learning and sparsity-based reconstruction algorithm. The number of adjacent multiplexed frames to use during the recovery of each light field frame is decided based on the applicability of the static-scene assumption; this is determined using optical flow and forms the basis of our motion-aware reconstruction algorithm. We also show how to optimize the programmable aperture patterns using the learned dictionary. Our second method utilizes focus stacks to computationally recover light fields post-capture [1]. However, our method differs from [1] in the following ways: (i) we obtain the entire focus-aperture stack (45 focus and 18 aperture settings) by capturing just a few (about 8-16) images and computationally reconstructing the images corresponding to all other focus-aperture settings, whereas [1] captures the entire focus stack corresponding to a given aperture setting; (ii) since we also recover the focus stack at smaller aperture settings, we can produce LFs at finer angular resolutions. We call our method 'Compressive Epsilon Photography', since we capture a few (compressive) images with slightly varying parameters (epsilon photography) and computationally reconstruct the images corresponding to all other missing parameter combinations post-capture. The recovered LF has spatial resolution corresponding to the sensor resolution of the camera and can reproduce any angular view that lies inside the aperture. [1] A. Levin and F. Durand, "Linear view synthesis using a dimensionality gap light field prior," in IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 1831-1838, 2010.
  • Item
    Controlling Race Conditions in OpenFlow to Accelerate Application Verification and Packet Forwarding
    (2014-10-24) Sun, Xiaoye Steven; Ng, T. S. Eugene; Knightly, Edward W; Zhong, Lin
    OpenFlow is a Software Defined Networking (SDN) protocol that is being deployed in critical network systems. SDN application verification plays an important role in guaranteeing the correctness of the application. Through our investigation, we discover that application verification can be very inefficient under the OpenFlow protocol, since there are many race conditions between data packets and control-plane messages. Furthermore, these race conditions also increase the control-plane workload and packet forwarding delay. We propose Attendre, an OpenFlow extension, to mitigate the ill effects of the race conditions in OpenFlow networks. We have implemented Attendre in NICE (a model-checking verifier), Open vSwitch (a software virtual switch) and NOX (an OpenFlow control platform). Experiments show that Attendre can reduce verification time by several orders of magnitude and can significantly reduce TCP connection setup time.
  • Item
    Linkify: A Web-Based Collaborative Content Tagging System for Machine Learning Algorithms
    (2014-12-03) Soares, Dante Mattos de Salles; Baraniuk, Richard; Cavallaro, Joseph; Burrus, C. Sidney
    Automated tutoring systems that use machine learning algorithms are a relatively new development that promises to revolutionize education by providing students, on a large scale, with an experience that closely resembles one-on-one tutoring. Machine learning algorithms are essential for these systems, as they are able to perform, with fairly good results, certain data processing tasks that have usually been considered difficult for artificial intelligence. However, the high performance of several machine learning algorithms relies on information about what is being processed in the form of tags, which have to be added to the content manually. There is therefore a strong need today for tagged educational resources. Unfortunately, tagging can be a very time-consuming task. Proven strategies for the mass tagging of content already exist: collaborative tagging systems, such as Delicious, StumbleUpon and CiteULike, have been growing in popularity in recent years. These websites allow users to tag content and browse previously tagged content that is relevant to their interests. However, applying this strategy to educational resource tagging presents several problems. Tags for educational resources used in tutoring systems need to be highly accurate, as mistakes in recommending or assigning material to students can be very detrimental to their learning, so ideally subject-matter experts would perform the tagging. The issue with hiring experts is that they can be not only scarce but also expensive, limiting the number of resources that can be tagged. Even if non-experts are used, a large user base would be required to tag large numbers of resources, and acquiring large numbers of users is a challenge in itself. To solve these problems, we present Linkify, a system that enables more accurate tagging of large numbers of educational resources by combining the efforts of users with existing machine learning algorithms that are also capable of tagging resources. This thesis discusses Linkify in detail, presenting its database structure and components and discussing the design choices made during its development. We also present a novel model for tagging errors based on a binary asymmetric channel. From this model, we derive an EM algorithm that combines tags entered into the Linkify system by multiple users and machine learning algorithms, producing the most likely set of relevant tags for each educational resource. Our goal is to enable automated tutoring systems to use this tagging information in the future to improve their assessment of student knowledge and prediction of student performance. At the same time, Linkify’s standardized structure for data input and output will facilitate the development and testing of new machine learning algorithms.
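
The tag-fusion step above is an EM algorithm derived from a binary asymmetric channel model. The thesis's derivation is its own; as a rough illustration of the same family of estimators, here is a minimal NumPy sketch that jointly estimates each annotator's false-positive/false-negative rates and the posterior probability that a tag truly applies to each resource (all names are hypothetical, and every annotator is assumed to vote on every resource).

```python
import numpy as np

def em_binary_tags(obs, n_iter=50, eps=1e-9):
    """EM for fusing binary tag votes from noisy annotators (users or ML taggers).
    obs: (n_resources, n_annotators) 0/1 matrix; each annotator is modeled as a binary
    asymmetric channel with its own false-positive and false-negative rate."""
    n_res, n_ann = obs.shape
    pi = 0.5                                  # prior P(tag is truly relevant)
    fp = np.full(n_ann, 0.2)                  # P(annotator says 1 | true label 0)
    fn = np.full(n_ann, 0.2)                  # P(annotator says 0 | true label 1)
    for _ in range(n_iter):
        # E-step: posterior that each resource truly carries the tag
        log_p1 = np.log(pi + eps) + obs @ np.log(1 - fn + eps) + (1 - obs) @ np.log(fn + eps)
        log_p0 = np.log(1 - pi + eps) + obs @ np.log(fp + eps) + (1 - obs) @ np.log(1 - fp + eps)
        q = 1.0 / (1.0 + np.exp(log_p0 - log_p1))
        # M-step: re-estimate the prior and each annotator's channel parameters
        pi = q.mean()
        fn = (q[:, None] * (1 - obs)).sum(0) / (q.sum() + eps)
        fp = ((1 - q)[:, None] * obs).sum(0) / ((1 - q).sum() + eps)
    return q, fp, fn
```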
  • Item
    Methods for Ripple Detection and Spike Sorting During Hippocampal Replay
    (2015-09-21) Sethi, Ankit; Kemere, Caleb T; Aazhang, Behnaam; Robinson, Jacob
    In the rat hippocampus, fast oscillations termed sharp-wave ripples, and an associated sequential firing of neurons termed replay, have been identified as playing a crucial role in memory formation and learning. The term 'replay' is used because the observed spiking encodes patterns of past experiences. Determining the role of replay in learning and decision making requires systems that can decode replay activity observed during ripples, which in turn demands online algorithms for both spike sorting and ripple detection at low latencies. In my work, I have developed an improved method for ripple detection and tested its performance against previous methods. Further, I have optimized a recently proposed spike sorting algorithm based on real-time Bayesian inference so that it can run online in a multi-tetrode scenario, and implemented it, along with ripple detection, for the open-source electrophysiology suite "open-ephys". The algorithm's parameters were also analyzed for their suitability for unsupervised operation. These two modules are integrated to form a system uniquely suited to decoding neuronal sequences during sharp-wave ripple events.
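
Ripple detection is typically framed as band-pass filtering (roughly 150-250 Hz) followed by thresholding a power envelope. The sketch below is a generic causal detector of that kind, not the improved method developed in the thesis; band edges, smoothing window, and threshold are illustrative.

```python
import numpy as np
from scipy.signal import butter, lfilter

def detect_ripples(lfp, fs, band=(150.0, 250.0), smooth_ms=8.0, n_std=3.0):
    """Causal ripple-detector sketch: band-pass the LFP, compute a smoothed power envelope,
    and flag samples whose envelope exceeds mean + n_std * std of the envelope."""
    b, a = butter(3, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="bandpass")
    ripple_band = lfilter(b, a, lfp)                       # causal IIR filtering (online-friendly)
    power = ripple_band ** 2
    win = int(smooth_ms * 1e-3 * fs)
    envelope = lfilter(np.ones(win) / win, [1.0], power)   # causal moving-average smoothing
    threshold = envelope.mean() + n_std * envelope.std()   # offline threshold; an online system would track running statistics
    return envelope > threshold

# Synthetic check: 1 s of noise with a 200 Hz burst injected at 0.5 s.
fs = 3000.0
t = np.arange(int(fs)) / fs
lfp = np.random.default_rng(3).standard_normal(t.size)
lfp[1500:1650] += 3.0 * np.sin(2 * np.pi * 200 * t[1500:1650])
mask = detect_ripples(lfp, fs)
```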
  • Item
    mobileVision: A Face-mounted, Voice-activated, Non-mydriatic "Lucky" Ophthalmoscope
    (2014-12-11) Samaniego, Adam Patric; Veeraraghavan, Ashok; Sabharwal, Ashutosh; Zhong, Lin; Woods, Gary
    mobileVision is a portable, robust, smartphone-based ophthalmoscopy system intended to reduce the barriers to ocular pathology screening in developing and underserved regions. Through smartphone integration and ergonomic design, the system demonstrates automatic compensation for patient refractive error, voice-activated multi-shot retinal image acquisition without pupil dilation, and touch-gesture-based control of patient fixation and accommodation. Further, a lucky imaging and retinal stitching pipeline is developed and demonstrated, which not only increases the overall retinal field-of-view, but also makes the system robust to patient saccades, blinks, device jitter, and imaging artifacts such as noise or unintended scattering from ocular surfaces. The prototype is tested through a combination of mock eye tests and in-vivo trials. The current prototype can image over +/-45 degrees of retina with an estimated 23.5 µm retinal resolution for patients with refractive errors between -6D and +13D.
  • Item
    Imaging Plasmons with Compressive Hyperspectral Microscopy
    (2015-04-23) Lu, Liyang; Kelly, Kevin F; Baraniuk, Richard G; Landes, Christy
    By revealing the interactions between objects and electromagnetic waves, hyperspectral imaging in optical microscopy is of great importance for studying various micro/nano-scale physical and chemical phenomena. Conventional methods, however, require scanning to acquire a complete hyperspectral dataset because of its three-dimensional structure. As a result, the quality and efficiency of data acquisition with these scanning techniques are greatly limited by detector sensitivity and the low intensity of signal light from the sample. To overcome such limitations, we applied compressive sensing theory to hyperspectral imaging. Compressive imaging enhances the measurement signal-to-noise ratio by encoding and combining the spatial information of the sample at the detector; a recovery algorithm then decodes the detector outputs and reconstructs the image. A microscopy system based on this compressive hyperspectral imaging scheme was designed and implemented. Further analysis of diffraction and interference phenomena, and a solution to spectral distortion in this compressive-sensing microscopy system, are also presented. Experimental results of compressive dark-field scattering from gold nanobelts are presented, followed by an analysis of signal-to-noise ratio and a comparison with conventional scanning methods for measuring plasmon resonances.
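
As context for the encoding step described above, a single-pixel compressive measurement is simply the inner product of a DMD pattern with the scene, so each detector reading aggregates light from roughly half the pixels instead of a single scanned point. The forward-model sketch below is purely illustrative (random binary patterns, made-up sizes); the actual pattern design and recovery algorithm used in the thesis may differ.

```python
import numpy as np

# Forward model of a single-pixel camera: each DMD pattern selects ~half of the scene pixels
# and the photodetector records their sum, so every measurement collects light from many pixels
# at once (the multiplexing advantage over raster-scanning a single point).
rng = np.random.default_rng(4)
n_px, n_meas = 64 * 64, 1024                            # 25% sampling rate, illustrative numbers
scene = rng.random(n_px)                                # vectorized (unknown) image
patterns = rng.integers(0, 2, size=(n_meas, n_px))      # random binary DMD patterns
y = patterns @ scene                                    # photodetector readings, one per pattern
# Reconstruction would solve y = patterns @ x under a sparsity prior (e.g. TV or wavelets).
```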
  • Item
    Indelible Physical Randomness for Security: Silicon, Bisignals, Biometrics
    (2014-11-11) Rostami, Masoud; Koushanfar, Farinaz; Wallach, Dan S; Knightly, Edward; Juels, Ari
    In this thesis, I investigate the nature and properties of several indelible physical randomness phenomena and leverage their statistical properties to design robust and efficient security systems. Three different phenomena are discussed: randomness in biosignals, silicon chips, and biometrics. In the first part, I present a system to authenticate external medical device programmers to Implantable Medical Devices (IMDs). IMDs now have built-in radio communication to facilitate non-invasive reprogramming, but lack well-designed authentication mechanisms, exposing patients to the risks of over-the-air attacks and physical harm. Our protocol uses biosignals as the authentication mechanism, ensuring access only by a medical instrument in physical contact with the IMD-bearing patient. Based on statistical analysis of real-world data, I propose and analyze new techniques for extracting time-varying randomness from biosignals and introduce a novel cryptographic device pairing protocol that uses this randomness to protect against attacks by active adversaries, while meeting the practical challenges of lightweight implementation and noise tolerance in biosignal readings. In the second part, the unavoidable physical randomness of transistors is investigated, and novel, robust, low-overhead authentication, bit-commitment, and key exchange protocols are proposed. It is shown that these protocols achieve resilience against reverse-engineering and replay attacks without a costly secure channel. The attack analysis guides the tuning of the protocols' parameters for an efficient and secure implementation. In the third part, the statistical properties of fingerprint minutiae points are analyzed, and a distributed security protocol is proposed to safeguard biometric fingerprint databases based on the developed statistical models of fingerprint biometrics.
  • Item
    Virtual Ring Buffer for Camera Application Concurrency
    (2015-01-26) Reyes, Jose Eduardo; Zhong, Lin; Cavallaro, Joseph R; Veeraraghavan, Ashok
    Smartphones with integrated cameras have inspired a growing number of real-time computer vision applications. Existing camera software architectures, however, do not support concurrency: only one application accesses the image stream at any time. A naive solution that makes a copy of every image for every application is inherently inefficient. Towards a computation- and power-efficient solution, this work presents a driver-level architecture, wherein a single, copy-on-write, shared-memory ring buffer delivers images to all applications via virtual interfaces. The architecture guarantees application isolation, minimizes data redundancy, and provides an illusion to applications that they are the sole consumers of the image stream. This work implements the architecture in Android 4.3.1 and characterizes its performance on a modern, multi-core smartphone. Measurements show the architecture increases CPU utilization at half the rate of the naive solution and reduces power consumption by several hundred milliwatts.
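
To make the copy-on-write ring buffer idea concrete, here is a toy, single-process Python sketch of the bookkeeping only: per-slot reference counts, zero-copy read-only views for consumers, and a private copy only for writers. The real system is a driver-level, shared-memory implementation inside Android; none of the names below come from it.

```python
import numpy as np

class VirtualRingBuffer:
    """Toy model of a shared ring buffer with per-slot reference counts and copy-on-write reads."""

    def __init__(self, n_slots, frame_shape):
        self.frames = [np.zeros(frame_shape, dtype=np.uint8) for _ in range(n_slots)]
        self.refcount = [0] * n_slots
        self.head = 0                      # slot the camera will write next

    def produce(self, frame):
        """Camera delivers a frame: write into the next slot unless a reader still holds it."""
        if self.refcount[self.head] > 0:
            raise RuntimeError("slot still referenced; a real driver would drop or stall")
        self.frames[self.head][...] = frame
        written = self.head
        self.head = (self.head + 1) % len(self.frames)
        return written

    def acquire(self, slot):
        """App acquires a zero-copy, read-only view of the shared frame."""
        self.refcount[slot] += 1
        view = self.frames[slot].view()
        view.flags.writeable = False
        return view

    def release(self, slot):
        self.refcount[slot] -= 1

    def acquire_writable(self, slot):
        """Copy-on-write: only apps that need to modify pixels pay for a private copy."""
        return self.frames[slot].copy()
```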
  • Item
    Imaging and Visual Classification by Knowledge-Enhanced Compressive Imaging
    (2015-09-10) Li, Yun; Kelly, Kevin F.; Baraniuk, Richard G.; Landes, Christy F.
    Compressive imaging is a technology that uses multiplexed measurements and the sparsity of many natural images to efficiently capture and reconstruct images. The compressive single-pixel camera is one embodiment of such an imaging system and has proven capable of imaging static images, dynamic scenes, and entire hyperspectral datacubes using fewer measurements than conventional schemes. For many imaging tasks, however, prior information or models exist, and incorporating them into the compressive measurements can greatly improve the reconstructed result. In this thesis, we illustrate and quantify, through simulation and experiment, the effectiveness of knowledge-enhanced patterns over unbiased compressive measurements in a variety of applications including motion tracking, anomaly detection, and object recognition. In the case of motion tracking, one may care primarily about the moving foreground. Given prior information about the moving foreground in the scene, we propose designed patterns for foreground imaging; the dynamic scene can then be recovered by combining the moving foreground captured with the designed patterns and the static background. We also implement anomaly detection from compressive measurements: a set of detection criteria is implemented and shown to be effective, and patterns selected from a partial-complete set according to the geometric information of the anomaly are shown to be more effective than random patterns. For image classification, we implement two methods to generate secant projections, which are optimized to preserve the differences between image classes. Lastly, we present a new single-pixel hyperspectral imaging design, enabled by improved control of the DMD chip and improved SPC optics, and show results from a compressive endmember-unmixing scheme for a compressive sum-frequency-generation hyperspectral imaging system.
  • Item
    GPU Accelerated Reconfigurable Detector and Precoder for Massive MIMO SDR Systems
    (2015-12-02) Li, Kaipeng; Cavallaro, Joseph; Aazhang, Behnaam; Zhong, Lin
    We present a reconfigurable GPU-based unified detector and precoder for massive MIMO software-defined radio systems. To enable high throughput, we implement the linear minimum mean square error (MMSE) detector/precoder and further reduce the algorithm's complexity by numerical approximation without sacrificing error-rate performance. For an efficient GPU implementation, we exploit the algorithm's inherent parallelism and take advantage of the GPU's numerous computing cores and hierarchical memories to optimize the kernel computations. We furthermore perform multi-stream scheduling and multi-GPU workload deployment to pipeline multiple detection or precoding tasks on GPU streams, reducing host-device memory copy overhead. The flexible design supports both detection and precoding and can switch between a Cholesky-based mode and a conjugate-gradient-based mode to trade off accuracy and complexity. The GPU implementation exceeds 250 Mb/s detection and precoding throughput for a 128x16 antenna system.
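
For reference, the linear MMSE detector in its Cholesky-based form reduces to solving a small regularized system per subcarrier. The NumPy/SciPy sketch below shows only that textbook computation on the CPU; the GPU kernels, numerical approximations, and scheduling described above are not represented, and the sizes and noise levels are illustrative.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def mmse_detect(H, y, noise_var, signal_var=1.0):
    """Linear MMSE detection for uplink massive MIMO (B base-station antennas, U users):
    x_hat = (H^H H + (noise_var/signal_var) I)^{-1} H^H y, computed via a Cholesky
    factorization of the U x U regularized Gram matrix instead of an explicit inverse."""
    U = H.shape[1]
    G = H.conj().T @ H + (noise_var / signal_var) * np.eye(U)   # regularized Gram matrix
    c, low = cho_factor(G)
    return cho_solve((c, low), H.conj().T @ y)

rng = np.random.default_rng(5)
B, U = 128, 16
H = (rng.standard_normal((B, U)) + 1j * rng.standard_normal((B, U))) / np.sqrt(2)
x = np.sign(rng.standard_normal(U)) + 1j * np.sign(rng.standard_normal(U))   # QPSK-like symbols
y = H @ x + 0.05 * (rng.standard_normal(B) + 1j * rng.standard_normal(B))
x_hat = mmse_detect(H, y, noise_var=0.005)
```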
  • Item
    Wafer-scale films of aligned single-wall carbon nanotubes: preparation, characterization, and optoelectronic applications
    (2015-11-20) He, Xiaowei; Kono, Junichiro; Kelly, Kevin; Hauge, Robert; Adams, Wade
    Single-wall carbon nanotubes (SWCNTs) are one-dimensional materials defined by a cylindrical and hollow structure with aspect ratios of up to 10^7:1. Individual SWCNTs have been shown to possess excellent electric, optical, thermal, and mechanical properties that are promising for electronic and optoelectronic device applications. However, when they are assembled into macroscopic objects such as films and fibers, these unique properties tend to vanish, primarily due to disorder. Hence, methods are being sought for fabricating ordered SWCNT assemblies for the development of high-performance devices based on SWCNTs. In this dissertation, we present two methods for preparing highly aligned SWCNT films with excellent optoelectronic properties. The first method is based on vertically aligned SWCNT arrays grown by water-assisted chemical vapor deposition. We transferred these arrays to desired substrates to form horizontally aligned SWCNT films and created p-n junction devices that worked as flexible, room-temperature-operating, and polarization-sensitive infrared and terahertz photodetectors. The second method is based on our discovery of spontaneous global alignment of SWCNTs that occurs during vacuum filtration of SWCNT suspensions. By carefully controlling critical factors during vacuum filtration, we obtained wafer-scale, monodomain films of strongly aligned SWCNTs. By measuring polarization-dependent terahertz transmittance, we demonstrated ideal polarizer performance with large extinction ratios. The universality of this method was confirmed by applying it to diverse types of SWCNTs, all of which showed exceptionally high degrees of alignment. Furthermore, we successfully fabricated aligned SWCNT films enriched in one specific chirality by combining our new method with an advanced nanotube sorting technique: aqueous two-phase extraction. Transistors fabricated using such films showed very high conductivity anisotropies and excellent on-off ratios.
  • Item
    Full-duplex Wireless with Large Antenna Arrays
    (2015-12-03) Everett, Evan Jackson; Sabharwal, Ashutosh; Aazhang, Behnaam; Cox, Steven; Knightly, Edward; Kennedy, Timothy
    To meet the growing demand for wireless data, base stations with very large antenna arrays are being deployed to serve multiple users simultaneously. Concurrently, there is growing interest in full-duplex operation. The challenge in full-duplex is suppressing the high-power self-interference caused by transmitting and receiving at the same time on the same frequency. Unfortunately, state-of-the-art methods to suppress self-interference require extra analog circuitry that does not scale well to large antenna arrays. However, large antenna arrays open a new opportunity: using digital beamforming to reduce the self-interference. In this thesis we study the use of digital beamforming to enable full-duplex operation on conventional antenna arrays. Unlike most designs, which rely on analog cancelers to suppress self-interference, we consider all-digital solutions that can be employed on existing radio hardware.
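
One common way to suppress self-interference purely through digital beamforming is to steer the transmit beams into the null space of the self-interference channel, spending some of the array's spatial degrees of freedom on isolation. The sketch below shows that generic null-space projection as an illustration of the idea only; it is not claimed to be the scheme developed in this thesis.

```python
import numpy as np

def null_space_precoder(H_si, n_streams):
    """Pick transmit beamforming directions in the null space of the self-interference channel
    H_si (rx_antennas x tx_antennas), so the array's own receiver sees (ideally) no transmit
    leakage. Requires more transmit antennas than the self-interference rank plus streams."""
    _, s, Vh = np.linalg.svd(H_si)
    rank = int(np.sum(s > 1e-10))
    null_basis = Vh.conj().T[:, rank:]            # orthonormal basis of the null space of H_si
    if null_basis.shape[1] < n_streams:
        raise ValueError("not enough spatial degrees of freedom to null self-interference")
    return null_basis[:, :n_streams]              # one column per downlink stream

rng = np.random.default_rng(6)
M_tx, M_rx = 8, 4
H_si = (rng.standard_normal((M_rx, M_tx)) + 1j * rng.standard_normal((M_rx, M_tx))) / np.sqrt(2)
P = null_space_precoder(H_si, n_streams=2)
print(np.linalg.norm(H_si @ P))                   # ~0: residual self-interference after beamforming
```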
  • Item
    WrAP: Hardware and Software Support for Atomic Persistence in Storage Class Memory
    (2015-04-23) Giles, Ellis Robinson; Varman, Peter J.; Cavallaro, Joseph R; Jermaine, Christopher M
    In-memory computing is gaining popularity as a means of sidestepping the performance bottlenecks of traditional block-based storage devices. However, the volatile nature of DRAM makes these systems vulnerable to system crashes, while the need to continuously refresh massive amounts of passive memory-resident data increases power consumption. Emerging storage-class memory (SCM) technologies, like Phase Change Memory and Memristors, combine fast, DRAM-like cache-line access granularity with the persistence of storage devices like disks or SSDs, offering potential 10x-100x performance gains and low passive power consumption. This unification of storage and memory into a single directly-accessible persistent storage tier is a mixed blessing, as it pushes onto developers the burden of ensuring that SCM stores are ordered correctly, flushed from processor caches, and, if interrupted by sudden machine stoppage, not left in inconsistent states. This thesis addresses the complexity of ensuring properly ordered, all-or-nothing updates with both a hardware-software architecture and a software-only solution. It extends and evaluates WrAP, or Write-Aside Persistence, a hardware-software architecture for atomic stores to SCM, and presents SoftWrAP, a library for software-based Write-Aside Persistence that provides lightweight atomicity and durability for SCM storage transactions. Both methods provide atomicity and durability while ensuring that fast paths through the cache, DRAM, and persistent memory layers are not slowed down by burdensome buffering or double-copying requirements. Trace-driven simulation of transactional data structures indicates the potential for significant performance gains using the hardware-supported WrAP approach. The SoftWrAP library is evaluated with handcrafted SCM-based micro-benchmarks as well as existing applications, specifically the STX B+Tree library and the SQLite database, backed by emulated SCM. Our results show the ease of using the API to create atomic persistent regions and the significant benefits of SoftWrAP over existing methods such as undo logging and shadow copying. SoftWrAP can match non-atomic durable writes to SCM, thereby gaining atomic consistency almost for free.
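
The core burden described above is making a group of SCM stores atomic and durable despite caches and crashes. The Python sketch below illustrates only the general write-aside (redo-log) idea under hypothetical names: log the intended stores, persist the log, then update home locations, with recovery replaying committed groups. Real WrAP/SoftWrAP semantics, flush instructions, and data structures are described in the thesis and are not captured here.

```python
# Conceptual sketch only: write-aside (redo-log) atomicity for groups of persistent-memory
# stores. Names are hypothetical; real implementations order cache-line flushes and fences,
# which are marked below as comments because Python cannot express them.

scm_home = {}          # stands in for home locations in storage-class memory
write_aside_log = []   # stands in for the write-aside log region in SCM

def atomic_update(writes):
    """Make a group of stores all-or-nothing: persist the intent first, then update home."""
    write_aside_log.append(("begin", dict(writes)))
    # flush log entry + fence: the logged intent is durable before any home location changes
    for addr, value in writes.items():
        scm_home[addr] = value             # these stores may still sit in volatile caches
    # flush updated home lines + fence
    write_aside_log.append(("commit",))
    # flush commit record; the log entry may now be retired and its space recycled

def recover():
    """After a crash, redo every logged group that reached its commit record; groups without
    a commit record are discarded, so partially applied updates never become visible."""
    pending = None
    for entry in write_aside_log:
        if entry[0] == "begin":
            pending = entry[1]
        elif entry[0] == "commit" and pending is not None:
            scm_home.update(pending)       # idempotent redo of the logged stores
            pending = None

atomic_update({"balance_a": 50, "balance_b": 150})
```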
  • Item
    Molecular Plasmonics: Graphene Plasmons in the Picoscale Limit
    (2015-08-20) Lauchner, Adam; Halas, Naomi; Nordlander, Peter; Link, Stephan
    Doped graphene supports surface plasmons in the mid- to far-infrared that are both electrically and spatially tunable. Graphene has been shown to enable greater spatial confinement of the plasmon, with lower losses, than typical noble metals. Reduced-dimensional graphene structures, including nanoribbons and nanodisks, as well as other allotropes such as carbon nanotubes, exhibit higher-frequency plasmons throughout the mid- and near-infrared regimes due to additional confinement of the electrons to smaller length scales. Recent theoretical predictions have suggested that further spatial confinement to dimensions of only a few nanometers (containing only a few hundred atoms) would yield a near-infrared plasmon resonance remarkably sensitive to the addition of single charge carriers. At the extreme limit of quantum confinement, picoscale graphene structures known as polycyclic aromatic hydrocarbons (PAHs), containing only a few dozen atoms, should possess a plasmon resonance fully switched on by the addition or removal of a single electron. This thesis reports the experimental realization of plasmon resonances in PAHs upon the addition of a single electron to the neutral molecule. Charged PAHs are observed to support intense absorption in the visible regime, with geometrical tunability analogous to the plasmonic resonances of much larger nanoscale systems. To facilitate charge transfer to and from PAH molecules, a three-electrode electrochemical cell with optical access was designed, in which current is passed through a nonaqueous electrolyte solution containing a known concentration of PAH molecules. In contrast to larger graphene nanostructures, the PAH absorption spectra possess a rich and complex fine structure that we attribute to coupling between the molecular plasmon and the vibrational modes of the molecules. The natural abundance, low cost, and extremely large variety of available PAH molecules could make large-area active color-switching applications (walls, windows, other architectural elements, even vehicles) a practical technology.
  • Item
    SocialSync: Sub-Frame Synchronization in a Smartphone Camera Network
    (2014-09-17) Latimer, Richard; Sabharwal, Ashutosh; Zhong, Lin; Veeraraghavan, Ashok
    SocialSync is a sub-frame synchronization protocol for capturing images simultaneously using a smartphone camera network. By synchronizing image captures to within a frame period, multiple smartphone cameras, which are often in use in social settings, can support a variety of applications including light field capture, depth estimation, and free-viewpoint television. Currently, smartphone camera networks are limited to capturing static scenes due to motion artifacts caused by frame misalignment. Because frame misalignment in smartphone camera networks is caused by variability in the camera system, we first characterize frame capture on mobile devices by analyzing the statistics of camera setup latency and frame delivery within an Android app. Next, we develop the SocialSync protocol, which achieves sub-frame synchronization between devices by estimating frame capture timestamps to within millisecond accuracy. Finally, we demonstrate the effectiveness of SocialSync on mobile devices by reducing motion-induced artifacts when recovering the light field.
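
The timestamp-estimation step above can be pictured as fitting a constant-frame-period model to jittery frame delivery times and comparing the resulting phases across devices. The sketch below illustrates only that generic least-squares idea; SocialSync's actual estimator, message exchange, and accuracy analysis are in the thesis, and all numbers here are made up.

```python
import numpy as np

def fit_capture_times(delivery_times):
    """Estimate per-frame capture timestamps from jittery delivery times by fitting
    t_n ~= t0 + n*T (constant frame period T) with least squares."""
    n = np.arange(len(delivery_times))
    T, t0 = np.polyfit(n, delivery_times, 1)      # slope = frame period, intercept = first capture
    return t0, T

rng = np.random.default_rng(7)
period = 1 / 30.0                                  # 30 fps
dev_a = 0.000 + period * np.arange(120) + 0.002 * rng.random(120)   # delivery jitter on device A
dev_b = 0.011 + period * np.arange(120) + 0.002 * rng.random(120)   # device B starts 11 ms later
t0_a, T_a = fit_capture_times(dev_a)
t0_b, T_b = fit_capture_times(dev_b)
offset = (t0_b - t0_a) % T_a                       # sub-frame phase offset between the two cameras
```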