Browsing by Author "Ng, T.S. Eugene"
Now showing 1 - 4 of 4
Item: Hardware-Software Co-Design for Optimizing MPI Programs in Data Center Network (2021-12-03)
Rahbar, Afsaneh; Ng, T.S. Eugene

High Performance Computing (HPC) systems are critical: no single server or processor can handle the heavy computational needs of today's applications, so HPC systems are built from ever-larger numbers of processors to solve these computation-intensive problems. Communication between machines is essential; these applications may consist of thousands of processes, spread across machines, coordinating to solve a single large-scale problem. The critical component of such systems is the network that connects the servers and makes this collaboration possible, and its performance has a significant impact on application performance. To better understand the main issues and improve communication performance, this thesis investigates data center networks and provides a general overview and analysis of the literature across several research areas, including data center network architectures, network protocols for data center networks, and state-of-the-art communication frameworks. We argue that many of the challenges HPC applications face in the communication phase can be addressed by augmenting the existing physical network architecture with low-cost optical technologies. However, we observe that physical-network or hardware-based solutions alone would not be adopted by HPC application users: some degree of software-level application adaptation to the physical network is required before the network's new characteristics can be exploited. Without proper application-to-network interaction, the network cannot automatically adapt to the application's needs, and vice versa. Our goal is to explore co-designed hardware and software solutions that optimize the data center network for MPI-based HPC programs.
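One concrete objective in such a co-design is keeping heavily communicating MPI ranks within the same rack. The toy Python sketch below (hypothetical matrices and placements, not the thesis's tool) illustrates the cross-rack traffic metric that a placement algorithm would try to minimize:

```python
# Toy illustration: given an MPI communication matrix and a rank-to-rack
# placement, count the cross-rack messages -- the quantity a placement
# algorithm seeks to minimize.

def cross_rack_traffic(comm_matrix, placement):
    """comm_matrix[i][j] = messages sent from rank i to rank j;
    placement[i] = rack hosting rank i."""
    total = 0
    for i, row in enumerate(comm_matrix):
        for j, msgs in enumerate(row):
            if placement[i] != placement[j]:
                total += msgs
    return total

# Four ranks: ranks 0 and 1 talk heavily, ranks 2 and 3 talk heavily.
comm = [[0, 9, 1, 0],
        [9, 0, 0, 1],
        [1, 0, 0, 9],
        [0, 1, 9, 0]]

bad = cross_rack_traffic(comm, [0, 1, 0, 1])   # splits the heavy pairs -> 36
good = cross_rack_traffic(comm, [0, 0, 1, 1])  # co-racks the heavy pairs -> 4
```

A placement search (greedy, graph-partitioning, etc.) would explore assignments and keep the one with the lowest cross-rack total.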
We propose a static source-code analysis solution that identifies applications' different communication patterns and requirements, and we design algorithms that find the optimal network placement of tasks, reducing cross-rack communication to the minimum possible. We implement a prototype of our solution that automates learning the application's communication characteristics, the application-to-network interaction, and the network-to-application adaptation (reconfiguring the network). We evaluate our tool and demonstrate the high potential of hardware-software co-design for optimizing HPC programs in the data center network.

Item: On the design principles of network coordinates systems (2008)
Wang, Guohui; Ng, T.S. Eugene

Since its inception, the concept of network coordinates has been successfully applied to a wide variety of problems, such as overlay optimization, network routing, network localization, and network modeling. Despite these successes, several practical problems limit the benefits of network coordinates today. First, Triangle Inequality Violations (TIVs) among Internet delays degrade the application performance of network coordinates; how can the impact of TIVs on network coordinates systems be reduced? Second, how can network coordinates be stabilized in a distributed fashion, without losing accuracy, so that applications can cache them? Third, how can network coordinates be secured so that legitimate nodes' coordinates are not affected by misbehaving nodes? Although these problems have been discussed extensively, clear solutions have been lacking.
This thesis presents analytical studies of these problems and reveals several new findings: (1) analysis of existing Internet delay measurements demonstrates the irregular behavior of TIVs among Internet delays, which implies that TIVs are difficult to model; (2) a new TIV-alert mechanism can identify the edges causing severe TIVs and reduce the impact of TIVs on network coordinates; (3) a new model of coordinate stabilization based on error elimination can achieve stability without hurting accuracy, and a novel algorithm based on this model is presented; (4) recently proposed statistical detection mechanisms cannot achieve an acceptable level of security against aggressive attacks; and (5) an accountability protocol can completely protect coordinate computation, and a TIV-alert detection mechanism can effectively protect network coordinates against delay attacks. These findings offer guidelines for the design and application of network coordinates systems.

Item: Protocol Design and Experimental Evaluation for Efficient Multi-User MIMO Wireless Networks (2015-04-24)
Bejarano Chavez, Oscar; Knightly, Edward W.; Sabharwal, Ashutosh; Aazhang, Behnaam; Ng, T.S. Eugene

Information-theoretic results on Multi-User MIMO (MU-MIMO) have demonstrated a many-fold increase in capacity compared to Single-Input Single-Output systems. By leveraging multiple antennas at the Access Point (AP) and beamforming techniques, MU-MIMO enables simultaneous transmission of multiple independent streams on the downlink. Ideally, with sufficient antennas at the AP, MU-MIMO can attain capacity gains proportional to the number of streams. However, the cost required to enable efficient and robust multi-stream transmissions is much higher than in the single-stream case and worsens as the number of streams grows.
More specifically, two key factors hinder the potential gains of MU-MIMO: (i) to serve multiple users simultaneously, the AP must collect Channel State Information (CSI) from all users to be served (i.e., sounding), and this sounding overhead reduces the effective data airtime of the overall system; (ii) multi-stream transmissions are highly susceptible to inter-stream interference caused by inaccurate or outdated CSI, which reduces packet reception performance. I demonstrate that in practice the costs of MU-MIMO not only diminish the gains promised by theory but can completely outweigh the benefits. I identify those adverse situations and propose several techniques that alleviate the negative impact of sounding overhead and CSI inaccuracies. First, I design CUiC and MUTE, two protocols that reduce MU-MIMO sounding overhead by compressing it along the spatial and temporal domains, respectively. CUiC exploits the available Degrees-of-Freedom (DoF) at the AP to let multiple users reply with their control messages (e.g., channel estimates and acknowledgements) simultaneously, thereby reducing the time users need to reply to a constant. MUTE exploits epochs of slowly varying channels to reduce the frequency of channel sounding. Second, I design CHRoME, a protocol that addresses interference leakage caused by outdated and inaccurate CSI as well as out-of-cell interference. CHRoME re-tunes its bit-rate selection according to current channel and interference conditions. Additionally, when necessary, CHRoME performs a fast, soundless retransmission that exploits liberated DoF at the AP to minimize retransmission overhead.
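As general background on why accurate CSI matters here: a zero-forcing precoder built from CSI cancels inter-stream interference only while the CSI matches the actual channel. A minimal NumPy sketch of this generic MU-MIMO math (an illustration with invented dimensions and noise levels, not CUiC, MUTE, or CHRoME themselves):

```python
import numpy as np

# Zero-forcing downlink precoding: 2 single-antenna users, 4 AP antennas.
rng = np.random.default_rng(0)
H = rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))

W = np.linalg.pinv(H)   # precoder computed from (assumed-perfect) CSI
effective = H @ W       # ~identity: each user sees only its own stream

# If the channel has drifted since sounding, the same precoder leaks
# power between streams (nonzero off-diagonal terms).
drift = 0.3 * (rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape))
leak = (H + drift) @ W

off_diag_perfect = abs(effective[0, 1])  # ~0: no inter-stream interference
off_diag_stale = abs(leak[0, 1])         # nonzero: interference leakage
```

The gap between `off_diag_perfect` and `off_diag_stale` is exactly the CSI-staleness problem that motivates re-sounding (and that MUTE's reduced sounding frequency must trade off against).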
I implement and evaluate all three schemes using a combination of WARP FPGA-based transceivers and custom emulation platforms.

Item: Query Processing and Optimization for Database Stochastic Analytics (2014-12-03)
Perez, Luis Leopoldo; Jermaine, Christopher M.; Ng, T.S. Eugene; Varman, Peter J.

The application of relational database systems to analytical processing has been an active area of research for about two decades, motivated by constant growth in the scale of the data and in the complexity of the analysis tasks. Simultaneously, stochastic techniques have become commonplace in large-scale data analytics. This work concerns the application of relational database systems to stochastic analytical tasks, particularly the query evaluation and optimization phases. Three problems are addressed in the context of MCDB/SimSQL, a relational database system for uncertain data management and analytics. The first contribution is a set of efficient techniques for evaluating queries that must satisfy a probability threshold, such as "Which pending orders are estimated to be processed and shipped by the end of the month, with a probability of at least 95%?", where the processing and shipment times of each order are generated by an arbitrary stochastic process. Results show that these techniques make sensible use of resources, weeding out data elements that require relatively few samples during the early stages of query evaluation. The second problem concerns recycling the materialized intermediate results of a query to optimize future queries. Under the assumption that a history of past queries accurately reflects the workload, I describe query optimization techniques that weigh the costs and benefits of materializing intermediate results, with the objective of minimizing the hypothetical cost of future queries subject to constraints on disk space.
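The threshold-query idea can be sketched in miniature: estimate each item's probability by Monte Carlo, and stop sampling early for items whose running estimate is clearly above or below the threshold. Everything below (the order model, deadlines, and stopping margin) is a hypothetical illustration, not MCDB/SimSQL's actual machinery:

```python
import random

def prob_at_least(sample_fn, threshold, n_max=2000, margin=0.05):
    """Monte Carlo estimate of P(sample_fn() is True) compared against
    `threshold`, with an early exit once the running estimate sits more
    than `margin` away from the threshold (the "weeding out" step)."""
    hits = 0
    for n in range(1, n_max + 1):
        hits += sample_fn()
        p_hat = hits / n
        if n >= 50 and abs(p_hat - threshold) > margin:
            return p_hat >= threshold  # decided cheaply, few samples used
    return hits / n_max >= threshold   # hard case: full sample budget

random.seed(1)
# Hypothetical orders: shipping time ~ Uniform(1, 20) days, deadline below.
fast_order = prob_at_least(lambda: random.uniform(1, 20) <= 19.9, 0.95)
slow_order = prob_at_least(lambda: random.uniform(1, 20) <= 10.0, 0.95)
```

`slow_order` is rejected after only ~50 samples because its estimate (~0.47) is far below 0.95; only near-threshold items consume the full sample budget, which is the resource-usage pattern the abstract describes.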
Results show a substantial improvement over conventional query caching techniques in workload and average query execution time. Finally, this work addresses the problem of evaluating queries over stochastic generative models specified in a high-level notation that treats random variables as first-class objects and supports operations on structured objects such as vectors and matrices. I describe a notation that, relying on the syntax of comprehensions, provides a language for denoting generative models with a guaranteed correspondence to relational algebra expressions, along with techniques for translating a model into a database schema and a set of relational queries.
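A miniature of the comprehension idea, with all names and distributions invented for illustration: a generative model written as a comprehension produces rows of (entity, sampled value), and ordinary relational-style operations can then query many instantiations of the model:

```python
import random

random.seed(0)
customers = ["acme", "globex", "initech"]

def instantiate():
    """One Monte Carlo instantiation of a toy generative model: each
    customer's demand is a random variable, and the comprehension is
    the 'table' of sampled rows."""
    return [(c, random.gauss(100, 15)) for c in customers]

# Relational-style aggregate over 1000 instantiations: expected total demand.
samples = [sum(value for _, value in instantiate()) for _ in range(1000)]
expected_total = sum(samples) / len(samples)  # ~300 (3 customers x mean 100)
```

In the actual system, such a model would be compiled into a database schema plus relational queries rather than executed directly in the host language; the sketch only shows why comprehension syntax maps naturally onto tables of tuples.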