Computer Science Technical Reports
Browsing Computer Science Technical Reports by Title
Now showing 1 - 20 of 245
Item A Characterization of Compound Documents on the Web (1999-11-29)
Lara, Eyal de; Wallach, Dan S.; Zwaenepoel, Willy
Recent developments in office productivity suites make it easier for users to publish rich "compound documents" on the Web. Compound documents appear as a single unit of information but may contain data generated by different applications, such as text, images, and spreadsheets. Given the popularity enjoyed by these office suites and the pervasiveness of the Web as a publication medium, we expect that in the near future these compound documents will become an increasing proportion of the Web's content. As a result, the content handled by servers, proxies, and browsers may change considerably from what is currently observed. Furthermore, these compound documents are currently treated as opaque byte streams, but future Web infrastructure may wish to understand their internal structure to provide higher-quality service. To guide the design of this future Web infrastructure, we characterize compound documents currently found on the Web. Previous studies of Web content either ignored these document types altogether or did not consider their internal structure. We study compound documents originated by the three most popular applications from the Microsoft Office suite: Word, Excel, and PowerPoint. Our study encompasses over 12,500 documents retrieved from 935 different Web sites. Our main conclusions are: Compound documents are in general much larger than current HTML documents. For large documents, embedded objects and images make up a large part of the documents' size. For small documents, the XML format produces much larger documents than OLE; for large documents, there is little difference.
Compression considerably reduces the size of documents in both formats.

Item A Comparison of Software Architectures for E-business Applications (2002-02-20)
Cecchet, Emmanuel; Chanda, Anupam; Elnikety, Sameh; Marguerite, Julie; Zwaenepoel, Willy
As dynamic content has become more prevalent on the Web, a number of standard mechanisms have evolved to generate such dynamic content. We study three specific mechanisms in common use: PHP, Java servlets, and Enterprise Java Beans (EJB). PHP and Java servlets require a direct encoding of the database queries in the application logic. EJB provides a level of indirection, allowing the application logic to call bean methods that then perform database queries. Unlike PHP, which typically executes on the same machine as the Web server, Java servlets and EJB allow the application logic to execute on different machines, including the machine on which the database executes or a completely separate (set of) machine(s). We present a comparison of the performance of these three systems in different configurations for two application benchmarks: an auction site and an online bookstore. We choose these two applications because they impose vastly different loads on the sub-systems: the auction site stresses the Web server front-end, while the online bookstore stresses the database. We use open-source software in common use in all of our experiments (the Apache Web server, Tomcat servlet server, Jonas EJB server, and MySQL relational database). The computational demands of Java servlets are modestly higher than those of PHP. The ability, however, to locate the servlets on a machine different from the Web server results in better performance for Java servlets than for PHP when the application imposes a significant load on the front-end Web server. The computational demands of EJB are much higher than those of PHP and Java servlets.
As with Java servlets, we can alleviate EJB's performance problems by putting the beans on a separate machine, but the resulting overall performance remains inferior to that of the other two systems.

Item A Deterministic Model for Parallel Program Performance Evaluation (1998-12-03)
Adve, Vikram S.; Vernon, Mary K.
Analytical models for parallel programs have been successful at providing simple qualitative insights and bounds on scalability, but have been less successful in practice at predicting detailed, quantitative information about program performance. We develop a conceptually simple model that provides detailed performance prediction for parallel programs with arbitrary task graphs, a wide variety of task scheduling policies, shared-memory communication, and significant resource contention. Unlike many previous models, our model assumes deterministic task execution times, which permits detailed analysis of synchronization, task scheduling, and the order of task execution, as well as mean values of communication costs. The assumption of deterministic task times is supported by a recent study of the influence of non-deterministic delays in parallel programs. We show that the deterministic task graph model is accurate and efficient for five shared-memory programs, including programs with large and/or complex task graphs, sophisticated task scheduling, highly non-uniform task times, and significant communication and resource contention. We also use three example programs to illustrate the predictive capabilities of the model. In two cases, broad insights and detailed metrics from the model are used to suggest improvements in load-balancing, and the model quickly and accurately predicts the impact of these changes. In the third case, further novel metrics are used to obtain insight into the impact of program design changes that improve communication locality as well as load-balancing.
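Under the deterministic-task-time assumption described above, per-task completion times follow from a simple recurrence over the task graph: a task finishes at its own duration plus the latest finish among its dependences. The sketch below is a minimal critical-path calculation (our illustration only; the paper's model also accounts for scheduling policies and resource contention):

```python
def completion_times(deps, duration):
    """Deterministic task-graph timing sketch.
    deps: dict task -> list of tasks it depends on (absent means none).
    duration: dict task -> fixed (deterministic) execution time."""
    finish = {}

    def f(task):
        # finish(t) = duration(t) + max over dependences of their finish times
        if task not in finish:
            finish[task] = duration[task] + max(
                (f(d) for d in deps.get(task, [])), default=0)
        return finish[task]

    return {t: f(t) for t in duration}
```

For example, with task a of duration 2, task b of duration 3 depending on a, and an independent task c of duration 4, the predicted makespan is max(2 + 3, 4) = 5.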
Finally, we briefly present results of a comparison between our model and representative models based on stochastic task execution times.

Item A Graphical Multistage Calculus (2005-07-22)
Ellner, Stephan; Taha, Walid
While visual programming languages continue to gain popularity in domains ranging from scientific computing to real-time systems, the wealth of abstraction mechanisms, reasoning principles, and type systems developed over the last thirty years is currently available mainly for textual languages. With the goal of understanding how results in the textual setting can be mapped to the graphical setting, we develop the visual calculus PreVIEW. While this calculus visualizes computations in dataflow style, similar to languages like LabVIEW and Simulink, its formal model is based on Ariola and Blom's work on cyclic lambda calculi. We extend this model with staging constructs, establish a precise connection between textual and graphical program representations, and show how a reduction semantics for a multi-stage language can be lifted from the textual to the graphical setting.

Item A Hierarchical Region-Based Static Single Assignment Form (2009-12-14)
Sarkar, Vivek; Zhao, Jisheng
Modern compilation systems face the challenge of incrementally reanalyzing a program's intermediate representation each time a code transformation is performed. Current approaches typically either re-analyze the entire program after an individual transformation or limit the analysis information that is available after a transformation. To address both efficiency and precision goals in an optimizing compiler, we introduce a hierarchical static single-assignment form called Region Static Single-Assignment (Region-SSA) form. Static single assignment (SSA) form is an efficient intermediate representation that is well suited for solving many data flow analysis and optimization problems.
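As a minimal illustration of the SSA property (a generic example, not the paper's Region-SSA construction): in straight-line code no phi-functions are needed, so conversion reduces to giving each assignment a fresh version of its target and pointing each use at the latest version:

```python
def to_ssa(stmts):
    """Rename straight-line code into SSA form (sketch).
    stmts: list of (target, [source variables]) assignments.
    Every assignment defines a fresh version; uses refer to the
    most recent version of each source variable."""
    version = {}
    out = []
    for dst, srcs in stmts:
        renamed = [f"{v}{version[v]}" for v in srcs]  # uses: latest versions
        version[dst] = version.get(dst, 0) + 1        # def: fresh version
        out.append((f"{dst}{version[dst]}", renamed))
    return out
```

For instance, the sequence x = ...; y = x; x = x + y renames to x1, y1 = x1, x2 = x1 + y1, making each definition unique.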
By partitioning the program into hierarchical regions, Region-SSA form maintains a local SSA form for each region. Region-SSA supports demand-driven re-computation of SSA form after a transformation is performed, since only the updated region's SSA form needs to be reconstructed, along with a potential propagation of exposed defs and uses. In this paper, we introduce the Region-SSA data structure and present algorithms for construction and incremental reconstruction of Region-SSA form. The Region-SSA data structure includes a tree-based region hierarchy, a region-based control flow graph, and region-based SSA forms. We have implemented Region-SSA form in the Habanero-Java (HJ) research compiler. Our experimental results show significant improvements in compile time compared to traditional approaches that recompute the entire procedure's SSA form exhaustively after a transformation. For loop unrolling transformations, compile-time speedups of up to 35.8× were observed using Region-SSA form relative to standard SSA form. For loop interchange transformations, compile-time speedups of up to 205.6× were observed. We believe that Region-SSA form is an attractive foundation for future compiler frameworks that need to incorporate sophisticated incremental program analyses and transformations.

Item A Linear Transform Scheme for Combining Weights into Scores (1998-10-09)
Sung, Sam
Ranking has been widely used in many applications. A ranking scheme usually employs a "scoring rule" that assigns a final numerical value to each and every object to be ranked. A scoring rule normally involves the use of one or many scores, and it gives more weight to the scores that are more important. In this paper, we give a scheme that can combine weights into scores in a natural way. We compare our scheme to the formula given by Fagin. We also give additional desirable properties that a weighted scoring rule should possess.
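One published way to fold weights into an unweighted scoring rule can be sketched as follows. This is our recollection of the Fagin-Wimmers construction that the abstract compares against, so treat the exact formula as an assumption, and note it is not the paper's own scheme:

```python
def weighted_score(score_weight_pairs, base_rule=min):
    """Combine per-attribute scores under importance weights (sketch of the
    Fagin-Wimmers-style formula, as we recall it; treat as an assumption).
    With weights sorted so theta_1 >= ... >= theta_m and summing to 1:
        result = sum_i  i * (theta_i - theta_{i+1})
                         * base_rule(i highest-weighted scores)
    where theta_{m+1} = 0. Equal weights reduce to base_rule of all
    scores; a single weight of 1 returns that score alone."""
    pairs = sorted(score_weight_pairs, key=lambda p: p[1], reverse=True)
    m = len(pairs)
    total = 0.0
    for i in range(1, m + 1):
        theta_i = pairs[i - 1][1]
        theta_next = pairs[i][1] if i < m else 0.0
        total += i * (theta_i - theta_next) * base_rule(s for s, _ in pairs[:i])
    return total
```

With base_rule = min (a common conjunctive rule), three scores under equal weights simply yield their minimum, which is one of the sanity properties such a combination should satisfy.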
Some interesting issues concerning weighted scoring rules are also discussed.

Item A MAC protocol for Multi Frequency Physical Layer (2003-01-23)
Kumar, Rajnish; PalChaudhuri, Santashil; Saha, Amit
Existing MAC protocols for wireless LAN systems assume that a particular node can operate on only one frequency and that most or all of the nodes operate on the same frequency. We propose a MAC protocol for use in an ad hoc network of mobile nodes using a wireless LAN system that defines multiple independent frequency channels. Each node can switch quickly from one channel to another but can operate on only one channel at a time. We simulate the proposed protocol by modifying the wireless extension. Our simulations show that the proposed protocol, though simple, is capable of much better performance in the presence of multiple independent channels than IEEE 802.11, which assumes a single frequency channel for all nodes. As expected, the proposed protocol works as well as IEEE 802.11 in the presence of a single channel.

Item A New Approach to Routing With Dynamic Metrics (1998-11-18)
Chen, Johnny; Druschel, Peter; Subramanian, Devika
We present a new routing algorithm to compute paths within a network using dynamic link metrics. Dynamic link metrics are cost metrics that depend on a link's dynamic characteristics, e.g., the congestion on the link. Our algorithm is destination-initiated: the destination initiates a global path computation to itself using dynamic link metrics. All other destinations that do not initiate this dynamic metric computation use paths that are calculated and maintained by a traditional routing algorithm using static link metrics. Analysis of Internet packet traces shows that a high percentage of network traffic is destined for a small number of networks. Because our algorithm is destination-initiated, it achieves maximum performance at minimum cost when it only recomputes dynamic metric paths to these selected "hot" destination networks.
This selective approach to route recomputation reduces many of the problems (principally route oscillations) associated with calculating all routes simultaneously. We compare the routing efficiency and end-to-end performance of our algorithm against those of traditional algorithms using dynamic link metrics. The results of our experiments show that our algorithm can provide higher network performance at a significantly lower routing cost under conditions that arise in real networks. The effectiveness of the algorithm stems from the independent, time-staggered recomputation of important paths using dynamic metrics, allowing for splits in congested traffic that cannot be made by traditional routing algorithms.

Item A Practical Soft Type System for Scheme (1993-12-06)
Cartwright, Robert; Wright, Andrew
Soft type systems provide the benefits of static type checking for dynamically typed languages without rejecting untypable programs. A soft type checker infers types for variables and expressions and inserts explicit run-time checks to transform untypable programs to typable form. We describe a practical soft type system for R4RS Scheme. Our type checker uses a representation for types that is expressive, easy to interpret, and supports efficient type inference. Soft Scheme supports all of R4RS Scheme, including procedures of fixed and variable arity, assignment, continuations, and top-level definitions. Our implementation is available by anonymous FTP.

Item A Related-Key Cryptanalysis of RC4 (2000-06-08)
Grosul, Alexander; Wallach, Dan S.
In this paper we present an analysis of the RC4 stream cipher and show that for each 2048-bit key there exists a family of related keys, differing in one of the byte positions. The keystreams generated by RC4 for a key and its related keys are substantially similar in the initial hundred bytes before diverging. RC4 is most commonly used with a 128-bit key repeated 16 times; this variant does not suffer from the weaknesses we describe.
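The cipher under analysis is standard RC4, sketched below as a textbook implementation (not the paper's analysis code). The `drop` parameter discards the initial keystream bytes, where related keys produce similar output:

```python
def rc4_keystream(key, drop=0):
    """Textbook RC4: key-scheduling (KSA) followed by keystream
    generation (PRGA). `drop` discards the first keystream bytes."""
    S = list(range(256))
    j = 0
    for i in range(256):                       # key-scheduling algorithm
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    while True:                                # pseudo-random generation
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        if drop:
            drop -= 1
            continue
        yield S[(S[i] + S[j]) % 256]

def rc4_crypt(key, data, drop=0):
    """XOR data with the RC4 keystream (encryption and decryption
    are the same operation)."""
    ks = rc4_keystream(key, drop)
    return bytes(b ^ next(ks) for b in data)
```

For example, rc4_crypt(b"Key", b"Plaintext") produces the well-known test vector bbf316e8d940af0ad3; calling it with drop=256 implements the mitigation of discarding the first 256 keystream bytes.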
We recommend that applications of RC4 with keys longer than 128 bits (and particularly those using the full 2048-bit keys) discard the initial 256 bytes of the keystream output.

Item A Resource Management Framework for Predictable Quality of Service in Web Servers (2003-07-07)
Aron, Mohit; Druschel, Peter; Iyer, Sitaram
This paper presents a resource management framework for providing predictable quality of service (QoS) in Web servers. The framework allows Web server and proxy operators to ensure a probabilistic minimal QoS level, expressed as an average request rate, for a certain class of requests (called a service), irrespective of the load imposed by other requests. A measurement-based admission control framework determines whether a service can be hosted on a given server or proxy, based on the measured statistics of the resource consumption and the desired QoS levels of all the co-located services. In addition, we present a feedback-based resource scheduling framework that ensures that QoS levels are maintained among admitted, co-located services. Experimental results obtained with a prototype implementation of our framework on trace-based workloads show its effectiveness in providing desired QoS levels with high confidence, while achieving high average utilization of the hardware.

Item A Sample-Driven Call Stack Profiler (2004-07-15)
Fowler, Rob; Froyd, Nathan; Mellor-Crummey, John
Call graph profiling reports measurements of resource utilization along with information about the calling context in which the resources were consumed. We present the design of a novel profiler that measures resource utilization and its associated calling context using a stack sampling technique. Our scheme has a novel combination of features and mechanisms. First, it requires no compiler support or instrumentation, either of source or binary code. Second, it works on heavily optimized code and on complex, multi-module applications.
Third, it uses sampling rather than tracing to build a context tree, collect histogram data, and characterize calling patterns. Fourth, the data structures and algorithms are efficient enough to construct the complete tree exposed in the sampling process. We describe an implementation for the Alpha/Tru64 platform and present experimental measurements that compare this implementation with hiprof, the standard call graph profiler provided on Tru64. We show results from a variety of programs in several languages indicating that our profiler operates with modest overhead. Our experiments show that the profiling overhead of our technique is nearly a factor of 55 lower than that of hiprof when profiling a call-intensive recursive program.

Item A Security Analysis of My.MP3.com and the Beam-it Protocol (2000-03-08)
Stubblefield, Adam; Wallach, Dan S.
My.MP3.com is a service that streams audio in the MP3 format to its users. In order to resolve copyright concerns, the service first requires that a user prove he or she owns the right to listen to a particular CD. The mechanism used for the verification is a program called Beam-it, which reads a random subset of an audio CD and interacts with the My.MP3.com servers using a proprietary protocol. This paper presents a reverse-engineering of the protocol and the client-side code which implements it. An analysis of Beam-it's security implications and speculations as to the Beam-it server architecture are also presented. We found the protocol to provide strong protection against a user pretending to have a music CD without actually possessing it; however, we found the protocol to be unnecessarily verbose and to include information that some users may prefer to keep private.

Item A Set of Convolution Identities Relating the Blocks of Two Dixon Resultant Matrices (1999-06-16)
Chionh, Eng-Wee; Goldman, Ronald; Zhang, Ming
Resultants for bivariate polynomials are often represented by the determinants of very big matrices.
Properly grouping the entries of these matrices into blocks is a very effective tool for studying the properties of these resultants. Here we derive a set of convolution identities relating the blocks of two Dixon bivariate resultant representations.

Item A Simple and Effective Caching Scheme for Dynamic Content (2000-11-28)
Cox, Alan; Rajamani, Karthick
As web sites increasingly deliver dynamic content, the process of content generation at request time is becoming a severe limitation to web site throughput. Recent studies have shown that much of this dynamic content is, however, better characterized as pseudo-dynamic, i.e., a dynamic composition of stored or static data. Consequently, caching the generated web pages may increase the web server's throughput if there is some temporal locality in the request stream. In this paper, we perform a quantitative analysis of the benefits of caching for dynamic content using the e-commerce benchmark TPC-W as the workload. We implement caching through a simple and efficient Apache extension module, DCache, that can be easily incorporated into the current infrastructure for dynamic content delivery. Our DCache module uses conventional expiration times and our own request-initiated invalidation scheme as the methods for keeping the cache consistent. It also supports site-specific optimization by providing a mechanism to incorporate the priorities of specific web pages into the caching scheme. Our experiments show that we can obtain over 3 times the non-caching throughput with our caching approach.

Item A simple, fast dominance algorithm (2006-01-11)
Cooper, Keith D.; Harvey, Timothy J.; Kennedy, Ken
The problem of finding the dominators in a control-flow graph has a long history in the literature. The original algorithms suffered from a large asymptotic complexity but were easy to understand. Subsequent work improved the time bound, but generally sacrificed both simplicity and ease of implementation.
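The simple data-flow view of dominance can be sketched with the classic iterative fixed-point algorithm below (a textbook formulation, not the paper's engineered implementation): a node's dominator set is itself plus the intersection of its predecessors' dominator sets.

```python
def dominators(succ, entry):
    """Iterative data-flow solution for dominators (sketch).
    succ: dict node -> list of successor nodes in the control-flow graph.
    Returns dict node -> set of nodes that dominate it."""
    nodes = set(succ) | {s for ss in succ.values() for s in ss}
    preds = {n: [] for n in nodes}
    for n, ss in succ.items():
        for s in ss:
            preds[s].append(n)
    # Initialize: entry is dominated only by itself; others by everything.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom
```

On a diamond-shaped graph (A branches to B and C, which both reach D), the join point D is dominated only by itself and the entry A, as expected.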
This paper returns to a simple formulation of dominance as a global data-flow problem. Some insights into the nature of dominance lead to an implementation of an O(N^2) algorithm that runs faster, in practice, than the classic Lengauer-Tarjan algorithm, which has a time bound of O(E log N). We compare the algorithm to Lengauer-Tarjan because it is the best known and most widely used of the fast algorithms for dominance. Working from the same implementation insights, we also rederive (from earlier work on control dependence by Ferrante et al.) a method for calculating dominance frontiers that we show is faster than the original algorithm by Cytron et al. The aim of this paper is not to present a new algorithm but, rather, to make an argument based on empirical evidence that algorithms with discouraging asymptotic complexities can be faster in practice than those more commonly employed. We show that, in some cases, careful engineering of simple algorithms can overcome theoretical advantages, even when problems grow beyond realistic sizes. Further, we argue that the algorithms presented herein are intuitive and easily implemented, making them excellent teaching tools.

Item A Simple, Practical Distributed Multi-Path Routing Algorithm (1998-07-16)
Chen, Johnny; Druschel, Peter; Subramanian, Devika
We present a simple and practical distributed routing algorithm based on backward learning. The algorithm periodically floods "scout" packets that explore paths to a destination in reverse. Scout packets are small and of fixed size; therefore, they lend themselves to hop-by-hop piggy-backing on data packets, largely defraying their cost to the network. The correctness of the proposed algorithm is analytically verified. Our algorithm also has loop-free multi-path routing capabilities, providing increased network utilization and route stability.
The Scout algorithm requires very little state and computation in the routers, and can efficiently and gracefully handle high rates of change in the network's topology and link costs. An extensive simulation study shows that the proposed algorithm is competitive with link-state and distance-vector algorithms, particularly in highly dynamic networks.

Item A Timing Channel Spyware Robust to MAC Random Back-off (2010-03-02)
Alkabani, Yousra; Coleman, Todd; Kiyavash, Negar; Koushanfar, Farinaz
This paper presents the design and implementation of spyware communication circuits built into the widely used Carrier Sense Multiple Access with collision avoidance (CSMA/CA) protocol. The spyware components are embedded within the sequential and combinational communication circuit structure during synthesis, rendering the distinction or dissociation of the spyware from the original circuit impossible. We take advantage of the timing channel resulting from the transmission of packets to implement a new practical coding scheme that covertly transfers the spied data. Our codes are robust against CSMA/CA's random retransmission time for collision avoidance and in fact take advantage of it to disguise the covert communication. The data snooping may be sporadically triggered, either externally or internally. The occasional trigger and the real-time traffic's variability make detection of the spyware timing covert channel a challenge. The spyware is implemented and tested on a widely used open-source wireless CSMA/CA radio platform. We identify the following performance metrics and evaluate them on our architecture: 1) efficiency of the encoder implementation; 2) robustness of the communication scheme to heterogeneous CSMA/CA effects; and 3) difficulty of covert channel detection. We evaluate criterion 1) completely theoretically.
Criterion 2) is evaluated by simulating a wireless CSMA/CA architecture and testing the robustness of the decoder in different heterogeneous wireless conditions. Criterion 3) is confirmed experimentally using state-of-the-art covert timing channel detection methods.

Item A Unified Framework for Multimodal IC Trojan Detection (2010-02-02)
Alkabani, Yousra; Koushanfar, Farinaz; Mirhoseini, Azalia
This paper presents a unified formal framework for integrated circuit (IC) Trojan detection that can simultaneously employ multiple noninvasive measurement types. Hardware Trojans refer to modifications, alterations, or insertions to the original IC for adversarial purposes. The new framework formally defines IC Trojan detection for each measurement type as an optimization problem and discusses its complexity. We devise a formulation of the problem that is submodular and applicable to a large class of Trojan detection problems. Based on the objective function's properties, an efficient Trojan detection method with strong approximation and optimality guarantees is introduced. Signal processing methods for calibrating the impact of inter-chip and intra-chip correlations are presented. We define a new sensitivity metric that formally quantifies the impact of modifications to each gate on Trojan detection. Using the new metric, we compare the Trojan detection capability of three measurement types: static (quiescent) current, dynamic (transient) current, and timing (delay) measurements. We propose a number of methods for combining the detections of the different measurement types and show how the sensitivity results can be used for a systematic combining of the detection results.
Experimental evaluations on benchmark designs reveal the low overhead and effectiveness of the new Trojan detection framework and provide a comparison of different detection combining methods.

Item ACME: Adaptive Compilation Made Efficient/Easy (2005-06-17)
Cooper, Keith D.; Grosul, Alexander; Harvey, Timothy J.; Reeves, Steven W.; Subramanian, Devika; Torczon, Linda
Research over the past five years has shown that significant performance improvements are possible using adaptive compilation. An adaptive compiler uses a compile-execute-analyze feedback loop to guide a series of compilations towards some performance goal, such as minimizing execution time. Despite its ability to improve performance, adaptive compilation has not seen widespread use because of two obstacles: the complexity inherent in a feedback-driven adaptive system makes it difficult to build and hard to use, and the large amounts of time that the system needs to perform the many compilations and executions prohibit most users from adopting these techniques. We have developed a technique called "virtual execution" to decrease the time requirements for adaptive compilation. Virtual execution runs the program a single time and preserves information that allows us to accurately predict performance with different optimization sequences. This technology significantly reduces the time required by our adaptive compiler. In conjunction with this performance boost, we have developed a graphical user interface (GUI) that provides a controlled view of the compilation process. It limits the amount of information that the user must provide to get started by supplying appropriate defaults. At the same time, it lets the user exert fine-grained control over the parameters that control the system. In particular, the user has direct and obvious control over the maximum amount of time the compiler can spend, as well as the ability to choose the number of routines to be examined.
(The tool uses profiling to identify the most-executed procedures.) The GUI provides an output screen so that the user can monitor the progress of the compilation.
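The compile-execute-analyze feedback loop described above can be sketched as a toy search over optimization sequences. The pass names and cost function here are hypothetical stand-ins; ACME's search strategies and its virtual-execution evaluator are far more sophisticated:

```python
import random

def adaptive_compile(passes, evaluate, seq_len=5, budget=50):
    """Toy compile-execute-analyze loop (sketch): repeatedly pick a random
    optimization sequence, score it with `evaluate` (a stand-in for
    compiling the program and measuring its performance; lower is better),
    and keep the best sequence seen within the given budget."""
    best_seq, best_cost = None, float("inf")
    for _ in range(budget):
        seq = [random.choice(passes) for _ in range(seq_len)]  # "compile"
        cost = evaluate(seq)                                   # "execute+analyze"
        if cost < best_cost:                                   # feedback step
            best_seq, best_cost = seq, cost
    return best_seq, best_cost
```

With a cost function that penalizes a particular pass, the loop quickly converges on sequences that avoid it, illustrating how feedback steers the search without any model of the compiler's internals.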