Browsing by Author "Mamouras, Konstantinos"
Now showing 1 - 5 of 5
Item: A Domain-Specific Language Approach for Quantitative Monitoring of Cyber-Physical Systems (2022-07-19). Wang, Zhifu; Mamouras, Konstantinos.
Cyber-physical systems (CPS) are engineered systems characterized by the non-trivial interaction of computational components with physical processes. To ensure the safety and reliability of such systems, a multitude of approaches have been explored that aim to formally verify that a CPS is guaranteed to behave as intended. However, these approaches often fail to scale to complex systems, or are inapplicable in certain cases, e.g., when no accurate model of the system is available. In this thesis, we focus on a complementary approach, called online monitoring. It involves the real-time observation of the evolution of a CPS in order to detect safety violations and potentially trigger alerts and corrective actions. We develop a flexible and expressive formalism for specifying quantitative properties of CPS, together with online monitors for these properties. Our formalism can be viewed as a domain-specific language (DSL) that describes signal transformations. A key feature of our DSL is that it relaxes the causality restriction of similar prior approaches by allowing the output to depend on a bounded amount of future input. We illustrate the usefulness of our DSL by using it (1) to implement an ECG monitoring application, and (2) to encode online monitors for quantitative temporal properties.

Item: Compiler and Runtime Optimization of Computational Kernels for Irregular Applications (2023-08-17). Milakovic, Srdan; Mellor-Crummey, John; Budimlić, Zoran; Varman, Peter J; Mamouras, Konstantinos.
Many computationally intensive workloads do not fit on individual compute nodes due to their size. As a consequence, such workloads are usually executed on multiple heterogeneous compute nodes of a cluster or supercomputer.
However, due to the complexity of the hardware, developing efficient and scalable code for modern compute nodes is difficult. Another challenge with sophisticated applications is that data structures, communication, and control patterns are often irregular and unknown before program execution. This lack of regularity makes static analysis especially difficult, and very often impossible. To overcome these issues, programmers use high-level and implicitly parallel programming models or domain-specific libraries that consist of composable building blocks. This dissertation explores compiler and runtime optimizations for automatic granularity selection in the context of two programming paradigms: Concurrent Collections (CnC), a declarative, dynamic-single-assignment, data-race-free programming model, and GraphBLAS, a domain-specific Application Programming Interface (API). Writing fine-grained CnC programs is easy and intuitive for domain experts because programmers do not have to worry about parallelism. Additionally, fine-grained programs expose maximum parallelism. However, fine-grained programs can significantly increase the runtime overhead of CnC program execution, because the number of data accesses and dependencies between computation tasks is large relative to the amount of computation done by each fine-grained task. Runtime overhead can be reduced by coarsening the data accesses and task dependencies. However, coarsening is usually tedious, and it is not easy even for domain experts. For some applications, coarse-grained code can be generated by a compiler. However, not all fine-grained applications can be converted to coarse-grained ones, because not all information is statically known. In this dissertation, we introduce the concept of micro-runtimes. A micro-runtime is a Hierarchical CnC construct that enables fusion of multiple steps into a higher-level step during program execution.
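The granularity trade-off described above can be illustrated with a toy sketch (plain Python, not the CnC API; all names are made up): several fine-grained steps are fused into one coarse step, so the scheduler pays its per-task overhead once instead of once per step.

```python
# Toy illustration of granularity coarsening (not the CnC API):
# fuse a list of fine-grained "steps" into a single coarse step,
# reducing per-task scheduling and bookkeeping overhead.

def make_fused_step(steps):
    """Compose fine-grained steps into one coarse step."""
    def fused(x):
        for step in steps:
            x = step(x)
        return x
    return fused

# Three fine-grained steps, each doing very little work per task.
fine_grained = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]

# One coarse task now performs all three operations.
coarse = make_fused_step(fine_grained)
print(coarse(5))  # ((5 + 1) * 2) - 3 = 9
```

The point of a micro-runtime, as described above, is that this kind of fusion can happen during program execution, when the step structure is not statically known.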
Another way for users to develop applications that efficiently exploit modern hardware is through domain-specific APIs that define composable building blocks. One such API specification is GraphBLAS, which allows users to specify graph algorithms using (sparse) linear algebra building blocks. Even though GraphBLAS libraries usually consist of highly hand-optimized building blocks, they provide limited or no support for inter-kernel optimization. In this dissertation, we investigate multiple approaches for inter-kernel optimization, including runtime optimizations and compile-time optimizations. Our optimizations reduce the number of arithmetic operations, the number of memory accesses, and the memory required for temporary objects.

Item (Embargo): Formally Verified Algorithms for Temporal Logic and Regular Expressions (2024-08-09). Chattopadhyay, Agnishom; Mamouras, Konstantinos.
The behavior of systems in various domains, including IoT networks, cyber-physical systems, and runtime environments of programs, can be observed in the form of linear traces. Temporal logic and regular expressions are two core formalisms used to specify properties of such data. This thesis extends these formalisms to enable the succinct expression of richer classes of properties, together with algorithms that can handle them efficiently. Using the Coq proof assistant, we formalize the semantics of our specification languages and verify the correctness of our algorithms using mechanically checked proofs. The verified algorithms have been extracted to executable code, and our empirical evaluation shows that they are competitive with state-of-the-art tools. The first part of the thesis investigates the formalization of an online monitoring framework for past-time metric temporal logic (MTL). We employ an algebraic quantitative semantics that encompasses the Boolean and robustness semantics of MTL, and we interpret formulas over a discrete temporal domain.
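As a rough illustration of a quantitative (robustness) semantics over a discrete temporal domain, the sketch below uses brute-force window scans and hypothetical helper names; it is not the thesis's formalization, and an actual online monitor would use constant-time-per-item sliding-window techniques rather than rescanning each window.

```python
# Illustrative robustness semantics for two past-time operators over a
# discrete-time signal. The robustness of the atomic predicate x > c is
# x - c; "once within the last w steps" takes a max over the window,
# and "historically within the last w steps" takes a min.

def rob_atomic(signal, c):
    return [x - c for x in signal]

def rob_once(rho, w):
    # max over the past window [t - w, t] at each time t
    return [max(rho[max(0, t - w):t + 1]) for t in range(len(rho))]

def rob_historically(rho, w):
    # min over the past window [t - w, t] at each time t
    return [min(rho[max(0, t - w):t + 1]) for t in range(len(rho))]

signal = [1.0, 3.0, 0.5, 2.0]
rho = rob_atomic(signal, 1.5)      # [-0.5, 1.5, -1.0, 0.5]
print(rob_once(rho, 1))            # [-0.5, 1.5, 1.5, 0.5]
print(rob_historically(rho, 1))    # [-0.5, -0.5, -1.0, -1.0]
```

Positive values indicate satisfaction (with a margin) and negative values indicate violation, which is how such a semantics refines the Boolean one.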
A potentially infinite-state variant of Mealy machines, a kind of string transducer, is used as a formal model of online monitors. We demonstrate a compositional construction from formulas to monitors, such that each monitor computes (in an online fashion) the semantic values of the corresponding formula over the input stream. The monitor processes each input item in O(|φ|) time, where |φ| is the size of the formula, independently of the constants that appear in the formula. The monitor uses O(m) space, where m is the sum of the numerical constants that appear in the formula. The latter part of the thesis is focused on regular expressions. Regular expressions in practice often contain lookaround assertions, which can be used to refine matches based on the surrounding context. Our formal semantics of lookarounds complements the commonly used operational understanding of lookaround in terms of a backtracking implementation. Widely used regular expression matching engines take exponential time in the worst case to match regular expressions with lookarounds. Our algorithm has a worst-case time complexity of O(m · n), where m is the size of the regex and n is the size of the input string. The key insight is to evaluate the lookarounds in a bottom-up manner and to guard automaton transitions with oracle queries that evaluate the lookarounds. We demonstrate how this algorithm can be implemented in a purely functional manner using marked regular expressions. The formal semantics of lookarounds and our matching algorithm are verified in Coq. Finally, we investigate the formalization of a tokenization algorithm. Tokenization is the process of breaking a monolithic string into a stream of tokens, and is one of the very first steps in the compilation of programs. In this setting, the set of possible tokens is often described using an ordered list of regular expressions.
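The priority rule for an ordered list of regular expressions can be sketched naively as follows (the rule names are illustrative, and this brute-force scan is far less efficient than a verified NFA-based tokenizer): at each position the longest match wins, and ties are broken in favor of the earlier rule in the list.

```python
# Naive priority-ordered tokenization: longest match wins, with earlier
# rules taking precedence on ties. Rule names and patterns are made up.
import re

RULES = [("NUM", r"[0-9]+"), ("ID", r"[a-z]+"), ("WS", r"[ ]+")]

def tokenize(s):
    tokens, pos = [], 0
    while pos < len(s):
        best = None
        for name, pat in RULES:  # earlier rule wins ties (strict >)
            m = re.match(pat, s[pos:])
            if m and (best is None or len(m.group()) > len(best[1])):
                best = (name, m.group())
        if best is None:
            raise ValueError(f"no token at position {pos}")
        tokens.append(best)
        pos += len(best[1])
    return [t for t in tokens if t[0] != "WS"]  # drop whitespace tokens

print(tokenize("foo 42"))  # [('ID', 'foo'), ('NUM', '42')]
```

This sketch re-runs every rule at every position, so it is much slower in the worst case than the O(m · n) first-token bound stated below.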
Our algorithm is based on the simulation of the Thompson NFA of the given regular expressions. Two significant parts of the verification effort involve demonstrating the correctness of Thompson's algorithm and the computation of ε-closures using depth-first search. For a stream of length n and a list of regular expressions of total size m, our algorithm finds the first token in O(m · n) time and tokenizes the entire stream in O(m · n^2) time in the worst case.

Item: Language Support for Real-time Data Processing (2023-08-11). Kong, Lingkun; Mamouras, Konstantinos.
Recent technological advances are causing an enormous proliferation of streaming data, i.e., data that is generated in real time. Such data is produced at an overwhelming rate that cannot be handled by traditional means. This thesis aims to provide programming language support for real-time data processing through three approaches: (1) creating a language for specifying complex computations over real-time data streams, (2) developing a software-hardware co-design to efficiently match regular patterns in a streaming setting, and (3) designing a system for parallel stream processing that preserves sequential semantics. The first part of this thesis introduces StreamQL, a high-level language for specifying complex streaming computations through a combination of stream transformations. StreamQL integrates relational, dataflow, and temporal constructs, offering an expressive and modular approach to programming streaming computations. Performance comparisons against popular streaming engines show that the StreamQL library consistently achieves higher throughput, making it a useful tool for prototyping complex real-world streaming algorithms. The second part of this thesis focuses on hardware acceleration for regular pattern matching, specifically targeting the matching of regular expressions with bounded repetitions.
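The core idea behind counter-based matching of bounded repetitions can be sketched in software (a hand-written check for the hypothetical pattern a{2,4}b, not the hardware design): a single counter tracks the repetition count that an unrolled NFA would have to encode in separate states.

```python
# Software sketch of counter-based bounded repetition: recognize the
# language a{2,4}b with one counter instead of unrolling the repetition
# into four separate automaton states.

def match_bounded(s, lo=2, hi=4):
    """Does s belong to a^{lo..hi} b ?"""
    count, i = 0, 0
    while i < len(s) and s[i] == "a":
        count += 1
        i += 1
        if count > hi:          # exceeded the upper bound: reject early
            return False
    # after the run of a's, require lo <= count <= hi and a single 'b'
    return lo <= count <= hi and s[i:] == "b"

print(match_bounded("aaab"))    # True  (3 repetitions, within [2, 4])
print(match_bounded("ab"))      # False (1 repetition, below the bound)
print(match_bounded("aaaaab"))  # False (5 repetitions, above the bound)
```

The counter uses O(log hi) bits, whereas unrolling the repetition into explicit states costs space proportional to hi, which is the motivation for counter modules in hardware.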
A hardware architecture inspired by nondeterministic counter automata is presented, which uses counter and bit-vector modules to efficiently handle bounded repetitions. A regex-to-hardware compiler is developed in this work, which performs static analysis over regular expressions and translates them into hardware-recognizable programs. Experimental results show that our solution provides significant improvements in energy efficiency and area reduction compared to existing solutions. Finally, this thesis presents a novel programming system for parallelizing the processing of streaming data on multicore CPUs while preserving sequential semantics. This system addresses the challenges of preserving sequential semantics when dealing with identical timestamps, dynamic item rates, and non-linear task parallelism. A Rust library called ParaStream is developed to support semantics-preserving parallelism in stream processing, outperforming state-of-the-art tools in terms of single-threaded throughput and scalability. Real-world benchmarks show substantial performance gains with increasing degrees of parallelism, highlighting the practicality and efficiency of ParaStream.

Item: Static Analysis for Checking the Disambiguation Robustness of Regular Expressions (Association for Computing Machinery, 2024). Mamouras, Konstantinos; Le Glaunec, Alexis; Li, Wu Angela; Chattopadhyay, Agnishom.
Regular expressions are commonly used for finding and extracting matches from sequence data. Due to the inherent ambiguity of regular expressions, a disambiguation policy must be considered for the match extraction problem, in order to uniquely determine the desired match out of the possibly many matches. The most common disambiguation policies are the POSIX policy and the greedy (PCRE) policy. The POSIX policy chooses the longest match among the leftmost ones.
The greedy policy chooses a leftmost match and further disambiguates using a greedy interpretation of Kleene iteration to match as many times as possible. The choice of disambiguation policy can affect the output of match extraction, which can be an issue when reusing regular expressions across regex engines. In this paper, we introduce and study the notion of disambiguation robustness for regular expressions. A regular expression is robust if its extraction semantics is indifferent to whether the POSIX or the greedy disambiguation policy is chosen. This gives rise to a decision problem for regular expressions, which we prove to be PSPACE-complete. We propose a static analysis algorithm for checking the (non-)robustness of regular expressions, together with two performance optimizations. We have implemented the proposed algorithms and shown experimentally that they are practical for analyzing large datasets of regular expressions derived from various application domains.
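The two policies can disagree on concrete inputs. Python's re module follows the greedy (PCRE-style) policy, so the classic example a|ab matched against "ab" yields the shorter match, whereas a POSIX leftmost-longest engine would return the whole string:

```python
# Greedy (PCRE-style) policy: alternatives are tried left to right,
# so the branch "a" wins even though "ab" is a longer match starting
# at the same position.
import re

m = re.match(r"a|ab", "ab")
print(m.group())  # "a"; a POSIX leftmost-longest engine would match "ab"
```

A regular expression like a|ab is therefore non-robust in the sense defined above, since its extraction semantics depends on which policy the engine implements.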