Browsing by Author "Palem, Krishna V."
Now showing 1 - 8 of 8
Computing device using inexact computing architecture processor (2013-11-19)
Palem, Krishna V.; Chakrapani, Lakshmi Narasimhan; Lingamneni, Avinash; Rice University; United States Patent and Trademark Office
In general, in one aspect, the invention relates to a computer readable medium including software instructions which, when executed by a processor, perform a method, the method including receiving a first method call from an application, wherein the first method call is associated with a first application component; obtaining a first application component error tolerance (ACET) value associated with the first method call; determining, using the first ACET value and a first inexact amount value (IAV) of a first inexact computing architecture (ICA) processor, that the first ICA processor is available to execute the first method call; and processing the first method call using the first ICA processor.

Domain-driven models yield better predictions at lower cost than reservoir computers in Lorenz systems (The Royal Society, 2021)
Pyle, Ryan; Jovanovic, Nikola; Subramanian, Devika; Palem, Krishna V.; Patel, Ankit B.
Recent advances in computing algorithms and hardware have rekindled interest in developing high-accuracy, low-cost surrogate models for simulating physical systems. The idea is to replace expensive numerical integration of complex coupled partial differential equations at fine time scales performed on supercomputers, with machine-learned surrogates that efficiently and accurately forecast future system states using data sampled from the underlying system. One particularly popular technique being explored within the weather and climate modelling community is the echo state network (ESN), an attractive alternative to other well-known deep learning architectures. Using the classical Lorenz 63 system, and the three-tier multi-scale Lorenz 96 system (Thornes T, Duben P, Palmer T. 2017 Q. J. R. Meteorol. Soc. 143, 897–908.
(doi:10.1002/qj.2974)) as benchmarks, we realize that previously studied state-of-the-art ESNs operate in two distinct regimes, corresponding to low and high spectral radius (LSR/HSR) for the sparse, randomly generated, reservoir recurrence matrix. Using knowledge of the mathematical structure of the Lorenz systems along with systematic ablation and hyperparameter sensitivity analyses, we show that state-of-the-art LSR-ESNs reduce to a polynomial regression model which we call Domain-Driven Regularized Regression (D2R2). Interestingly, D2R2 is a generalization of the well-known SINDy algorithm (Brunton SL, Proctor JL, Kutz JN. 2016 Proc. Natl Acad. Sci. USA 113, 3932–3937. (doi:10.1073/pnas.1517384113)). We also show experimentally that LSR-ESNs (Chattopadhyay A, Hassanzadeh P, Subramanian D. 2019 (http://arxiv.org/abs/1906.08829)) outperform HSR-ESNs (Pathak J, Hunt B, Girvan M, Lu Z, Ott E. 2018 Phys. Rev. Lett. 120, 024102. (doi:10.1103/PhysRevLett.120.024102)) while D2R2 dominates both approaches. A significant goal in constructing surrogates is to cope with barriers to scaling in weather prediction and simulation of dynamical systems that are imposed by time and energy consumption in supercomputers. Inexact computing has emerged as a novel approach to helping with scaling. In this paper, we evaluate the performance of three models (LSR-ESN, HSR-ESN and D2R2) by varying the precision or word size of the computation as our inexactness-controlling parameter. For precisions of 64, 32 and 16 bits, we show that, surprisingly, the least expensive D2R2 method yields the most robust results and the greatest savings compared to ESNs.
Specifically, D2R2 achieves 68× in computational savings, with an additional 2× if precision reductions are also employed, outperforming ESN variants by a large margin. This article is part of the theme issue ‘Machine learning for weather and climate modelling’.

Implementing Energy Parsimonious Circuits through Inexact Designs (2011)
Lingamneni, Avinash; Palem, Krishna V.
Inexact circuits, or circuits in which accuracy of the output can be traded for cost (energy, delay and/or area) savings, have been receiving increasing attention of late due to invariable inaccuracies in nanometer-scale circuits and a concomitant growing desire for ultra low energy embedded systems. Most of the previous approaches to realize inexact circuits relied on scaling of circuit-level operational parameters (such as supply voltage) to achieve the cost and accuracy tradeoffs, and suffered from serious drawbacks of significant implementation overheads that drastically reduced the gains. In this thesis, two novel architecture-level approaches called Probabilistic Pruning and Probabilistic Logic Minimization are proposed to realize inexact circuits with zero overhead. Extensive simulations on various architectures of datapath elements and a prototype chip fabrication demonstrate that normalized gains as large as 2X-9.5X in Energy-Delay-Area product can be obtained for relative error as low as 10^-6% to 1% compared to corresponding conventional correct designs.

Influence based bit-quantization for machine learning: Cost Quality Tradeoffs (2020-10-30)
Jiang, Mingchao; Palem, Krishna V.
Due to the significant computational cost associated with machine learning architectures such as neural networks (or networks for short), there has been significant interest in quantizing or reducing the number of bits used. Current quantization approaches treat all of the network parameters equally by allocating the same bit width budget to all of them.
In this work we propose a quantization approach which allocates bit budgets to parameters preferentially based on their influence. Here, our notion of influence is inspired by the traditional definition of this concept from the Fourier analysis of Boolean functions. We show that guiding investment of bit budgets using influence can achieve acceptable accuracy with lower overall bit budgets when compared to approaches that do not use quantization. We show that by trading 4.5% in accuracy, we can gain in bit budgets by a factor of 28. To better understand our approach, we also considered allocating bit budgets through random allocations and found that our influence-based approach outperforms them most of the time by noticeable margins. All of these results are based on the MNIST data set, and our algorithm for computing influence is based on a simple and easy-to-implement greedy approach.

On the use of inexact, pruned hardware in atmospheric modelling (Royal Society, 2014)
Düben, Peter D.; Joven, Jaume; Lingamneni, Avinash; McNamara, Hugh; De Micheli, Giovanni; Palem, Krishna V.; Palmer, T.N.
Inexact hardware design, which advocates trading the accuracy of computations in exchange for significant savings in area, power and/or performance of computing hardware, has received increasing prominence in several error-tolerant application domains, particularly those involving perceptual or statistical end-users. In this paper, we evaluate inexact hardware for its applicability in weather and climate modelling. We expand previous studies on inexact techniques, in particular probabilistic pruning, to floating point arithmetic units and derive several simulated set-ups of pruned hardware with reasonable levels of error for applications in atmospheric modelling. The set-up is tested on the Lorenz ‘96 model, a toy model for atmospheric dynamics, using software emulation for the proposed hardware.
The results show that large parts of the computation tolerate the use of pruned hardware blocks without major changes in the quality of short- and long-time diagnostics, such as forecast errors and probability density functions. This could open the door to significant savings in computational cost and to higher resolution simulations with weather and climate models.

Optimization Techniques for Minimizing Energy Consumption in Approximate Circuits (2011)
Muntimadugu, Kirthi Krishna; Palem, Krishna V.
This work presents different global and local optimization techniques for designing "approximate" circuits which decrease energy consumption, one of the most important criteria in present day circuit design. The concept of "approximate" circuits, which trade off energy consumption against output quality and thus create a new dimension in the design space, is radically different from the conventional design principle in which all circuits operate correctly all the time. But efficient and intelligent designs have to be realized to tap its full potential. These techniques, which have not been explored to date, are based on a rigorous mathematical model and aim to improve the output quality of a given circuit while keeping the energy consumption to a minimum. They use the value of information and the architecture of the circuit to maximize efficiency. They have been applied to digital signal processing circuits to realize energy savings up to 2X the conventional value.

Optimizing Energy to Minimize Errors in Approximate Ripple Carry Adders (2011-07-25)
Kedem, Zvi M.; Mooney, Vincent J.; Muntimadugu, Kirthi Krishna; Palem, Krishna V.
We present a theoretical foundation and a methodology for automatically assigning supply voltages to approximate ripple carry adders in which accuracy is traded for energy consumption. The error minimization problem for a fixed energy budget is formulated as a binned geometric program.
We first use geometric programming to minimize the average error of the adder and compute the supply voltages at the gate level, after which we bin the voltages to a finite set (of four or five voltages) using a heuristic. Using HSPICE in 90 nm technology, we show simulation results by applying our methodology to a ripple carry adder and obtain savings of up to 2.58X (and by a median of 1.58X) in average error, when compared to uniform voltage scaling, for the same energy consumption. Compared to a naive biased voltage scaling (n-BIVOS), which is the best prior art in the literature, a Binned Geometric Program Solution (BGPS) as proposed in this paper saves 32.3% energy with the same PSNR in an 8-point FFT example or, alternatively, increases the PSNR by 8.5 dB for the same energy consumption for the FFT.

Realizing Ultra Energy-efficient Hardware Systems through Inexact Computing (2014-04-25)
Lingamneni, Avinash; Palem, Krishna V.; Burrus, C. Sidney; Vardi, Moshe Y.; Enz, Christian
In this dissertation, novel methodologies for designing energy-efficient hardware systems that deliver just "good-enough" results are proposed by leveraging the principles of inexact computing, wherein perceptually- or statistically-acceptable accuracy degradation is permitted in exchange for substantial hardware savings. These inexact computing systems are of particular relevance today owing to the widely acknowledged limit to the exponentially improving resource-savings sustained by Moore's law driven technology scaling as well as the emergence of large classes of workloads (in particular, embedded, multimedia and Recognition, Mining and Synthesis (RMS) applications) that could still process information usefully with unreliable or error-prone elements. This thesis proposes several inexact design methodologies to efficiently realize energy-efficient hardware systems by intentionally rendering reliable components unreliable.
These inexact systems are shown to produce "good-enough" results, judged through domain-specific quality evaluation metrics, in a wide variety of error-resilient applications, while consuming significantly fewer hardware resources, quantified through energy consumed, critical path delay and/or area occupied. The proposed inexact design techniques span several layers of design abstraction: voltage overscaling (overclocking) and gate sizing at the physical layer; inexact logic minimization at the logic layer; probabilistic pruning and compensation buddies at the architectural layer; and waveform shaping at the algorithm layer. Furthermore, a cross-layer co-design framework is presented that creates a symbiotic interaction between the techniques from different layers of abstraction to maximize the resulting energy gains for a targeted accuracy loss while overcoming the drawbacks of individual techniques; this framework uses machine-learning approaches to further enhance the cost-accuracy tradeoff gains in DSP hardware systems. The effectiveness of the proposed techniques has been validated through extensive experimental simulations and backed up by two ASIC chip fabrications: 64-bit inexact arithmetic adders in 180 nm (LP) and 256-point quality-tunable Fast Fourier Transform (FFT) accelerators in 65 nm process technology. The utility of the proposed techniques is also shown in applications from other domains including image/multimedia codecs as well as neural network accelerators, all of which can tolerate inaccuracies to varying extents and can synthesize sufficient information even from inaccurate computations.
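
As a rough illustration of the accuracy-for-cost trade described in the abstract above, the following minimal sketch emulates an adder in software that ignores the low-order bits of its operands and measures the resulting relative error. It is not the dissertation's method: simple bit truncation is swapped in here as a stand-in for hardware pruning, and the names `pruned_add` and `mean_relative_error`, the 16-bit operand width, and the uniformly random inputs are all assumptions made for this example.

```python
import random


def pruned_add(a: int, b: int, pruned_bits: int) -> int:
    """Add two non-negative integers, ignoring the lowest `pruned_bits` bits.

    Hypothetical stand-in for an adder whose low-order logic was removed;
    the real designs prune at the circuit level, not by masking operands.
    """
    mask = ~((1 << pruned_bits) - 1)  # clears the low-order bits
    return (a & mask) + (b & mask)


def mean_relative_error(pruned_bits: int, width: int = 16,
                        trials: int = 10_000, seed: int = 0) -> float:
    """Average relative error of pruned_add over random operand pairs."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(trials):
        a = rng.randrange(1, 1 << width)
        b = rng.randrange(1, 1 << width)
        exact = a + b
        total += abs(exact - pruned_add(a, b, pruned_bits)) / exact
    return total / trials


if __name__ == "__main__":
    # More pruned bits means less (emulated) hardware and more error.
    for k in (0, 2, 4, 8):
        print(f"pruned_bits={k}: mean relative error {mean_relative_error(k):.6f}")
```

Running the script prints one error figure per pruning level; under these assumptions the error is zero with no pruning and does not decrease as more bits are dropped, which is the qualitative trade-off the abstract refers to.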