Browsing by Author "Radosavljevic, Predrag"
Item 802.11b Operating in a Mobile Channel: Performance and Challenges (2003-09-20)
Steger, Christopher; Radosavljevic, Predrag; Frantz, Patrick; Center for Multimedia Communications (http://cmc.rice.edu/)
In the past, the worlds of wireless voice and data transmission have been largely disjoint. Voice traffic has been carried over circuit-switched cellular links, and data has been largely restricted to packet-switched wireless LANs. Now, as consumers demand higher-bandwidth connections without sacrificing mobility, and as traffic transitions from primarily voice to data, service providers must produce what is essentially a ubiquitous wireless LAN. To this end, we have studied the effects of a mobile channel on current-generation 802.11a, 802.11b, and 802.11g wireless LAN cards to see how readily they can be applied to more challenging environments. Not surprisingly, current WLAN technology suffers from significantly degraded performance when subjected to the rigors of a mobile channel. We created emulated bi-directional peer-to-peer links in which we were able to manipulate individual channel parameters. By isolating individual propagation effects and testing several different implementations of the standards, we have discovered which channel parameters have the most significant impact on performance. For instance, the large delay spreads typical of an outdoor channel seem to produce the most deleterious effect on throughput in 802.11b. We use our observations to evaluate the viability of direct-sequence spread-spectrum systems (similar to 802.11b) versus that of OFDM systems (like 802.11a and 802.11g).
Then we offer suggestions for how future systems should be adapted in order to manage these effects, and we project the ultimate limitations and possibilities for subsequent 802.11-like systems.

Item Architecture and Algorithm for a Stochastic Soft-output MIMO Detector (IEEE, 2007-11-04)
Amiri, Kiarash; Radosavljevic, Predrag; Cavallaro, Joseph R.; CMC
In this paper, we propose a novel architecture for a soft-output stochastic detector in multiple-input, multiple-output (MIMO) systems. The stochastic properties of this detector are studied and derived in this work, and several complexity reduction techniques are proposed to significantly reduce its cost from an architecture-implementation perspective. We also propose an efficient architecture to implement this detector. Finally, this detector is incorporated into an iterative detection-decoding structure, and through simulations, it is shown that the overall frame error rate (FER) performance and complexity are of the same order as those of the conventional K-best sphere detector.

Item Architecture and Algorithm for a Stochastic Soft-output MIMO Detector (IEEE, 2007-11-01)
Amiri, Kiarash; Radosavljevic, Predrag; Cavallaro, Joseph R.; Center for Multimedia Communication
In this paper, we propose a novel architecture for a soft-output stochastic detector in multiple-input, multiple-output (MIMO) systems. The stochastic properties of this detector are studied and derived in this work, and several complexity reduction techniques are proposed to significantly reduce its cost from an architecture-implementation perspective. We also propose an efficient architecture to implement this detector.
Finally, this detector is incorporated into an iterative detection-decoding structure, and through simulations, it is shown that the overall frame error rate (FER) performance and complexity are of the same order as those of the conventional K-best sphere detector.

Item ASIP Architecture for Future Wireless Systems: Flexibility and Customization (2004-06-01)
Cavallaro, Joseph R.; Radosavljevic, Predrag; Center for Multimedia Communications (http://cmc.rice.edu/)
Efficiency and flexibility are crucial features of the processors in the next generation of wireless cellular systems. Processors need to be efficient in order to satisfy real-time requirements for very demanding algorithms in newly emerging wireless standards (3GPP, 4G, 802.11x, WiFi, DVB-S2, DAB, just to name a few). Flexibility, on the other hand, allows design modifications to respond to the evolution of standards (from GPRS to 3G, for example), worldwide compatibility (UMTS in Europe and Asia, CDMA2000 in North America), changes in user requirements depending on the quality of service (QoS), etc. Often, efficiency and flexibility goals conflict. Efficiency is associated with more custom hardware implementations such as ASIC processors. Flexibility, on the other hand, is the basic feature of programmable platforms such as DSP processors. Although they are computationally efficient, low-power solutions, ASIC processors for wireless applications are often not flexible enough to support necessary variations of the implemented algorithms. ASIC design, especially in deep-submicron technologies, is a very complex task, and the manufacturing costs are also high. It is cheaper to write and debug software (applications written in high-level languages) than to directly design, debug, and manufacture hardware. Furthermore, there are increasing demands for products with low time-to-market, which is not a primary characteristic of ASIC design.
On the other hand, DSP processors, although fully programmable, cannot achieve high performance with low power dissipation. DSP cores are often not able to achieve the high level of instruction and data parallelism required for future generations of wireless systems.

Item ASIP Architecture Implementation of Channel Equalization Algorithms for MIMO Systems in WCDMA Downlink (2004-09-01)
Radosavljevic, Predrag; Cavallaro, Joseph R.; de Baynast, Alexandre; Center for Multimedia Communications (http://cmc.rice.edu/)
This paper presents a customized and flexible hardware implementation of linear iterative channel equalization algorithms for WCDMA downlink transmission in 3G wireless systems with multiple transmit and receive antennas (MIMO systems). Optimized (in terms of area and execution time) and power-efficient Application Specific Instruction set Processors (ASIPs) based on Transport Triggered Architecture (TTA) are designed that can operate efficiently in slow and fast fading, high-scattering environments. The instruction set of the TTA processors is extended with several user-defined operations specific to channel equalization algorithms that dramatically optimize the architecture solution for the physical layer of the mobile handset. The final results of the presented design-space exploration method are ASIP processors with a low cost/performance ratio. An automatic software-hardware co-design flow for converting C application code into a gate-level hardware design of ASIP architectures is also described. The implemented ASIP solutions meet the real-time requirements of the 3GPP wireless standard (the 1xEV-DV standard, in particular) with reasonable clock speed and power dissipation.

Item Channel Equalization Algorithms for MIMO Downlink and ASIP Architectures (2004-04-01)
Radosavljevic, Predrag; Center for Multimedia Communications (http://cmc.rice.edu/)
Processors for mobile handsets in 3G cellular systems require high speed, flexibility, and low power dissipation.
While computationally efficient, ASIC processors are often not flexible enough to support necessary variations of the implemented algorithms. On the other hand, programmable DSP processors are not optimized for a specific application and often are not able to achieve high performance with low power dissipation. As a solution, we exploit programmable architectures with the possibility of customization: Application Specific Instruction set Processors (ASIPs). Channel equalization based on the iterative Conjugate Gradient and Least Mean Square algorithms, along with several algorithmic modifications, is implemented in a MIMO context on the same ASIPs based on Transport Triggered Architecture. Customization of ASIPs is achieved by extending the instruction set with application-specific operations. An identical customized ASIP architecture can meet 3GPP real-time requirements in a broad range of channel environments and for different equalization algorithms with reasonable clock frequency and low power dissipation.

Item Channel equalization algorithms for MIMO downlink and ASIP architectures (2004)
Radosavljevic, Predrag; Cavallaro, Joseph R.
Processors for mobile handsets in 3G cellular systems require high speed, flexibility, and low power dissipation. While computationally efficient, ASIC processors are often not flexible enough to support necessary variations of the implemented algorithms. On the other hand, programmable DSP processors are not optimized for a specific application and often are not able to achieve high performance with low power dissipation. As a solution, we exploit programmable architectures with the possibility of customization: Application Specific Instruction set Processors (ASIPs). Channel equalization based on the iterative Conjugate Gradient and Least Mean Square algorithms, along with several algorithmic modifications, is implemented in a MIMO context on the same ASIPs based on Transport Triggered Architecture.
Customization of ASIPs is achieved by extending the instruction set with application-specific operations. An identical customized ASIP architecture can meet 3GPP real-time requirements in a broad range of channel environments and for different equalization algorithms with reasonable clock frequency and low power dissipation.

Item Chip level LMMSE Equalization for Downlink MIMO CDMA in fast fading environments (2004-11-01)
de Baynast, Alexandre; Radosavljevic, Predrag; Cavallaro, Joseph R.; Center for Multimedia Communications (http://cmc.rice.edu/)
In this paper, we consider linear MMSE equalization for wireless downlink transmission with multiple transmit and receive antennas in a fast fading environment. We propose a new algorithm based on the conjugate-gradient algorithm with enhanced channel estimation. In order to be robust to channel variations, the channel coefficients are estimated by using a weighted sliding window. Two methods to determine optimal weights with respect to the Doppler frequency are proposed. The algorithm has been tested in a fast fading environment (Vehicular A with a mobile station velocity of 120 km/h). We show by simulations that good performance is obtained in a correlated fast fading environment with reasonable complexity. Moreover, this method outperforms approaches based on a forgetting factor, a basic sliding window, and LMS.

Item Configurable LDPC Decoder Architecture for Regular and Irregular Codes (Springer, 2008-11-01)
Karkooti, Marjan; Radosavljevic, Predrag; Cavallaro, Joseph R.; Center for Multimedia Communication
Low Density Parity Check (LDPC) codes are among the best error correcting codes and enable future generations of wireless devices to achieve higher data rates with excellent quality of service. This paper presents two novel flexible decoder architectures. The first one supports (3, 6) regular codes of rate 1/2 that can be used for different block lengths.
The second decoder is more general and supports both regular and irregular LDPC codes with twelve combinations of code lengths (648, 1296, and 1944 bits) and code rates (1/2, 2/3, 3/4, and 5/6) based on the IEEE 802.11n standard. All codes correspond to a block-structured parity check matrix, in which the sub-blocks are either a shifted identity matrix or a zero matrix. Prototype architectures for both LDPC decoders have been implemented and tested on a Xilinx field programmable gate array.

Item Configurable, High Throughput, Irregular LDPC Decoder Architecture: Tradeoff Analysis and Implementation (2006-09-01)
Karkooti, Marjan; Radosavljevic, Predrag; Cavallaro, Joseph R.; Center for Multimedia Communications (http://cmc.rice.edu/)
As the data-rate requirements of wireless systems continue to increase, there will be a strong need to improve their performance by using more sophisticated channel coding algorithms. Low Density Parity Check (LDPC) codes are among the best error correcting codes and enable these future wireless systems to grow with the demand. This paper presents a novel flexible architecture for an irregular LDPC decoder that supports twelve combinations of code lengths (648, 1296, and 1944 bits) and code rates (1/2, 2/3, 3/4, and 5/6) based on the IEEE 802.11n standard. All the codes correspond to a block-structured parity check matrix, in which the sub-blocks are either a shifted identity matrix or a zero matrix. A prototype of the LDPC decoder has been implemented and tested on a Xilinx FPGA and has been synthesized for ASIC.

Item Frequency-Domain ICI Estimation, Shortening, and Cancellation in OFDM Receivers (2006-04-01)
Sestok, Charles K.; Radosavljevic, Predrag
Orthogonal frequency division multiplexing (OFDM) communication systems encounter performance limitations due to the time-varying channels common in wireless applications. The channel variations introduce inter-carrier interference (ICI) in the received signal.
To compensate, groups of adjacent OFDM carriers can be processed to cancel the interference. This paper considers a two-stage ICI cancellation technique. In the first stage, linear preprocessing compresses the effective ICI response. The second stage generates an MMSE estimate of the transmitted data. Simulation results for a DVB-T receiver using this technique show that frequency-domain ICI response shortening performs best when combined with the receiver channel estimator.

Item A General Hardware/Software Co-design Methodology for Embedded Signal Processing and Multimedia Workloads (IEEE, 2006-11-01)
Brogioli, Michael; Radosavljevic, Predrag; Cavallaro, Joseph R.; Center for Multimedia Communication
This paper presents a hardware/software co-design methodology for partitioning real-time embedded multimedia applications between software-programmable DSPs and hardware-based FPGA coprocessors. By following a strict set of guidelines, the input application is partitioned between software executing on a programmable DSP and a hardware-based FPGA implementation to alleviate computational bottlenecks in modern VLIW-style DSP architectures used in embedded systems. This methodology is applied to channel estimation firmware in 3.5G wireless receivers, as well as to software-based H.263 video decoders. As much as an 11x improvement in runtime performance can be achieved by partitioning performance-critical software kernels in these workloads into a hardware-based FPGA implementation executing in tandem with the existing host DSP.

Item Hardware/Software Co-design Methodology and DSP/FPGA Partitioning: A Case Study for Meeting Real-Time Processing Deadlines in 3.5G Mobile Receivers (IEEE, 2006-08-01)
Brogioli, Michael; Radosavljevic, Predrag; Cavallaro, Joseph R.; Center for Multimedia Communication
This paper presents a DSP/FPGA hardware/software partitioning methodology for signal processing workloads.
The example workload is channel equalization and user detection in the HSDPA wireless standard for 3.5G mobile handsets. Channel equalization and user detection is a major component of receiver baseband processing and requires strict adherence to real-time deadlines. By intelligently exploring the embedded design space, this paper presents a hardware/software system-on-chip partitioning that utilizes both DSP and FPGA-based coprocessors to meet and exceed the real-time data rates determined by the HSDPA standard. Hardware and software partitioning strategies are discussed with respect to real-time processing deadlines, while an SOC simulation toolset is presented as a vehicle for prototyping embedded architectures.

Item High-Throughput Multi-rate LDPC Decoder based on Architecture-Oriented Parity Check Matrices (2006-09-01)
Radosavljevic, Predrag; de Baynast, Alexandre; Karkooti, Marjan; Cavallaro, Joseph R.; Center for Multimedia Communications (http://cmc.rice.edu/)
A high-throughput pipelined LDPC decoder that supports multiple code rates and codeword sizes is proposed. In order to increase memory throughput, irregular block-structured parity-check matrices are designed with the constraint of equally distributed odd and even nonzero block-columns in each horizontal layer for the pre-determined set of code rates. The designed decoder achieves a data throughput of more than 1 Gb/s without sacrificing the error-correcting performance of capacity-approaching irregular block codes. The architecture is prototyped on an FPGA and synthesized for an ASIC design flow.

Item High-Throughput Multi-rate LDPC Decoder based on Architecture-Oriented Parity-Check Matrices (2006-02-01)
Radosavljevic, Predrag; de Baynast, Alexandre; Karkooti, Marjan; Cavallaro, Joseph R.; Center for Multimedia Communications (http://cmc.rice.edu/)
A high-throughput pipelined LDPC decoder that supports multiple code rates and codeword sizes is proposed.
In order to increase memory throughput, irregular block-structured parity-check matrices are designed with the constraint of equally distributed odd and even nonzero block-columns in each horizontal layer for a pre-determined set of code rates. The designed decoder achieves a data throughput of approximately 1 Gb/s without sacrificing the error-correcting performance of capacity-approaching irregular block codes. The prototype architecture is implemented on an FPGA.

Item Implementation of Iterative Channel Equalization for MIMO Systems in WCDMA Downlink (2003-10-20)
Radosavljevic, Predrag; Cavallaro, Joseph R.; de Baynast, Alexandre; Center for Multimedia Communications (http://cmc.rice.edu/)
In downlink transmission, channel multipaths destroy the orthogonality between users, causing Multiple Access Interference (MAI). In order to restore the orthogonality between the users, chip-level channel equalization based on the iterative conjugate-gradient (CG) algorithm has been proposed in [Heikkilla '02]. We extend this approach to the multiple transmit antenna case and propose a 16-bit fixed-point implementation. Simulations show the robustness of the fixed-point implementation of the proposed algorithm. Moreover, this approach outperforms the RAKE receiver in all cases.

Item Multi-Rate High-Throughput LDPC Decoder: Tradeoff Analysis between Decoding Throughput and Area (2006-09-01)
Radosavljevic, Predrag; de Baynast, Alexandre; Karkooti, Marjan; Cavallaro, Joseph R.; Center for Multimedia Communications (http://cmc.rice.edu/)
In order to achieve high decoding throughput (hundreds of Mbit/s and above) for multiple code rates and moderate codeword lengths (up to 2.5K bits), several decoder solutions with different levels of processing parallelism are possible. Selection between these solutions is based on a threefold criterion: hardware complexity, decoding throughput, and error-correcting performance.
In this work, we determine the multi-rate LDPC decoder architecture with the best tradeoff in terms of area, error-correcting performance, and decoding throughput. The prototype architecture of the optimal LDPC decoder is implemented on an FPGA.

Item On Turbo-Schedules for LDPC Decoding (2006-05-01)
de Baynast, Alexandre; Radosavljevic, Predrag; Sabharwal, Ashutosh; Cavallaro, Joseph R.; Center for Multimedia Communications (http://cmc.rice.edu/)
LDPC decoding converges more slowly than turbo code decoding: 25 LDPC iterations versus 8-10 iterations for turbo codes. Recently, Mansour proposed a "turbo-schedule" to improve the convergence rate of LDPC decoders. In this letter, we first extend the turbo-scheduling principle to the check messages. Second, we show analytically that the convergence rate of both turbo-schedules is about twice as fast as that of the standard message passing algorithm for most LDPC codes.

Item Optimized Message Passing Schedules for LDPC Decoding (2005-11-01)
Radosavljevic, Predrag; de Baynast, Alexandre; Cavallaro, Joseph R.; Center for Multimedia Communications (http://cmc.rice.edu/)
The major drawback of LDPC codes versus turbo-codes is their comparatively low convergence speed: 25-30 iterations vs. 8-10 iterations for turbo-codes. Recently, Hocevar showed by simulations that the convergence rate of the LDPC decoder can be accelerated by exploiting a "turbo-scheduling" applied on the bit-node messages (rows of the parity check matrix). In this paper, we show analytically that the convergence rate for this type of scheduling is roughly doubled for most regular LDPC codes. Second, we prove that "turbo-scheduling" applied on the rows of the parity check matrix is the same belief propagation algorithm as the standard message passing algorithm.
Furthermore, we propose two new message passing schedules: 1) a turbo-scheduling applied on the check-node messages (columns of the parity check matrix); 2) a hybrid version of both previous schedules where the turbo-effect is applied on both check-nodes and bit-nodes. Frame error rate simulations validate the effectiveness of the proposed schedules.

Item Optimized Message Passing Schedules for LDPC Decoding (2005-11-01)
Radosavljevic, Predrag; de Baynast, Alexandre; Cavallaro, Joseph R.; Center for Multimedia Communications (http://cmc.rice.edu/)
The major drawback of LDPC codes versus turbo-codes is their comparatively low convergence speed: 25-30 iterations vs. 8-10 iterations for turbo-codes. Recently, Hocevar showed by simulations that the convergence rate of the LDPC decoder can be accelerated by exploiting a "turbo-scheduling" applied on the bit-node messages (rows of the parity check matrix). In this paper, we show analytically that the convergence rate for this type of scheduling is roughly doubled for most regular LDPC codes. Second, we prove that "turbo-scheduling" applied on the rows of the parity check matrix is the same belief propagation algorithm as the standard message passing algorithm. Furthermore, we propose two new message passing schedules: 1) a turbo-scheduling applied on the check-node messages (columns of the parity check matrix); 2) a hybrid version of both previous schedules where the turbo-effect is applied on both check-nodes and bit-nodes. Frame error rate simulations validate the effectiveness of the proposed schedules.