GPU Accelerated Reconfigurable Detector and Precoder for Massive MIMO SDR Systems
We present a reconfigurable, GPU-based unified detector and precoder for massive MIMO software-defined radio (SDR) systems. To enable high throughput, we implement the linear minimum mean square error (LMMSE) detector/precoder and further reduce its complexity through numerical approximation without sacrificing error-rate performance. For an efficient GPU implementation, we exploit the algorithm's inherent parallelism and use the GPU's many computing cores and hierarchical memories to optimize the kernel computations. We further perform multi-stream scheduling and multi-GPU workload deployment, pipelining multiple detection or precoding tasks on GPU streams to reduce host-device memory copy overhead. The flexible design supports both detection and precoding and can switch between a Cholesky-based mode and a conjugate-gradient-based mode to trade off accuracy against complexity. The GPU implementation exceeds 250 Mb/s detection and precoding throughput for a 128x16 antenna system.
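As an illustration of the two solver modes named in the abstract, the following is a minimal CPU-side sketch of LMMSE detection, x = (H^H H + sigma2 I)^{-1} H^H y, solved once with a Cholesky factorization and once with a fixed number of conjugate gradient iterations. It is not the thesis's GPU code: the function names, the pure-Python matrix representation, the unit-energy QPSK symbols, and the noise model are all assumptions made for this example, and precoding is left out.

```python
import random

def random_system(B, U, sigma2, seed=0):
    """Hypothetical test setup: B base-station antennas, U users, QPSK symbols."""
    rng = random.Random(seed)
    g = lambda std: complex(rng.gauss(0, std), rng.gauss(0, std))
    H = [[g(0.5 ** 0.5) for _ in range(U)] for _ in range(B)]
    s = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / 2 ** 0.5 for _ in range(U)]
    n_std = (sigma2 / 2) ** 0.5
    y = [sum(H[b][u] * s[u] for u in range(U)) + g(n_std) for b in range(B)]
    return H, y, s, sigma2

def gram_and_matched_filter(H, y, sigma2):
    """Form A = H^H H + sigma2*I and z = H^H y (assumes unit symbol energy)."""
    B, U = len(H), len(H[0])
    A = [[sum(H[b][i].conjugate() * H[b][j] for b in range(B))
          + (sigma2 if i == j else 0.0) for j in range(U)] for i in range(U)]
    z = [sum(H[b][i].conjugate() * y[b] for b in range(B)) for i in range(U)]
    return A, z

def solve_cholesky(A, z):
    """Solve A x = z exactly via A = L L^H, then forward/backward substitution."""
    n = len(A)
    L = [[0j] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j].real - sum(abs(L[j][k]) ** 2 for k in range(j))
        L[j][j] = complex(d ** 0.5)
        for i in range(j + 1, n):
            s = A[i][j] - sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            L[i][j] = s / L[j][j]
    w = [0j] * n
    for i in range(n):  # forward substitution: L w = z
        w[i] = (z[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    x = [0j] * n
    for i in reversed(range(n)):  # backward substitution: L^H x = w
        x[i] = (w[i] - sum(L[k][i].conjugate() * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def solve_cg(A, z, iters):
    """Approximate A x = z with a fixed number of conjugate gradient iterations."""
    n = len(A)
    x = [0j] * n
    r = list(z)  # residual for the zero initial guess
    p = list(z)  # search direction
    rr = sum(abs(v) ** 2 for v in r)
    for _ in range(iters):
        if rr == 0.0:
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rr / sum((p[i].conjugate() * Ap[i]).real for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rr_new = sum(abs(v) ** 2 for v in r)
        p = [r[i] + (rr_new / rr) * p[i] for i in range(n)]
        rr = rr_new
    return x

if __name__ == "__main__":
    H, y, s, sigma2 = random_system(128, 16, 0.01, seed=0)
    A, z = gram_and_matched_filter(H, y, sigma2)
    x_chol = solve_cholesky(A, z)
    for iters in (1, 2, 4):
        x_cg = solve_cg(A, z, iters)
        print(iters, max(abs(a - b) for a, b in zip(x_chol, x_cg)))
```

Running the script prints the largest gap between the conjugate gradient estimate and the Cholesky solution for a few iteration counts, which is one way to see the accuracy and complexity tradeoff: fewer iterations cost less but leave a larger residual error.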
Li, Kaipeng. "GPU Accelerated Reconfigurable Detector and Precoder for Massive MIMO SDR Systems." (2015) Master’s Thesis, Rice University. https://hdl.handle.net/1911/88088.