Design of a Single-Precision Floating-Point Square Root Unit for Use in a Hardware Ray Tracer
Abstract
A unit for calculating the square root and inverse square root of 32-bit floating-point numbers was designed in SystemVerilog. The unit is intended to be incorporated into an existing hardware ray tracer design, where it replaces software approximations of these functions. The resulting unit computes both functions to within the two least significant bits of the result, corresponding to a maximum relative error of 2.38e-7. The maximum operating frequency of the square root unit was measured at 106 MHz on the target FPGA, with a latency of 6 cycles per operation and a throughput of one operation per cycle. When compared with four other square root architectures, the proposed design had a lower latency than the designs it was compared against.
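The error bound and timing figures quoted above can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the design itself: it assumes that "two least significant bits" refers to two units in the last place of the 23-bit binary32 fraction, taken at the worst case where the significand equals 1.0, and it uses a correctly rounded software square root only as a stand-in for a result that meets the bound. The names `to_f32`, `relative_error`, `bound`, and `latency_ns` are illustrative and do not come from the original work.

```python
import math
import struct

MANTISSA_BITS = 23  # explicit fraction bits in IEEE 754 binary32


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def relative_error(approx: float, exact: float) -> float:
    """Relative error of approx with respect to a nonzero exact value."""
    return abs(approx - exact) / abs(exact)


# Assumption: an error confined to the two least significant fraction bits
# is read as 2 ulp; one ulp is at most 2^-23 relative to the value, so the
# worst-case relative error is 2 * 2^-23 = 2^-22.
bound = 2 * 2.0 ** -MANTISSA_BITS

# Timing figures taken from the abstract.
f_max_hz = 106e6      # maximum measured clock frequency
latency_cycles = 6    # cycles from input to result
latency_ns = latency_cycles / f_max_hz * 1e9
throughput_mops = f_max_hz / 1e6  # one result per cycle when pipelined

if __name__ == "__main__":
    print(f"relative error bound: {bound:.2e}")      # 2.38e-07
    print(f"latency at f_max:     {latency_ns:.1f} ns")  # 56.6 ns
    print(f"peak throughput:      {throughput_mops:.0f} Mop/s")

    # A correctly rounded binary32 square root sits well inside the bound.
    exact = math.sqrt(2.0)
    print(relative_error(to_f32(exact), exact) <= bound)  # True
```

Under these assumptions the bound evaluates to 2^-22, which matches the 2.38e-7 figure, and the 6-cycle latency at 106 MHz corresponds to roughly 56.6 ns per operation.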