In a previous post, we discussed hardware implementation for array sorting using comparators. However, it is also possible to sort arrays without comparators.
Surajeet Ghosh, Shaon Dasgupta and Sanchita Saha Ray presented a novel solution in their paper “A Comparison-Free Hardware Sorting Engine”. We will cover the high-level idea and a few implementation details of their solution.
The proposed architecture receives unsorted array data from an Unsorted Memory (UM), detects the largest element in each clock cycle, and stores it in a Sorted Memory (SM). Assuming the array has N entries, both UM and SM have N entries, and this solution can complete sorting in N clock cycles / iterations.

Assuming each element in the array has n bits, the hardware sorting engine consists of n cascaded blocks. In fact, each block is filtering the smaller elements from the array, forwarding the larger elements to the next block for further filtering, and finally deciding the Largest Element (LE) among the participating elements.

Each of these blocks consists of N basic cells. The internal structures of a block and a basic cell are shown below.

At the start of each clock cycle / iteration, the Element Vector Table (EVT) indicates the data elements yet to be sorted. The 1st block takes two inputs, one from UM, and the other from EVT. All other blocks take corresponding data bits from UM, and previous block’s outputs as inputs. In the final block, the index to the LE is identified. It is possible to have duplicated values in the unsorted array, we need to have a masking logic or an arbiter in the Largest Element Detector (LED), to get the unique index to the LE in each iteration.
At the end of each clock cycle / iteration, EVT is updated by the Final Output (FO) of its previous iteration. This is to exclude the current iteration’s LE from the next iteration.
The following diagram shows an illustrative example from the paper. The example shows the sorting process of 5 data elements: 13, 4, 6, 1, 10. Note that, the 1st block uses MSBs of the data elements, while the last block uses LSBs of the data elements.

We encourage interviewees to go through this example by yourself.
Reference
- “A Comparison-Free Hardware Sorting Engine” by Surajeet Ghosh, Shaon Dasgupta and Sanchita Saha Ray

Leave a comment