Result: O(n) key–value sort with active compute memory

Title:
O(n) key–value sort with active compute memory
Contributors:
Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. PM - Programming Models
Publisher Information:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Year:
2024
Collection:
Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Document Type:
Academic journal: article in journal/newspaper
File Description:
16 p.; application/pdf
Language:
English
Relation:
https://ieeexplore.ieee.org/abstract/document/10454132; info:eu-repo/grantAgreement/EC/H2020/955606/EU/DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES/DEEP-SEA; info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PCI2021-121958/ES/DEEP-SOFTWARE FOR EXASCALE ARCHITECTURES/; info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-107255GB-C21/ES/BSC - COMPUTACION DE ALTAS PRESTACIONES VIII/; http://hdl.handle.net/2117/403907
DOI:
10.1109/TC.2024.3371773
Rights:
Open Access
Accession Number:
edsbas.4AD3C034
Database:
BASE

Further information

We propose the Active Compute Memory (ACM), a near-memory-processing architecture capable of performing key–value sort directly in the DRAM. In the ACM architecture, sort is merely the writing of data into memory with one addressing protocol (perspective) and reading it back with a different perspective. The first perspective is conventional, based on the data address; the second perspective is the sorted order. The ACM requires additional tables to store the metadata and moderate control logic enhancements that can be implemented directly in the DRAM silicon. With these modest enhancements to DRAM, ACM exploits the parallelism inherently available in the row buffer to enable sort with O(n) complexity. This yields an order-of-magnitude improvement in performance and energy over conventional O(n log n) CPU-centric sort algorithms. The ACM also outperforms other near-memory sort accelerators, because its processing is done near the row buffer and therefore benefits from much lower memory access latency, higher bandwidth and wider parallel processing. The sort operation covered in this paper is just one example of an address management operation that can be efficiently implemented directly in the DRAM silicon. We release the simulation infrastructure for ACM performance and energy modeling as open source. We encourage the community to use it, adapt it to other PIM proposals, and share their own evaluations.

This work was funded by the Collaboration Agreement between Micron Technology, Inc. and BSC. The work was also supported by the Spanish Government, under the contracts PID2019-107255GB-C21 and CEX2021-001148-S funded by MCIN/AEI/10.13039/501100011033. The work also received funding from the Department of Research and Universities of the Government of Catalonia to the AccMem Research Group (Code: 2021 SGR 00807) and project DEEP-SEA funded by European Union’s Horizon 2020 research and innovation programme, under Grant Agreement no. 955606 and the Spanish ...
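
The two-perspective idea in the abstract (write by address, read back in sorted order) can be illustrated with a small software analogy. This is only a sketch under stated assumptions, not the ACM hardware design: the class `TwoPerspectiveMemory`, its method names, and the `_order` list standing in for the metadata tables are all hypothetical. In particular, the sequential insertion used here does not reproduce the O(n) bound, which the abstract attributes to parallelism in the DRAM row buffer.

```python
import bisect


class TwoPerspectiveMemory:
    """Toy software model (hypothetical): write by address, read by sorted rank."""

    def __init__(self):
        # Conventional perspective: address -> (key, value).
        self._data = {}
        # Stand-in for the metadata tables: (key, address) pairs kept in key order.
        self._order = []

    def write(self, address, key, value):
        # Overwriting an address first drops its old entry from the order table.
        if address in self._data:
            old_key, _ = self._data[address]
            self._order.remove((old_key, address))
        self._data[address] = (key, value)
        # Sequential sorted insert; the paper's hardware is described as doing
        # the equivalent work in parallel near the row buffer.
        bisect.insort(self._order, (key, address))

    def read_by_address(self, address):
        # First perspective: ordinary address-based read.
        return self._data[address]

    def read_sorted(self):
        # Second perspective: the same data returned in key order.
        return [self._data[address] for _, address in self._order]


mem = TwoPerspectiveMemory()
for address, (key, value) in enumerate([(42, "c"), (7, "a"), (19, "b")]):
    mem.write(address, key, value)

print(mem.read_by_address(0))  # address-based view of the first write
print(mem.read_sorted())       # sorted-order view of all writes
```

In this sketch no separate sort call exists: ordering is maintained as a side effect of writing, and reading through the second perspective simply walks the order table.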