Dvé: Improving DRAM Reliability and Performance On-Demand via Coherent Replication

► Appears in International Symposium on Computer Architecture - ISCA 2021
► Artifact available on GitHub

What is Dvé?

Dvé is a memory system architecture to improve the reliability and performance of DRAM main memory.

Features and Properties:
What does it provide?

◉ higher memory reliability than any commercially available high-reliability memory product, such as Chipkill ECC, Intel Partial Address Space Mirroring, and IBM RAIM
◉ improves memory performance in a multi-socket NUMA architecture (lower memory access latency and higher bandwidth)
◉ provides on-demand replication for programs at runtime by using the idle memory present in systems
◉ strict recovery semantics and strongly consistent replication

How does it achieve it?
▣ Replicates memory on two different sockets of a multi-socket NUMA system
▣ Uses cache coherence protocols (allow-based and deny-based) to provide "Coherent Replication"
▣ Builds on existing reliability mechanisms and coherence protocols for graceful degradation and fallback on failure
▣ Maps data in DRAM between replicas in a thermal-risk-aware manner to reduce failures
▣ Sets out the application interface and OS mechanisms required for providing replicated memory at runtime
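
To make the replication idea above concrete, here is a minimal software sketch of its read/write behaviour. It is a toy model, not the hardware design from the paper: the class `ReplicatedMemory`, its methods, and the fault-injection helper are hypothetical names introduced for illustration, and the cache coherence protocol itself is abstracted away.

```python
class ReplicatedMemory:
    """Toy model: every address is stored on two NUMA nodes (hypothetical names)."""

    def __init__(self, num_nodes=2):
        # One dictionary per node stands in for that node's DRAM.
        self.nodes = [dict() for _ in range(num_nodes)]
        # (node, addr) pairs on which an uncorrectable error was detected.
        self.faulty = set()

    def write(self, addr, value):
        # Every write updates both replicas so that they stay in sync.
        for store in self.nodes:
            store[addr] = value

    def inject_fault(self, node, addr):
        # Illustration only: mark one replica of an address as failed.
        self.faulty.add((node, addr))

    def read(self, local_node, addr):
        # Prefer the local replica; fall back to the other one on an error.
        others = [n for n in range(len(self.nodes)) if n != local_node]
        for node in [local_node] + others:
            if (node, addr) not in self.faulty:
                return self.nodes[node][addr], node
        raise RuntimeError("both replicas failed")


mem = ReplicatedMemory()
mem.write(0x10, 7)
value, served_by = mem.read(local_node=0, addr=0x10)   # served locally
mem.inject_fault(node=0, addr=0x10)
value2, served_by2 = mem.read(local_node=0, addr=0x10)  # recovered from node 1
```

In this sketch a read is served by the local node when its copy is healthy, which is where the latency benefit comes from, and by the other node when the local copy has failed, which is where the reliability benefit comes from.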

Frequently asked questions (FAQs):
◈ Is it practical to sacrifice 50% of the memory capacity? How do you ensure enough memory capacity is available for replication (say if a workload is already using >50% memory)?
A: Firstly, Dvé can be deployed on demand at a finer granularity (per application, per VM, per container), so it is not always a static 50% capacity overhead. Secondly, we allow the control plane (such as the workload placement infrastructure) to decide whether to enable or disable replication depending on the needs of the workload. Thirdly, the OS can use balloon drivers to reclaim memory and create space for replication. It then monitors the page fault rate to decide whether replication should be disabled when memory pressure is high. Even if replication is disabled, Dvé still falls back to the baseline reliability provided by the underlying system.
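
The page-fault-driven policy in this answer could look roughly like the sketch below. This is an assumption-laden illustration rather than the OS mechanism from the paper: the function name, the two watermark parameters, and the rule for re-enabling replication once pressure drops are all hypothetical.

```python
def replication_decision(fault_rate, high_watermark, low_watermark, enabled):
    """Return whether replication should be enabled for the next interval.

    fault_rate      -- observed page faults per second (hypothetical unit)
    high_watermark  -- above this rate, memory pressure is considered high
    low_watermark   -- below this rate, there is assumed to be room to replicate
    enabled         -- whether replication is currently on
    """
    if enabled and fault_rate > high_watermark:
        # High memory pressure: give the replica space back to the workload.
        return False
    if not enabled and fault_rate < low_watermark:
        # Assumption: replication is turned back on once pressure subsides.
        return True
    # Between the watermarks, keep the current setting to avoid flapping.
    return enabled
```

Using two thresholds instead of one is a design choice made here to keep the setting from toggling on every interval; the source only states that the page fault rate is monitored to decide when to disable replication.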

◈ What is the main purpose of Dvé? Is it reliability or performance?
A: Dvé's main purpose is to guarantee reliability. It keeps both replicas in sync to ensure recovery from errors. The performance benefit comes from coherent replication; it is workload dependent and best effort. Prior DRAM reliability techniques caused performance degradation, whereas Dvé improves performance while providing reliability.

◈ How does Dvé work if it's a single-socket machine?
A: Even modern single-socket machines have a chiplet architecture (as in AMD Ryzen) or a quadrant mode (as in Xeon Phi), and they also have multiple memory controllers. Dvé can be employed even in such systems.
Upcoming CXL-based memories are also exposed as remote NUMA nodes. Dvé's design can be applied in such systems to provide high-availability remote memory.

◈ How are the power/energy metrics affected because of the additional DRAMs needed for the replica?
A: Firstly, Dvé uses under-utilized memory that is already provisioned but idle in the system. But even if we add memory, Dvé, counterintuitively, ends up saving energy. This is because memory accounts for on the order of 20% of total system power, and Dvé's speedup reduces execution time; the overall system energy-delay product (EDP) is therefore lower with Dvé.
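
The energy argument can be checked with a back-of-the-envelope calculation. The sketch below uses the roughly 20% memory power share quoted above and makes two simplifying assumptions that are not from the paper: adding the replica doubles memory power (a pessimistic case), and the rest of the system draws the same power as before. The function names and the example speedup values are hypothetical.

```python
import math


def relative_edp(speedup, memory_power_fraction=0.20):
    """EDP relative to a non-replicated baseline (baseline == 1.0)."""
    # Pessimistic assumption: replica memory adds another full memory share.
    power = 1.0 + memory_power_fraction
    delay = 1.0 / speedup          # execution time shrinks with speedup
    energy = power * delay         # energy = power x time
    return energy * delay          # EDP = energy x delay


def break_even_speedup(memory_power_fraction=0.20):
    """Speedup at which the relative EDP equals 1.0 under the same assumptions."""
    return math.sqrt(1.0 + memory_power_fraction)


no_speedup = relative_edp(1.0)     # power overhead only, no benefit
with_speedup = relative_edp(1.2)   # illustrative 1.2x speedup
threshold = break_even_speedup()
```

Because delay enters the EDP twice, the relative EDP is (1 + f) / s² for memory power share f and speedup s. Under these assumptions, any speedup above the square root of 1.2, roughly 1.10x, yields a lower EDP than the baseline even with the extra memory.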

◈ Can you comment on the performance overheads due to additional writes (because all memory writes have to be replicated to both NUMA nodes)? Does this cause any performance degradation? How is the inter-socket interconnect affected by this additional traffic?
A: Yes, Dvé does have to replicate all writes to both NUMA nodes, and this does add some extra requests to the interconnect. But if we look at the overall interconnect traffic, Dvé reduces the total number of requests over the interconnect. This is because many of the reads that would have crossed the interconnect are now serviced from local memory. This is one of the reasons for Dvé's improved performance.
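
A toy request count illustrates why the total can go down. This is a simplified model, not the paper's evaluation: it assumes that with replication every read is served by the local replica and every write is forwarded exactly once to the remote replica, and it ignores coherence messages. The function name and the access counts are hypothetical.

```python
def interconnect_requests(reads, writes, remote_fraction, replicated):
    """Count requests that cross the inter-socket link in a toy model."""
    if replicated:
        # Assumption: reads hit the local replica, and each write is
        # forwarded once to the replica on the other socket.
        return writes
    # Baseline: a remote_fraction share of all accesses targets remote memory.
    return remote_fraction * (reads + writes)


# Illustrative workload: 800 reads, 200 writes, half of accesses remote.
baseline = interconnect_requests(800, 200, 0.5, replicated=False)
with_replication = interconnect_requests(800, 200, 0.5, replicated=True)
```

In this model replication reduces traffic whenever the fraction of accesses that are writes is smaller than the fraction of accesses that would otherwise be remote, which is the case for the read-heavy example above.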

◈ Specifically, what differentiates Dvé from Intel's memory mirroring scheme, which also does replication for reliability?
A: Intel mirroring differs in all three regards: reliability, performance, and the on-demand nature of the scheme. Intel replicates data across two channels of the same memory controller, while Dvé replicates across two different memory controllers. Dvé is therefore able to keep the replicas farther apart, in completely disjoint subsystems, and hence provides better reliability. In the Intel scheme, the replica is passive: it is not used to service reads or writes. Even if we hypothetically allow for such active, load-balanced replication, it can only provide a 2-3% performance improvement. Dvé improves performance by mitigating the inter-socket interconnect latency and so provides a much larger gain. Finally, in the Intel scheme the amount of memory replicated is fixed at boot time and can only be used for kernel allocations. Dvé, however, can provide replicated memory on demand at runtime and allows applications to allocate data in the high-reliability region of memory.

◈ What are the area overheads for the replica directory and how is this structure architected?
A: In our simulations we use a 2K-entry SRAM structure backed by a reserved area of memory to hold overflow entries. The replica directory otherwise has design points similar to those of a traditional directory; for example, invalidations can be sent to the owner/sharers when entries are evicted from the replica directory.
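
The "small SRAM structure with spill to reserved memory" organization can be sketched in software as a bounded table with an overflow area. This is only an illustration under stated assumptions: the class name, the LRU eviction policy, and the opaque `state` value are hypothetical, and the alternative of sending invalidations on eviction is not modelled.

```python
from collections import OrderedDict


class ReplicaDirectory:
    """Bounded directory whose overflow entries spill to a backing store."""

    def __init__(self, capacity=2048):
        self.capacity = capacity
        self.sram = OrderedDict()  # on-chip entries, kept in LRU order
        self.spill = {}            # stands in for the reserved memory region

    def update(self, addr, state):
        # Drop any stale spilled copy, then install the entry on-chip.
        self.spill.pop(addr, None)
        self.sram[addr] = state
        self.sram.move_to_end(addr)
        if len(self.sram) > self.capacity:
            # Assumed policy: evict the least recently used entry to memory.
            victim, victim_state = self.sram.popitem(last=False)
            self.spill[victim] = victim_state

    def lookup(self, addr):
        if addr in self.sram:
            self.sram.move_to_end(addr)
            return self.sram[addr], "sram"
        if addr in self.spill:
            return self.spill[addr], "spill"
        return None, "miss"


directory = ReplicaDirectory(capacity=2)   # tiny capacity for demonstration
directory.update(1, "a")
directory.update(2, "b")
directory.update(3, "c")                   # entry for address 1 spills
```

A lookup that lands in the spill area would cost a memory access in a real design, so the capacity of the on-chip structure trades area against how often that slower path is taken.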

◈ How does Dvé compare to the NUMA latency mitigation line of works like Carrefour/Shoal?
A: Carrefour/Shoal do not provide any reliability guarantees. They replicate read-only pages between NUMA nodes for performance, in software (using OS support). Dvé, in contrast, has to replicate all pages irrespective of their sharing characteristics to guarantee reliability, and it keeps the replicas in sync using hardware cache coherence.

◈ Can you clarify precisely what OS support is required for Dvé's replication to work?
A: OS needs to create space for replication.
OS needs to map replica page pairs.
OS decides when replication should be enabled/disabled.
Full details are in Section V.D of the paper.

Dvé Trivia:
◈ Dvé is inspired by distributed system deployments where replication is frequently employed for fault tolerance and performance. We bring this insight into shared memory architecture.
◈ Dvé's philosophy is rooted in the holistic design approach, leveraging the time-tested "end-to-end argument". We advocate exposing faults in DRAM memory to the highest-level end point of memory (i.e., the memory controller level), thereby subsuming all other types of errors.
◈ The title of the project - Dvé - is derived from the Sanskrit word (द्वे) which means "the two", referring here to the dual benefits of replication.
◈ Dvé can be simply summarized as
replicate dram
* (Dvé was almost entirely done in the COVID-19 lockdowns; this is a homage to gov.uk 3-word campaigns like "Stay Home->Protect the NHS->Save Lives")