In-Persistent Memory Computing with Java*

How Open-Source Libraries Make it Easy to Integrate Persistent Memory into your Applications

Despite the evolution of computers since the 1960s, hard disks have consistently remained the most conventional and viable way to store large amounts of data. But although they offer large capacity and durability, they have several shortcomings. Hard disks have low bandwidth and high latency, increasing the amount of time the processor has to wait for data to be transferred from disk to DRAM. Also, hard disks store data as a stream of bytes, leading to additional overhead from serialization and deserialization. As a result, disk IO presents a major challenge for many resource-intensive software applications that require data persistence, such as databases.

The ideal solution would be to store the data in memory, but DRAM can’t provide sufficient capacity and durability at an affordable cost. Persistent memory answers this challenge by combining the best of both worlds (Figure 1).

Figure 1 – Storage versus memory

What is Persistent Memory?

Persistent memory offers memory-like performance while providing durability and storage capacity in the terabyte range. Data remains persistent across machine reboots, and applications can access it directly from user mode, which removes the kernel IO stack from the data path.

Developers program persistent memory like regular memory, yet the data persists like storage. The Persistent Memory Development Kit (PMDK) is a collection of open-source C/C++ libraries for writing applications that take advantage of persistent memory. The Persistent Collections for Java* (PCJ) library extends this capability to Java-based applications such as Apache Cassandra*, Apache Spark*, and Apache Ignite*.
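
To make "program it like memory, persist like storage" concrete, here is a minimal sketch that uses only the standard Java library. A memory-mapped file stands in for a persistent memory region; this is an analogy, not the PMDK or PCJ API, and the class name MappedCounter and its methods are hypothetical. The counter is updated with ordinary loads and stores on the mapped buffer, and the value is still there the next time the file is mapped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustration only: a memory-mapped file stands in for a persistent memory region. */
final class MappedCounter {
    private static final int COUNTER_OFFSET = 0; // byte offset of the 8-byte counter
    private static final int REGION_SIZE = 8;    // bytes mapped into the address space

    /** Increments the counter stored in the file and returns the new value. */
    static long increment(Path file) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            if (channel.size() < REGION_SIZE) {
                // First use: size the file so the mapped range is fully backed by it.
                channel.write(ByteBuffer.allocate(REGION_SIZE), 0);
            }
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_WRITE, 0, REGION_SIZE);
            long next = region.getLong(COUNTER_OFFSET) + 1; // plain load from the mapping
            region.putLong(COUNTER_OFFSET, next);           // plain store into the mapping
            region.force();                                 // request that the update reach the storage device
            return next;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Simulates two separate sessions that share nothing but the file. */
    static long[] demo() {
        try {
            Path file = Files.createTempFile("mapped-counter", ".bin");
            file.toFile().deleteOnExit();
            return new long[] { increment(file), increment(file) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each call to increment maps the file afresh, so the second call sees the value written by the first. On a conventional disk the mapping still goes through the page cache; the point of the sketch is the programming model, not the performance.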

Persistent Collections for Java (PCJ): A Library for Persistent Memory Java Programming

The PCJ library is an open-source pilot project that enables Java developers to design or retrofit their applications around persistent memory without having to worry about disk IO limitations. Figure 2 shows the implementation stack. The PCJ library offers persistent classes that can be used to create persistent objects similar to regular Java objects. These objects are stored on the persistent heap and persist across JVM sessions and machine reboots. They have a reachability-based lifetime: they are not garbage-collected until they become unreachable. Since data is stored directly in object layout form within persistent memory, no serialization or deserialization is necessary. The PCJ library also exposes APIs for defining custom persistent classes in much the same way as regular Java classes. Figure 3 shows a nonexhaustive list of built-in persistent classes in the PCJ library.

Figure 2 – PCJ implementation stack

Figure 3 – Persistent classes

Programming with PCJ

Figure 4 shows a code snippet demonstrating how to use the PCJ library to allocate and store data in persistent memory.

Figure 4 – Data persistence after reboot

Figure 5 shows a regular Java class, followed by its persistent version (Figure 6).

Figure 5 – Regular Java class

Figure 6 – Persistent class
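
The contrast between a regular class and a persistent one can be sketched with the standard library alone. The example below is not the PCJ API and does not reproduce Figures 5 and 6. It assumes a ByteBuffer as a stand-in for the persistent heap, and the names Employee and RegionBackedEmployee are hypothetical. The regular class keeps its state in Java fields; the persistent-style class keeps no state of its own and reads and writes each field at a fixed offset in the region.

```java
import java.nio.ByteBuffer;

/** Regular class: state lives in Java fields on the volatile heap. */
final class Employee {
    private final long id;
    private int age;

    Employee(long id, int age) { this.id = id; this.age = age; }
    long getId() { return id; }
    int getAge() { return age; }
    void setAge(int age) { this.age = age; }
}

/** Persistent-style counterpart: every field lives at a fixed offset in a region. */
final class RegionBackedEmployee {
    private static final int ID_OFFSET = 0;  // 8-byte long
    private static final int AGE_OFFSET = 8; // 4-byte int
    static final int SIZE = 12;              // bytes occupied by one record

    private final ByteBuffer region; // stand-in for the persistent heap
    private final int base;          // where this record starts in the region

    private RegionBackedEmployee(ByteBuffer region, int base) {
        this.region = region;
        this.base = base;
    }

    /** Writes a new record at base and returns a handle to it. */
    static RegionBackedEmployee create(ByteBuffer region, int base, long id, int age) {
        RegionBackedEmployee e = new RegionBackedEmployee(region, base);
        region.putLong(base + ID_OFFSET, id);
        e.setAge(age);
        return e;
    }

    /** Returns a handle to a record that already exists at base; nothing is copied or decoded. */
    static RegionBackedEmployee attach(ByteBuffer region, int base) {
        return new RegionBackedEmployee(region, base);
    }

    long getId() { return region.getLong(base + ID_OFFSET); }
    int getAge() { return region.getInt(base + AGE_OFFSET); }
    void setAge(int age) { region.putInt(base + AGE_OFFSET, age); }
}
```

Because the getters and setters operate on the region directly, re-attaching to a record requires no deserialization step, which is the property the article attributes to objects stored in object layout form.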

The PCJ library also provides support for transactions. All methods in the PCJ library are transactional: changes to persistent memory happen completely or not at all. To extend this guarantee across multiple method calls, wrap them in a transaction (Figure 7).
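
The all-or-nothing guarantee can be illustrated without the library. The sketch below is an assumption-laden stand-in, not how PCJ implements transactions: it snapshots a small ByteBuffer before running the body and restores the snapshot if the body throws. The names AllOrNothing, run, and transfer are hypothetical. A production design would typically log only the modified ranges and make the log itself durable so that rollback also works after a crash.

```java
import java.nio.ByteBuffer;

/** Illustration only: all-or-nothing updates to a region via snapshot and restore. */
final class AllOrNothing {

    /** Runs body; if it throws, the region is restored to its contents before the call. */
    static void run(ByteBuffer region, Runnable body) {
        byte[] before = new byte[region.capacity()];
        ByteBuffer reader = region.duplicate(); // independent position, shared content
        reader.clear();
        reader.get(before);
        try {
            body.run();
        } catch (RuntimeException e) {
            ByteBuffer writer = region.duplicate();
            writer.clear();
            writer.put(before); // undo every partial change made by body
            throw e;
        }
    }

    /** Moves amount between two 8-byte balances; both change or neither does. */
    static void transfer(ByteBuffer accounts, int from, int to, long amount) {
        run(accounts, () -> {
            accounts.putLong(from, accounts.getLong(from) - amount);
            if (accounts.getLong(from) < 0) {
                throw new IllegalStateException("insufficient funds");
            }
            accounts.putLong(to, accounts.getLong(to) + amount);
        });
    }
}
```

In transfer, the debit is written before the balance check fails, so the rollback is what keeps the two balances consistent. Grouping several calls inside one run body is the analogue of wrapping multiple method calls in a single transaction.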

Figure 7 – Transactions with the PCJ

The PCJ library enables Java developers to write high-performance applications that manipulate large amounts of persistent data in a natural way. While new applications can include PCJ functionality in the design process, existing applications must be retrofitted to use persistent memory.

Low-Level Persistent Library (LLPL)

The PCJ library also offers separate low-level access to arbitrary regions of persistent memory, giving Java developers greater flexibility to create their own data abstractions. Figure 8 shows the implementation stack of LLPL. A heap API offers allocation and deallocation of MemoryRegions. The MemoryRegion interface provides setters and getters to access persistent memory. Three kinds of MemoryRegions are provided:

  1. MemoryRegion<Raw>: Suitable for volatile use or caller-provided data consistency.
  2. MemoryRegion<Flushable>: Includes fail-safe flush() and isFlushed() methods.
  3. MemoryRegion<Transactional>: Writes are transactional.

LLPL lets developers choose whichever implementation best fits their requirements and also retrofit their existing applications at a low level.
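
The difference between the raw and flushable kinds can be sketched as follows. This is not the LLPL API: the interface SketchRegion and the classes RawRegion and FlushableRegion are hypothetical, a ByteBuffer stands in for persistent memory, and the flushed flag here is an ordinary volatile-heap field rather than the fail-safe state the article describes. The transactional kind is omitted for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;

/** Hypothetical minimal region interface: getters and setters at byte offsets. */
interface SketchRegion {
    long getLong(int offset);
    void putLong(int offset, long value);
}

/** Raw flavor: plain loads and stores; the caller is responsible for consistency. */
final class RawRegion implements SketchRegion {
    private final ByteBuffer bytes;

    RawRegion(ByteBuffer bytes) { this.bytes = bytes; }

    @Override public long getLong(int offset) { return bytes.getLong(offset); }
    @Override public void putLong(int offset, long value) { bytes.putLong(offset, value); }
}

/** Flushable flavor: remembers whether any write is still waiting to be flushed. */
final class FlushableRegion implements SketchRegion {
    private final ByteBuffer bytes;
    private boolean flushed = true; // nothing written yet, so nothing is pending

    FlushableRegion(ByteBuffer bytes) { this.bytes = bytes; }

    @Override public long getLong(int offset) { return bytes.getLong(offset); }

    @Override public void putLong(int offset, long value) {
        flushed = false; // mark as pending before the store happens
        bytes.putLong(offset, value);
    }

    /** Pushes pending writes toward durable media when the buffer is file-backed. */
    void flush() {
        if (bytes instanceof MappedByteBuffer) {
            ((MappedByteBuffer) bytes).force();
        }
        flushed = true;
    }

    boolean isFlushed() { return flushed; }
}
```

A caller that only needs scratch space would pick the raw flavor and pay nothing for bookkeeping, while a caller that must know whether its last write is durable would pick the flushable one.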

Figure 8 – LLPL implementation stack

In-Persistent-Memory Enabled Apache Cassandra

From Wikipedia: “Apache Cassandra is a free and open-source distributed NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.” It’s used by several major Web services (Figure 9).

Figure 9 – Apache Cassandra* users

Currently, Cassandra uses hard disks or SSDs for data storage. To improve performance, it employs several optimization mechanisms including:

  • Caches
  • Index/offsets
  • Data summary
  • Bloom filter

The primary goal of these optimizations is to reduce the number of disk IOs, since a large number of disk IOs can significantly hamper throughput and increase latency under a heavy load. Persistent memory eliminates the disk storage requirement. Intel developed a proof-of-concept version of Cassandra to demonstrate how the overall software stack can be greatly simplified by using the PCJ library. With this library, data can be stored in an object layout format directly in persistent memory.

The existing underlying storage mechanism in Cassandra is based on the log-structured merge-tree (LSM tree) data structure that employs multiple levels to organize the data in the form of key-value pairs. The data at each level is sorted on keys and migrated across these levels using merge sort. Each key has a row associated with it. Figure 10 demonstrates a simple version of an LSM tree with two layers in the context of Cassandra. The Memtable (an in-memory buffer to hold the data until it’s full and eventually gets flushed to disk) at level-0 is stored in the DRAM, where the data is sorted by keys. Once the Memtable has reached its configurable size threshold, the data is flushed to disk in the form of sorted string tables (SSTables, an on-disk immutable file that gets generated from the Memtable flush) at level-1.

Figure 10 – Existing Cassandra storage mechanism

SSTables are immutable by nature, so multiple SSTables can have different versions of a row as a result of multiple writes to the same key. The write operations are efficient in this scheme, since the client is returned a response as soon as the data is written to the Memtable without the need to wait for flushing to the on-disk level-1 storage. However, the gradual increase in the number of SSTables holding multiple versions of the same row can lead to high read latencies, since multiple on-disk SSTables will need to be read to generate the latest, accurate response for a client. Also, it may lead to inefficient disk utilization, since the stale pieces of data continue to reside on the disk due to immutability.
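
The write and read paths described above can be sketched in a few lines of Java. This is a toy model under stated assumptions, not Cassandra code: the class TinyLsmStore and its methods are hypothetical, rows are plain strings, and in-memory immutable maps stand in for on-disk SSTables.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy two-level LSM store: one mutable Memtable plus a list of immutable SSTables. */
final class TinyLsmStore {
    private final int memtableLimit; // number of keys that triggers a flush
    private NavigableMap<String, String> memtable = new TreeMap<>();
    private final List<NavigableMap<String, String>> sstables = new ArrayList<>(); // oldest first

    TinyLsmStore(int memtableLimit) { this.memtableLimit = memtableLimit; }

    /** Writes return as soon as the Memtable is updated. */
    void put(String key, String row) {
        memtable.put(key, row);
        if (memtable.size() >= memtableLimit) {
            flush();
        }
    }

    /** Freezes the sorted Memtable as a new SSTable and starts an empty Memtable. */
    private void flush() {
        sstables.add(Collections.unmodifiableNavigableMap(memtable));
        memtable = new TreeMap<>();
    }

    /** Reads check the Memtable first, then SSTables from newest to oldest. */
    String get(String key) {
        String row = memtable.get(key);
        if (row != null) {
            return row;
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            row = sstables.get(i).get(key);
            if (row != null) {
                return row;
            }
        }
        return null;
    }

    int sstableCount() { return sstables.size(); }

    /** Counts how many SSTables hold some version of the key. */
    int versionsOnDisk(String key) {
        int count = 0;
        for (NavigableMap<String, String> table : sstables) {
            if (table.containsKey(key)) {
                count++;
            }
        }
        return count;
    }
}
```

Writing the same key before and after a flush leaves two versions in two SSTables, which is why a read may have to consult several tables and why stale data keeps occupying space.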

To mitigate the eventual drop in read performance, and to reclaim disk space, Cassandra uses a background process known as Compaction. The primary job of Compaction is to merge one or more SSTables into a single new one (Figure 11). It merges keys by combining columns and retaining the latest copy of the data. It also discards tombstones (a special value written for a key to indicate that it has been deleted), which reduces the overall size of the data in the new SSTable. By recurrently running Compaction, both the read latency and disk capacity are regulated in Cassandra. However, Compaction can still be a CPU-intensive task that can often result in unpredictable latency spikes.

Figure 11 – Compaction process
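
The core of the merge step can be sketched as follows. The sketch assumes rows are opaque strings, so it keeps the newest row per key rather than combining columns, and it uses a hypothetical TOMBSTONE marker string. CompactionSketch is not Cassandra's implementation, which also has to decide when a tombstone is safe to drop.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Illustration only: merge several SSTables into one, newest version wins. */
final class CompactionSketch {
    static final String TOMBSTONE = "<deleted>"; // hypothetical marker for a deleted key

    /** Tables must be ordered oldest first so that later tables overwrite earlier ones. */
    static NavigableMap<String, String> compact(List<? extends Map<String, String>> oldestFirst) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (Map<String, String> table : oldestFirst) {
            merged.putAll(table); // a newer row replaces any older row for the same key
        }
        merged.values().removeIf(TOMBSTONE::equals); // drop keys whose latest version is a delete
        return Collections.unmodifiableNavigableMap(merged);
    }
}
```

The result is a single sorted, immutable table that holds one version per live key, so subsequent reads touch fewer tables and the space held by stale rows is released.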

The persistent-memory-enabled Cassandra prototype developed using the PCJ library allows a mutable data structure (PMTable) to be stored in persistent memory. In contrast to the serialized on-disk SSTables, the mutable PMTables can store data directly in the form of objects and fields (Figure 12). This provides several benefits. Because PMTables are mutable, data is written or updated in place and only the latest version of a row is retained, so read latency remains consistent. Since the data is stored in an object layout format, there is no serialization and deserialization overhead. The keys can be used to look up the data in the PMTable that resides in persistent memory without any disk IO overhead. This scheme also removes the overhead of Compaction, since no data merging is required due to in-place updates. Disk-based optimization features (e.g., bloom filter, index, and caches) are no longer required, which simplifies the read path and delivers more predictable outlier latencies. The current prototype is limited in terms of the features it offers. Intel is working with the Apache community to open-source this prototype and make it fully functional.

Figure 12 – PMTable (Persistent Memory Table)
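
The effect of in-place updates can be shown with a small sketch. It assumes a standard ConcurrentSkipListMap as a volatile stand-in for the persistent, mutable table; InPlaceTable is a hypothetical name and does not reflect the prototype's actual data structure.

```java
import java.util.concurrent.ConcurrentSkipListMap;

/** Illustration only: a mutable sorted table where each key has exactly one row. */
final class InPlaceTable {
    private final ConcurrentSkipListMap<String, String> rows = new ConcurrentSkipListMap<>();

    /** An update overwrites the existing row, so no older version is left behind. */
    void put(String key, String row) { rows.put(key, row); }

    /** A delete removes the row immediately; no tombstone is needed. */
    void delete(String key) { rows.remove(key); }

    /** A read is a single lookup, however many times the key was written. */
    String get(String key) { return rows.get(key); }

    int storedRows() { return rows.size(); }
}
```

Since the table never accumulates multiple versions of a row, there is nothing for a background merge to reconcile, which is the reason the article gives for dropping Compaction.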

In short, the open-source PCJ library provides easy-to-use persistent classes that allow Java developers to integrate persistent memory into their applications.
