An Overview of MemoryAM’s Storage Engine

The storage engine that MemoryAM uses is extremely simple, supporting the basics of reading from and writing to a table. It does include basic transaction isolation.

The storage engine is written in C++ to simplify hash tables, vectors, and other in-memory storage mechanisms. This is not completely necessary, but it does make the code easier to read.

Basic Principles

We store tables as temporary tables in memory for a single PostgreSQL connection/process. This allows for us to not need to worry about syncronizing between multiple backends, nor worry about WAL, disk storage, or full isolation.

The storage engine attempts to gather what information it needs, so that calls into PostgreSQL itself are minimized. It is likely that any calls back into PostgreSQL can be eliminated completely to completely decouple all storage from compute.

Storing

The top-level storage is a Database class. It is responsible for Table creation and deletion, as well as applying any changes when a transaction is committed or rolled back.

A Table consists of a unique identifier, a list of columns, RowMetadata, a list of columns, and a list of transactional changes for transaction isolation and MVCC support.

MVCC

The storage engine supports basic MVCC by storing RowMetadata that consists of the minimum transaction and maximum transaction for each row. This provides some basic visibility to allow for multiple versions of the same row to exist, but to isolate these at a transaction level.

Transactions

Transactions are isolated by creating a list of transactions by transaction ID for each Table. These lists are separated into insertion lists and deletion lists. Upon disposition of the transaction (commit or rollback), these lists are iterated and any transactional changes are either discarded or committed.