Recent Posts

Developing a Document Database and Testing it on 147 Million Reddit Comments

Posted on May 23, 2020.

We can now store arbitrarily complex data in our key/value store using newly developed support for semi-structured data (e.g., JSON). Our latest project, Document Storage Engine for Semi-Structured Data, generalizes our partitioned, log-structured merge-tree (LSMTree) key/value store to use generic and extendable serialization formats. Further, we… (read more)

Adding Durability Using Write-Ahead Logging

Posted on May 14, 2020.

Our most recent project addresses data durability, i.e., can we preserve data if the process or host crashes? The previously developed partitioned, log-structured merge-tree key/value store would lose data in such an event, since it collects writes in memory and persists them to disk in batches. (read more)
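To make the idea concrete, here is a minimal sketch of write-ahead logging, not the implementation from the post. The class name WalStore, the tab-separated line format, and the restriction that keys and values contain no tabs or newlines are all assumptions made for illustration. Each write is appended to a log file and synced to disk before it is applied to the in-memory table, so replaying the log after a crash rebuilds the writes that had not yet been batch persisted.

```python
import os


class WalStore:
    """Hypothetical in-memory key/value store guarded by a write-ahead log."""

    def __init__(self, path):
        self.path = path
        self.memtable = {}
        # Recovery: replay every logged write, in order, into memory.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:
                    line = line.rstrip("\n")
                    if "\t" not in line:
                        continue  # skip blank or unparseable lines
                    key, value = line.split("\t", 1)
                    self.memtable[key] = value
        self.log = open(path, "a", encoding="utf-8")

    def put(self, key, value):
        # Assumption: key and value are strings without tabs or newlines.
        # Log first, force the bytes to disk, and only then update memory.
        self.log.write(f"{key}\t{value}\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.memtable[key] = value

    def get(self, key):
        return self.memtable.get(key)

    def close(self):
        self.log.close()
```

A more careful log would also add checksums or length prefixes to detect a partially written final record, and would truncate the log once the in-memory writes have been persisted to disk.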

Developing Partitioned Storage and Analyzing Time-Dependent Benchmarks

Posted on May 05, 2020.

Our key/value storage engine continues to mature with the addition of partitioning. Partitioning (i.e., sharding) consists of splitting a database into separate sub-databases (i.e., partitions) that each store a portion of the full data. Each partition can then act with some level of independence. (read more)
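As a rough illustration of the splitting described above, the following sketch routes each key to one of several independent sub-stores by hashing the key. It is an assumption-laden toy, not the partitioning scheme from the post: the names partition_for and PartitionedStore are hypothetical, plain dictionaries stand in for the real per-partition storage engines, and CRC32 is just one convenient stable hash.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    # A stable hash of the key decides which partition owns it.
    return zlib.crc32(key) % num_partitions


class PartitionedStore:
    """Hypothetical store that spreads keys across independent partitions."""

    def __init__(self, num_partitions=4):
        # Each partition is a separate sub-database (a dict in this sketch).
        self.partitions = [dict() for _ in range(num_partitions)]

    def _partition(self, key: bytes) -> dict:
        return self.partitions[partition_for(key, len(self.partitions))]

    def put(self, key: bytes, value):
        self._partition(key)[key] = value

    def get(self, key: bytes):
        return self._partition(key).get(key)
```

Because a given key always hashes to the same partition, reads and writes for different partitions never touch each other's data, which is what lets each partition act with some independence.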

What We're Building Towards (For Now…)

Posted on May 01, 2020.

I imagine that many of you joining me on this journey through database development would like to know where we're headed. While I don't have a formal plan, nor do I have a final destination in mind, I do have an intermediate goal. I want us to push key/value stores to their limits and in doing so understand the motivation for other types of… (read more)

Developing a Log-Structured Merge-Tree for Persistent Reads and Writes

Posted on April 30, 2020.

We now have a disk-based, key/value storage engine that supports random writes in addition to reads. This is an exciting milestone in our journey through database development! To address the unique challenges associated with random writes, we explore the Log-Structured Merge-Tree (LSMTree) data structure. You can learn all about the LSMTree and… (read more)

Aspiring to Write More Unit Tests

Posted on April 27, 2020.

In the initial DB From Zero projects, I developed few, if any, unit tests for correctness. Tests were only developed to expedite debugging of issues encountered in benchmarking. I simply didn't feel tests were necessary for these exploratory and educational projects. (read more)