I imagine that many of you joining me on this journey through database development would like to know where we're headed. While I don't have a formal plan, nor do I have a final destination in mind, I do have an intermediate goal. I want us to push key/value stores to their limits and, in doing so, understand the motivation for other types of databases, such as relational databases and columnar stores.
Initially, we'll be covering some standard aspects of a general key/value store. For example, we've already explored the following topics.
Next, I plan for us to explore and develop these additional standard components of a key/value store.
Yet, I also want us to experiment with support for features not commonly associated with a key/value store, including:
Yes, such features are rarely supported by key/value stores and are instead more commonly associated with relational databases. In developing support for these features, I believe we'll start to understand some of the deficiencies of a key/value store and the motivation for different types of databases. Further, we'll get to explore some of the algorithms and components used by relational databases and see how they can be adapted to meet the needs of our novel and increasingly sophisticated key/value stores.
From there, we can decide whether we want to build a proper relational database or possibly go in a different direction on our journey through database development.
Additionally, I hope we can regularly extend and refine our benchmarking techniques so as to best quantify the different aspects of databases and the workloads they support. For example, I'm currently figuring out how best to simulate the workload experienced by a key/value store that serves as the data store for a website. Measurements can include the distribution of request latencies, which would show whether any of our hypothetical users would have a particularly bad experience, in the form of long wait times, due to certain design and configuration decisions of the key/value store.
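To make the latency-distribution idea concrete, here is a minimal sketch of what such a measurement could look like. Everything in it is an assumption made for illustration: a plain Python `dict` stands in for the key/value store, the `user:N` key names, the 90% read fraction, and the Zipf-like key popularity are hypothetical choices, and `percentile` and `run_benchmark` are names I made up here rather than part of any existing benchmark.

```python
import math
import random
import time


def percentile(sorted_values, fraction):
    """Nearest-rank percentile of an ascending list; fraction is in [0.0, 1.0]."""
    if not sorted_values:
        raise ValueError("no samples to summarize")
    rank = max(1, math.ceil(fraction * len(sorted_values)))
    return sorted_values[rank - 1]


def run_benchmark(store, n_keys=1000, n_ops=10000, read_fraction=0.9, seed=0):
    """Issue a skewed mix of reads and writes and summarize per-request latency.

    `store` is any dict-like object (hypothetical stand-in for a real store).
    """
    rng = random.Random(seed)
    keys = [f"user:{i}" for i in range(n_keys)]

    # Assumption: a few keys are much more popular than the rest (Zipf-like).
    weights = [1.0 / (rank + 1) for rank in range(n_keys)]

    # Preload every key so that reads always find a value.
    for key in keys:
        store[key] = "initial"

    latencies = []
    for key in rng.choices(keys, weights=weights, k=n_ops):
        is_read = rng.random() < read_fraction
        start = time.perf_counter()
        if is_read:
            store.get(key)
        else:
            store[key] = "updated"
        latencies.append(time.perf_counter() - start)

    latencies.sort()
    return {
        "p50": percentile(latencies, 0.50),
        "p99": percentile(latencies, 0.99),
        "max": latencies[-1],
    }


if __name__ == "__main__":
    summary = run_benchmark({})
    for name, seconds in summary.items():
        print(f"{name}: {seconds * 1e6:.2f} microseconds")
```

The reason to report the 99th percentile and the maximum alongside the median is that the slowest requests are the ones an unlucky user actually notices, and the median alone hides them. With an in-memory `dict` the numbers will be tiny and noisy; the shape of the measurement is the point, not the values.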
We can also take diversions from developing databases to instead study and benchmark existing databases. I've personally spent some time reading through the PostgreSQL and LevelDB source code, and I think it would be interesting for us to learn how robust databases address certain challenges, including how they implement their solutions in code.
So that's the plan for now. Let me know if you have suggestions about projects that we should explore or feedback on the current work at matthew.hagy@gmail.com.