The database is stored as a folder where each file represents a single collection. Collection files are used in append-only mode, which ensures safe access to the data but can waste space as you update records: every update appends a new version instead of rewriting the old one. As a workaround, the database automatically compacts a collection file when its overuse ratio grows above a configured limit.
MongoDB uses memory mapping, which leverages the Linux kernel page cache and is a near-perfect solution. TingoDB also needs a cache, because without one things would be too slow: every query pulls data twice, once during the actual search and again when the results are fetched. So TingoDB has a size-limited in-memory cache that uses the fastest possible replacement strategy. We tried more sophisticated options such as LRU, but they were not effective.
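To illustrate why a dumb replacement strategy can beat LRU here, consider this sketch of a size-limited cache that simply drops everything when full. The strategy itself is an assumption for illustration (the text does not specify it); the point is that hits and evictions cost O(1) with no bookkeeping, whereas LRU must update access order on every hit.

```javascript
// Illustrative size-limited cache with the cheapest possible replacement:
// when full, drop everything. Not TingoDB's actual implementation.
class DumbCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.map = new Map();
  }
  get(key) {
    return this.map.get(key); // no access-order bookkeeping on hits
  }
  set(key, value) {
    if (this.map.size >= this.maxSize && !this.map.has(key)) {
      this.map.clear();       // wholesale eviction: fast and simple
    }
    this.map.set(key, value);
  }
}
```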
Indexes are represented by in-memory B-tree lists. They are wrapped in high-level objects that implement MongoDB-specific behaviour (sparse indexes, unique constraints, and so on). Indexes are not serialized; they are recreated every time the database is loaded. This is not the most efficient approach, but for the initial design goals it is more than enough.
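A rebuild-on-load index with MongoDB-style `sparse` and `unique` options can be sketched like this. A sorted array stands in for the B-tree, and the function name and shape are illustrative assumptions, not TingoDB's actual code.

```javascript
// Sketch: rebuild an in-memory index for one field by scanning all docs,
// as would happen each time the database is loaded.
function buildIndex(docs, field, { unique = false, sparse = false } = {}) {
  const entries = []; // [key, _id] pairs; sorted array in place of a B-tree
  for (const doc of docs) {
    const key = doc[field];
    if (key === undefined && sparse) continue; // sparse skips missing fields
    entries.push([key, doc._id]);
  }
  entries.sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
  if (unique) {
    for (let i = 1; i < entries.length; i++) {
      if (entries[i][0] === entries[i - 1][0]) {
        throw new Error('duplicate key: ' + entries[i][0]);
      }
    }
  }
  return entries;
}
```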
Search & Sorting
Search uses indexes when possible. We did not make any particularly smart optimizations, so this code is relatively simple. Still, indexes are used by almost all operators, and a single query can use multiple indexes. The cursor limit option will always speed up queries. The skip option works as well, but it is most effective when the query uses only indexes.
Sorting will use existing indexes when possible, even if they are not part of the query itself. When no suitable index exists, sorting still works by building a dynamic index.
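The fallback can be pictured like this: if a prebuilt index on the sort field exists it is walked directly; otherwise a throwaway ("dynamic") index is built for just this query. The function and the `[key, position]` index shape are illustrative assumptions, not TingoDB internals.

```javascript
// Sketch: index-backed sorting with a dynamic-index fallback.
function sortDocs(docs, field, indexes = {}) {
  let index = indexes[field]; // prebuilt index: sorted [key, position] pairs
  if (!index) {
    // Dynamic index: built on demand for this query, then discarded.
    index = docs.map((d, i) => [d[field], i])
                .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
  }
  return index.map(([, i]) => docs[i]); // walk the index in order
}
```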
Every collection has its own work queue. Read-only operations are non-blocking and executed in parallel. Write operations are blocking and executed in sequence: a write waits for all in-flight read-only operations to complete and blocks the collection until it finishes. A search operation returns a cursor that is consistent with the data as it was in the database at the moment of the query; updates that happen after the query was executed will not be visible to cursor consumers.
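The scheduling rules above (parallel reads, exclusive sequential writes) can be sketched with promises. The class name and structure are illustrative assumptions, not TingoDB's actual work-queue code.

```javascript
// Sketch of a per-collection work queue: reads run concurrently, a write
// waits for all in-flight reads plus the previous write, and reads
// scheduled after a write wait for that write to finish.
class WorkQueue {
  constructor() {
    this.tail = Promise.resolve(); // barrier left by the last write
    this.reads = new Set();        // in-flight read operations
  }
  read(fn) {
    // A read waits only for pending writes, then runs in parallel.
    const p = this.tail.then(fn);
    this.reads.add(p);
    p.finally(() => this.reads.delete(p));
    return p;
  }
  write(fn) {
    // A write waits for the previous write AND every in-flight read.
    const barrier = Promise.allSettled([this.tail, ...this.reads]);
    this.tail = barrier.then(fn);
    return this.tail;
  }
}
```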
Based on our benchmarks, integer keys speed things up a lot, and for an in-process database integers have almost no drawbacks compared to GUIDs. So by default TingoDB uses its own ObjectID implementation, which generates integer keys that are unique within a collection's scope. The ObjectID API and behaviour are designed to be closely compatible with BSON.ObjectID, and with some small hacks it is possible to write code that works transparently with both. If you prefer BSON.ObjectID, you can enable it via a configuration option.
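The idea of an integer key behind a BSON.ObjectID-like surface can be sketched as below. This is an assumption-laden illustration: the class name is hypothetical, the counter is global here for brevity (the real keys are unique per collection), and only the `toString`/`equals` surface is shown.

```javascript
// Sketch: collection-scoped integer ObjectID with a BSON-like surface.
let counter = 0; // per-collection in the real database; global here

class IntObjectID {
  constructor(id) {
    this.id = id === undefined ? ++counter : id; // allocate next integer
  }
  toString() { return String(this.id); }
  // Compare against another IntObjectID or a raw value, by string form,
  // which is what lets code treat both ObjectID flavours transparently.
  equals(other) { return String(this.id) === String(other.id ?? other); }
}
```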
So far we have nearly three hundred tests that give us 95% code coverage. A significant part of the tests is taken as-is from the MongoDB Node.js driver project. The tests are designed to run against both MongoDB and TingoDB to ensure exactly the same behaviour, and the same tests are run twice, with BSON.ObjectID and with TingoDB.ObjectID, to ensure the two work identically.