From: Dhruba Borthakur
Date: Thu, 27 Dec 2012 02:03:34 +0000 (-0800)
Subject: Every level can have overlapping files.
X-Git-Url: http://git-server-git.apps.pok.os.sepia.ceph.com/?a=commitdiff_plain;h=899840a07ce4f51e899d59bbb85f7cfc1cf22688;p=rocksdb.git

Every level can have overlapping files.

Summary:
Leveldb has high write amplification because one file from level n is
compacted with all overlapping files in level n+1. This method of
compaction reduces read amplification (because there is only one file
to inspect per level), but the write amplification is high.

Another option is to compact multiple files from the same level and
push them into a new file in level n+1. This means that there will be
overlapping files in each level, so each read request might have to
inspect multiple files at each level. This could increase read
amplification but should reduce write amplification. This is called
the "Hybrid" mode of operations.

This patch introduces the "Hybrid" mode of operations (this deserves a
better name?). In the Hybrid mode, all levels can have overlapping
files. The number of files in a level determines whether compaction is
needed or not. Files in higher levels are larger than files in lower
levels. The default option is to have files of size 10MB in L1, 100MB
in L2, 1000MB in L3, and so on. If the number of files in any level
exceeds 10, then that level is a target for compaction. A compaction
process takes many files from level n and produces a single file in
level n+1. The number of files picked for a single compaction run is
limited by the size of the output file to be produced at the next
higher level.

This patch was produced by a two-full-day Christmas-Hack of 2012.

Test Plan:
All unit tests pass. This patch switches on the Hybrid mode by default
so that all unit tests run with the Hybrid mode turned on. At time of
commit, I will switch off the Hybrid mode.
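The compaction rule described in the summary can be sketched roughly as follows. This is only an illustration of the stated defaults (10MB/100MB/1000MB target sizes, a 10-file trigger), not the code in this patch: `FileInfo`, `TargetFileSize`, `NeedsCompaction`, and `PickInputs` are hypothetical names, and picking files in list order is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for per-file metadata; not the type used in the patch.
struct FileInfo {
  uint64_t number;      // file number
  uint64_t size_bytes;  // on-disk size
};

constexpr uint64_t kMB = 1024 * 1024;
constexpr std::size_t kMaxFilesPerLevel = 10;  // trigger stated in the summary

// Target file size per level: 10MB in L1, 100MB in L2, 1000MB in L3, ...
// Treating level 0 like L1 is an assumption made for this sketch.
uint64_t TargetFileSize(int level) {
  assert(level >= 0);
  uint64_t size = 10 * kMB;
  for (int i = 1; i < level; ++i) size *= 10;
  return size;
}

// A level is a compaction target once it holds more than 10 files.
bool NeedsCompaction(const std::vector<FileInfo>& files) {
  return files.size() > kMaxFilesPerLevel;
}

// Picks input files from `level` (in list order, an assumption) until adding
// another file would push the combined size past the target size of a single
// output file in level + 1. At least one file is picked if the level qualifies.
std::vector<FileInfo> PickInputs(const std::vector<FileInfo>& files, int level) {
  std::vector<FileInfo> picked;
  if (!NeedsCompaction(files)) return picked;
  const uint64_t limit = TargetFileSize(level + 1);
  uint64_t total = 0;
  for (const FileInfo& f : files) {
    if (!picked.empty() && total + f.size_bytes > limit) break;
    picked.push_back(f);
    total += f.size_bytes;
  }
  return picked;
}
```

With eleven 10MB files in L1, this sketch would pick ten of them, since their combined 100MB matches the L2 target file size.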
Differential Revision: https://reviews.facebook.net/D7647
---

diff --git a/db/version_set.cc b/db/version_set.cc
index 11d62753..efb49dc2 100644
--- a/db/version_set.cc
+++ b/db/version_set.cc
@@ -366,7 +366,7 @@ Status Version::Get(const ReadOptions& options,
       files = &tmp[0];
       num_files = tmp.size();
     }
-    assert(num_files <= 1 || level == 1 || vset_->IsHybrid());
+    assert(num_files <= 1 || level == 0 || vset_->IsHybrid());
     Saver saver;
     saver.ucmp = ucmp;