Igor Canadi [Tue, 2 Sep 2014 20:29:05 +0000 (13:29 -0700)]
Ignore missing column families
Summary:
Before this diff, whenever we Write to non-existing column family, Write() would fail.
This diff adds an option to not fail a Write() when WriteBatch points to non-existing column family. MongoDB said this would be useful for them, since they might have a transaction updating an index that was dropped by another thread. This way, they don't have to worry about checking if all indexes are alive on every write. They don't care if they lose writes to dropped index.
Torrie Fischer [Fri, 22 Aug 2014 22:28:58 +0000 (15:28 -0700)]
Refactor PerfStepTimer to stop on destruct
This eliminates the need to remember to call PERF_TIMER_STOP when a section has
been timed. This allows more useful design with the perf timers and enables
possible return value optimizations. Simplistic example:
class Foo {
public:
Foo(int v) : m_v(v);
private:
int m_v;
}
int main(int argc, char[] argv)
{
Foo f;
int errno;
f = bar(&errno);
if (errno)
return -1;
return 0;
}
After bar() is called, perf_context.some_timer would be incremented as if
Stop(&perf_context.some_timer) was called at the end, and the compiler is still
able to produce optimizations on the return value from makeFrobbedFoo() through
to main().
Igor Canadi [Tue, 2 Sep 2014 15:34:54 +0000 (08:34 -0700)]
Don't let flush preempt compaction in certain cases
Summary:
I have an application configured with 16 background threads. Write rates are high. L0->L1 compactions is very slow and it limits the concurrency of the system. While it's happening, other 15 threads are idle. However, when there is a need of a flush, that one thread busy with L0->L1 is doing flush, instead of any other 15 threads that are just sitting there.
This diff prevents that. If there are threads that are idle, we don't let flush preempt compaction.
Lei Jin [Sat, 30 Aug 2014 04:21:49 +0000 (21:21 -0700)]
limit max bytes that can be read/written per pread/write syscall
Summary:
BlockBasedTable sst file size can grow to a large size when universal
compaction is used. When index block exceeds 2G, pread seems to fail and
return truncated data and causes "trucated block" error. I tried to use
```
#define _FILE_OFFSET_BITS 64
```
But the problem still persists. Splitting a big write/read into smaller
batches seems to solve the problem.
Test Plan:
successfully compacted a case with resulting sst file at ~90G (2.1G
index block size)
Implementing a cache friendly version of Cuckoo Hash
Summary: This implements a cache friendly version of Cuckoo Hash in which, in case of collission, we try to insert in next few locations. The size of the neighborhood to check is taken as an input parameter in builder and stored in the table.
Test Plan:
make check all
cuckoo_table_{db,reader,builder}_test
Igor Canadi [Thu, 28 Aug 2014 17:06:28 +0000 (13:06 -0400)]
Don't let other compactions run when manual compaction runs
Summary:
Based on discussions from t4982833. This is just a short-term fix, I plan to revamp manual compaction process as part of t4982812.
Also, I think we should schedule automatic compactions at the very end of manual compactions, not when we're done with one level. I made that change as part of this diff. Let me know if you disagree.