David Zafman [Thu, 31 May 2018 00:18:03 +0000 (17:18 -0700)]
osd: Handle omap and data digests independently
Caused by: be078c8b7b131764caa28bc44452b8c5c2339623
The original attempt above to fix the omap_digest handling when
data_digest isn't present had 2 errors. First, it checked
is_data_digest() and is_omap_digest() instead of digest_present and
omap_digest_present which indicate the source digest is available.
Second, MAYBE could only be set if both digests are available.
Fixes: http://tracker.ceph.com/issues/24366
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 01f9669928abd571e14421a51a749d44fa041337)
Jan Fajerski [Tue, 29 May 2018 12:29:41 +0000 (14:29 +0200)]
cmake: fix cython target in test/CMakeFile.txt
The cython target is called cython_modules in python2 environments and
cython3_modules in python3 environments. Reflect that naming in
src/test/CMakeFile.txt. Otherwise the test target cannot be built in
python3 environments.
Casey Bodley [Tue, 8 May 2018 18:22:42 +0000 (14:22 -0400)]
cmake: move crypto_plugins target
The crypto_plugins target was defined in
src/crypto/isa-l/CMakeLists.txt, but that file is only included
if(HAVE_INTEL AND HAVE_BETTER_YASM_ELF64 AND (NOT APPLE)).
Moving it out of the if() block allows the os target to depend on it
even if no plugins are built.
Kefu Chai [Mon, 28 May 2018 11:37:44 +0000 (19:37 +0800)]
qa: wait longer for osd to flush pg stats
pg sends pg-stats to mgr every 5 seconds, so we cannot check the
number of pgs right after creating the pool. At that moment the number
of pgs could be 0, so manager.wait_for_clean() returns right away and
leaves us with 0 pgs while the pgs serving the pool are still being
created. That is why `manager.get_num_active_clean()` sometimes returns
`0`. So we should force the osds to flush their stats to mgr, and wait
until the pg stats converge.
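The flush-then-wait approach described above can be sketched as a small polling helper. This is only an illustration, not the qa task code: `flush_stats` and `get_num_pgs` are hypothetical callables standing in for whatever the test harness uses to make the osds flush their stats and to read the pg count known to mgr, and the timeout values are arbitrary.

```python
import time


def wait_for_pg_stats(flush_stats, get_num_pgs, expected, timeout=30.0, interval=0.5):
    """Flush stats once, then poll until the reported pg count reaches `expected`.

    flush_stats and get_num_pgs are hypothetical callables (assumptions),
    not functions from the Ceph qa suite.
    """
    flush_stats()  # push stats now instead of waiting for the periodic report
    deadline = time.monotonic() + timeout
    while True:
        if get_num_pgs() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False  # stats never converged within the timeout
        time.sleep(interval)
```

Only after such a helper returns True would a check like wait_for_clean() be meaningful, because the pg count it inspects is no longer the stale value 0.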
Kanika Murarka [Sun, 27 May 2018 18:17:20 +0000 (18:17 +0000)]
mgr/dashboard: Fixes documentation link- to open in new tab
Adds 'target' attribute to open link in new tab.
Fixes: https://tracker.ceph.com/issues/24288
Sage Weil [Sat, 26 May 2018 13:38:09 +0000 (08:38 -0500)]
Merge PR #22226 into mimic
* refs/pull/22226/head:
tests/crypto: print compile warning when NSS is unavailable.
tests/crypto: add tests for the no-bl encrypt/decrypt, part 2.
tests/crypto: add tests for the no-bl encrypt/decrypt.
auth: use OpenSSL for CryptoAESKeyHandler's no-bl encrypt/decrypt.
auth: extend CryptoKey with no-bl encrypt/decrypt.
auth: CryptoAESKeyHandler switches from NSS to OpenSSL.
auth: the outbuf of AES should be multiple of block size
auth: cache the PK11Context for CryptoAESKeyHandler
Patrick Donnelly [Thu, 24 May 2018 19:11:54 +0000 (12:11 -0700)]
Merge PR #22138 into mimic
* refs/pull/22138/head:
mds: reply session reject for open request from blacklisted client
qa/tasks/cephfs: add timeout parameter to kclient umount_wait
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
auth: cache the PK11Context for CryptoAESKeyHandler
In the flame graph, 0.50% of total time is used by
CephxSessionHandler::check_message_signature(), in which 0.27% is used
by PK11_CreateContextBySymKey(). So we should cache the PK11Context.
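The caching idea can be illustrated with a generic sketch: build the expensive context once per key handler and reuse it on every call. The class and the `make_context` factory are hypothetical names for illustration only. They do not correspond to NSS bindings or to the actual CryptoAESKeyHandler code, and the commit message does not say whether the real context is created lazily or at construction time.

```python
class CachedKeyHandler:
    """Create an expensive per-key context once and reuse it.

    make_context is a hypothetical factory standing in for an expensive
    call such as PK11_CreateContextBySymKey(); it is not a real binding.
    """

    def __init__(self, key, make_context):
        self._key = key
        self._make_context = make_context
        self._ctx = None  # created lazily on first use (an assumption)

    def context(self):
        if self._ctx is None:
            self._ctx = self._make_context(self._key)
        return self._ctx
```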
Sage Weil [Mon, 21 May 2018 15:06:37 +0000 (10:06 -0500)]
os/bluestore: simplify and fix SharedBlob::put()
There is a narrow race possible:
A: lookup foo
A: put on foo
A: foo --nref == 0
B: lookup foo
B: put foo
B: foo --nref == 0
B: try_remove() succeeds, removes
A: try_remove() tries to remove foo again, probably crashes
We could fix this by flagging the object in some way to indicate it was
removed (maybe clearing parent?), but then we need to be careful about
dereferencing foo to get parent from put().
Fix this by moving to a simpler model: make lookup fail if nref == 0.
This eliminates the races around put() entirely because once nref reaches
0 it never goes up again.
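A minimal sketch of the "lookup fails if nref == 0" model follows. It is a Python illustration under stated assumptions, not BlueStore code: `Blob`, `BlobRegistry`, and the `_unref`/`_remove` split are hypothetical names, and a per-blob lock stands in for an atomic reference count. The split exists only to make the window between the decrement and the removal visible.

```python
import threading


class Blob:
    def __init__(self, key):
        self.key = key
        self.nref = 1  # the creator holds the first reference
        self.lock = threading.Lock()  # stands in for an atomic counter


class BlobRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._blobs = {}

    def add(self, key):
        blob = Blob(key)
        with self._lock:
            self._blobs[key] = blob
        return blob

    def lookup(self, key):
        with self._lock:
            blob = self._blobs.get(key)
            if blob is None:
                return None
            with blob.lock:
                if blob.nref == 0:
                    return None  # dying blob: never hand out a new reference
                blob.nref += 1
            return blob

    def _unref(self, blob):
        # Drop one reference; report whether this was the last one.
        with blob.lock:
            blob.nref -= 1
            return blob.nref == 0

    def _remove(self, blob):
        with self._lock:
            if self._blobs.get(blob.key) is blob:
                del self._blobs[blob.key]

    def put(self, blob):
        # Exactly one caller sees the count hit zero, so removal runs once.
        if self._unref(blob):
            self._remove(blob)
```

Because the count can never rise from zero, thread B in the trace above would get no object from its lookup, and only A would ever attempt the removal.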
common: OpTracker doesn't visit TrackedOp when nref == 0.
The patch fixes a race condition that happens between
`unregister_inflight_op` and `visit_ops_in_flight` of
`OpTracker`. When a callable passed to the latter
turns the plain reference it gets into a `TrackedOpRef`,
an about-to-terminate `TrackedOp` (with `nref == 0`)
can be resurrected (`nref++`). This results in an
extra call to `unregister_inflight_op` for the same op,
leading to e.g. use-after-free. For more details see:
https://tracker.ceph.com/issues/24037#note-5.
The fix deals with the problem by ensuring there will
be no call to the visitor for ops whose `nref` is zero.
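The visitor-side guard can be sketched as follows. The names `TrackedOpSketch` and `visit_live_ops` are hypothetical, and the convention that the visitor returns False to stop the walk is an assumption of this sketch rather than a statement about OpTracker's interface.

```python
class TrackedOpSketch:
    def __init__(self, name, nref):
        self.name = name
        self.nref = nref


def visit_live_ops(ops, visit):
    """Call visit(op) for ops that still hold references.

    Ops with nref == 0 are skipped so the visitor can never take a new
    reference on an op that is already being torn down.
    """
    for op in ops:
        if op.nref == 0:
            continue
        if not visit(op):  # assumed convention: False stops the walk
            break
```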
Sage Weil [Mon, 14 May 2018 17:56:59 +0000 (12:56 -0500)]
mon/MonClient: set configs via finisher
The config observers may want to take locks that are ordered relative
to monc_lock.
We could simply drop monc_lock for this call, but that would implicitly
rely on a single-threaded dispatch to avoid having two incoming MConfig
messages get reordered. Explicitly putting it on a finisher is safer.
Note that we adjust get_monmap_and_config() to start, drain, and stop
the finisher to ensure that incoming config is processed and applied
before returning.
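The ordering argument can be illustrated with a simplified finisher: a single worker thread that runs queued callbacks in FIFO order, so work handed off while a lock is held is executed later, without that lock, and in arrival order. `FinisherSketch` and its method names are hypothetical and do not mirror the interface of Ceph's Finisher class.

```python
import queue
import threading


class FinisherSketch:
    """Run queued callbacks in FIFO order on one worker thread."""

    def __init__(self):
        self._q = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def enqueue(self, fn):
        self._q.put(fn)

    def _run(self):
        while True:
            fn = self._q.get()
            try:
                if fn is None:  # sentinel queued by stop()
                    return
                fn()
            finally:
                self._q.task_done()

    def drain(self):
        self._q.join()  # blocks until every queued callback has finished

    def stop(self):
        self._q.put(None)
        self._thread.join()
```

A caller that needs the callbacks applied before it returns would call start(), enqueue its work, then drain() and stop(), which is the same start, drain, stop sequence the commit message describes.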
Tiago Melo [Fri, 18 May 2018 15:08:37 +0000 (16:08 +0100)]
mgr/dashboard: Fix RBD task metadata
The error message template for RBD copy was trying to read
a nonexistent property of the returned metadata.
The metadata for RBD edit was missing the new image name.
The new name should be displayed, instead of the old one,
when the user tries to use an existing image name.
Zhi Zhang [Wed, 16 May 2018 03:21:48 +0000 (11:21 +0800)]
mds: broadcast quota to relevant clients when quota is explicitly set
Try to broadcast the quota to relevant clients proactively if the quota
is explicitly set by someone, in case a client would otherwise not get
the quota update for a long time.
Yan, Zheng [Mon, 14 May 2018 03:34:42 +0000 (11:34 +0800)]
mds: properly setup client_need_snapflush for snap inode
MDCache::cow_inode() checks "cap->issued() & CEPH_CAP_ANY_WR" to decide
if it needs to set up client_need_snapflush for the new snap inode. If a
cap message flushes dirty caps and releases the same caps, cap->issued()
may have no WR caps when MDCache::cow_inode() gets called. The solution
is to temporarily set NEEDSNAPFLUSH on Capability::state.
Yan, Zheng [Mon, 14 May 2018 02:48:16 +0000 (10:48 +0800)]
Revert "mds: properly setup need_snapflush for snapped inode"
Commit de3f3d88b3e makes Locker::_do_cap_update() get called before
adjusting wanted caps. This is wrong because Locker::_do_cap_update()
needs up-to-date wanted caps to calculate max size.
Yan, Zheng [Fri, 11 May 2018 06:55:12 +0000 (14:55 +0800)]
mds: reply session reject for open request from blacklisted client
The kernel client and old versions of libcephfs do not check whether
they are blacklisted. They can get stuck opening a session after being
blacklisted. The session reject message avoids this.
Yan, Zheng [Fri, 18 May 2018 06:26:32 +0000 (14:26 +0800)]
client: fix issue of revoking non-auth caps
When a non-auth mds revokes caps, Fcb caps can still be issued by the
auth mds. It's wrong to flush the buffer or invalidate the cache when a
non-auth mds revokes other caps. This bug can cause the client to not
respond to the revoke.
YunfeiGuan [Tue, 8 May 2018 11:35:32 +0000 (19:35 +0800)]
client: avoid freeing inode when it contains TX buffer heads
ObjectCacher::discard_set() prematurely deletes TX buffer heads. But
the pending writebacks still pin the parent objects of these buffer
heads. The assertion "oset.objects.empty()" gets triggered if an inode
with pending writebacks gets freed.
Sage Weil [Fri, 18 May 2018 18:11:57 +0000 (13:11 -0500)]
crush: update choose_args on bucket removal
The specific bug I see is that a bucket no longer exists but its
choose_args still does. However, I'm also taking the opportunity to
verify that the choose_args agrees with the bucket sizes and position
counts everywhere else, too. Check for
- ids or weight_sets for buckets that don't exist or aren't straw2
- weight_set_positions that don't match the choose_args
  - don't fix this, just warn. i'm not sure how it would happen. :/
- weight_set sizes that don't match the bucket size
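The checks listed above can be sketched as a validation pass over a simplified data model. Everything here is an assumption made for illustration: the dictionary layout, the function name `check_choose_args`, and the reading of "weight_set_positions that don't match" as a mismatch between the number of weight_set rows and an expected position count. None of it reflects the CrushWrapper data structures.

```python
def check_choose_args(buckets, choose_args, positions):
    """Return (severity, bucket_id, message) tuples for inconsistent entries.

    Hypothetical layout chosen for this sketch:
      buckets:     {bucket_id: {"alg": str, "size": int}}
      choose_args: {bucket_id: {"ids": list or None,
                                "weight_set": list of lists or None}}
      positions:   expected number of weight_set positions
    """
    issues = []
    for bucket_id, arg in choose_args.items():
        bucket = buckets.get(bucket_id)
        if bucket is None:
            issues.append(("error", bucket_id, "bucket does not exist"))
            continue
        if bucket["alg"] != "straw2":
            issues.append(("error", bucket_id, "bucket is not straw2"))
            continue
        weight_set = arg.get("weight_set") or []
        if weight_set and len(weight_set) != positions:
            # Following the commit message: warn only, do not fix.
            issues.append(("warn", bucket_id, "weight_set position count mismatch"))
        if any(len(row) != bucket["size"] for row in weight_set):
            issues.append(("error", bucket_id, "weight_set size does not match bucket size"))
    return issues
```

In this sketch an "error" marks an entry that a cleanup step would drop or repair, while a "warn" is only reported.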