mon/ConfigKeyService: dump: print placeholder value for binary blobs
JSON cannot express arbitrary binary blobs. Instead of outputting invalid
and unparseable JSON, represent the value of blobs as something like
'<<< binary blob of length 12 >>>'.
Conflicts:
PendingReleaseNotes: 12.2.6 note dropped; it will be added in a separate PR
src/mon/ConfigKeyService.cc (ConfigKeyService::store_dump does not take a
prefix argument in luminous)
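A minimal sketch of the idea in generic C++ (the helper names are illustrative,
not the actual ConfigKeyService code; only the placeholder format comes from
the message above):

    #include <cctype>
    #include <sstream>
    #include <string>

    // Values that are not printable text cannot be emitted as-is in JSON,
    // so substitute a placeholder that records only the blob's length.
    static bool is_printable_text(const std::string& v) {
      for (unsigned char c : v) {
        if (c == '\n' || c == '\t')
          continue;
        if (!std::isprint(c))
          return false;
      }
      return true;
    }

    static std::string dump_value(const std::string& v) {
      if (is_printable_text(v))
        return v;
      std::ostringstream oss;
      oss << "<<< binary blob of length " << v.size() << " >>>";
      return oss.str();
    }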
Marcus Watts [Wed, 30 May 2018 20:37:31 +0000 (16:37 -0400)]
rgw: making implicit_tenants backwards compatible.
In jewel, "rgw keystone implicit tenants" only applied to swift. As of
luminous, this option applies to s3 as well.
Sites that used this feature with jewel now have outstanding data that
depends on the old behavior.
The fix here is to expand "rgw keystone implicit tenants" so that it
can be set to any of "none", "all", "s3" or "swift" (also 0=false=none,
1=true=all). When set to "s3" or "swift", the actual id lookup
is also partitioned.
Formerly "rgw keystone implicit tenants" was a legacy opt.
This change converts it to the new style of option,
including support for dynamically changing it.
Fixes: http://tracker.ceph.com/issues/24348
Signed-off-by: Marcus Watts <mwatts@redhat.com>
(cherry picked from commit a28a38f6e91da3abe59c34fad0e059eeaf29a65f)
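As an illustration of the expanded values described above, a ceph.conf sketch
(the section name is illustrative; the option name and accepted values come
from this note):

    [client.rgw]
    # restrict implicit tenants to S3 requests; "none", "all", "s3" and
    # "swift" are accepted, along with the legacy 0=false=none and
    # 1=true=all spellings
    rgw keystone implicit tenants = s3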
Yan, Zheng [Sat, 17 Feb 2018 01:37:48 +0000 (09:37 +0800)]
mds: fix check of underwater dentries
An underwater dentry is a dentry that is dirty in our cache from journal
replay, but had already been flushed to disk before the mds failed.
To decide whether a dentry is underwater, the original code compares the
dirty dentry's version to the on-disk dirfrag's version. This method is racy
because CDir::log_mark_dirty() can increase the dirfrag's version without
adding a log event. After mds failover, the version of a dirfrag from journal
replay can be less than the on-disk dirfrag's version, so the version of a
newly dirtied dentry can be equal to or less than the on-disk dirfrag's
version.
Jason Dillaman [Thu, 31 May 2018 13:29:00 +0000 (09:29 -0400)]
librbd: commit IO as safe when complete if writeback cache is disabled
We do not need to flush IO to ensure it is safe if the writeback cache is
disabled when performing a journal replay. Instead, immediately mark the
IO as safe and let the journal's periodic commit throttle handle updating
the position.
Fixes: http://tracker.ceph.com/issues/23516
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit a27269ab95e4cadcade2131367e40ccf1011316a)
David Zafman [Thu, 31 May 2018 00:18:03 +0000 (17:18 -0700)]
osd: Handle omap and data digests independently
Caused by: be078c8b7b131764caa28bc44452b8c5c2339623
The original attempt above to fix the omap_digest handling when
data_digest isn't present had two errors. First, it checked
is_data_digest() and is_omap_digest() instead of digest_present and
omap_digest_present, which indicate whether the source digest is available.
Second, MAYBE could only be set if both digests were available.
Fixes: http://tracker.ceph.com/issues/24366
Signed-off-by: David Zafman <dzafman@redhat.com>
(cherry picked from commit 01f9669928abd571e14421a51a749d44fa041337)
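A minimal sketch of the corrected, independent handling (the types and
function below are hypothetical stand-ins, not the actual OSD structures;
only the *_present flag names mirror the note above):

    // Hypothetical stand-in types for illustration only.
    struct SourceDigests {
      bool digest_present = false;       // data digest available on the source
      bool omap_digest_present = false;  // omap digest available on the source
      unsigned data_digest = 0;
      unsigned omap_digest = 0;
    };

    struct TargetInfo {
      bool has_data_digest = false;
      bool has_omap_digest = false;
      unsigned data_digest = 0;
      unsigned omap_digest = 0;
    };

    // Handle the two digests independently: copy each one only when the
    // source actually carries it, rather than requiring both to be present.
    void apply_digests(const SourceDigests& src, TargetInfo& dst) {
      if (src.digest_present) {
        dst.data_digest = src.data_digest;
        dst.has_data_digest = true;
      }
      if (src.omap_digest_present) {
        dst.omap_digest = src.omap_digest;
        dst.has_omap_digest = true;
      }
    }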
osd/PrimaryLogPG: do not set data/omap digest blindly
As bluestore has builtin csum, we generally no longer generate
object data digests for now. The consequence is that we should
handle the data/omap digests more carefully to make certain ops,
such as copy_from/promote, work properly, since they rely heavily
on the data digest for data transfer correctness.
Example of failure:
http://pulpito.ceph.com/xxg-2017-09-30_11:46:34-rbd-master-distro-basic-mira/1690609/
Sage Weil [Mon, 21 May 2018 15:06:37 +0000 (10:06 -0500)]
os/bluestore: simplify and fix SharedBlob::put()
There is a narrow race possible:
A: lookup foo
A: put on foo
A: foo --nref == 0
B: lookup foo
B: put foo
B: foo --nref == 0
B: try_remove() succeeds, removes
A: try_remove() tries to remove foo again, probably crashes
We could fix this by flagging the object in some way to indicate it was
removed (maybe clearing parent?), but then we need to be careful about
dereferencing foo to get parent from put().
Fix this by moving to a simpler model: make lookup fail if nref == 0.
This eliminates the races around put() entirely because once nref reaches
0 it never goes up again.
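A generic sketch of the "lookup fails once nref reaches 0" model, in plain
C++ with std::atomic (illustrative, not the actual BlueStore code):

    #include <atomic>

    struct Obj {
      std::atomic<int> nref{0};
    };

    // Take a reference only while nref is still positive. Once nref has
    // dropped to 0 the object is considered dead and lookup must fail, so
    // nref can never climb back up from 0 and the double-remove race above
    // cannot occur.
    bool try_get(Obj* o) {
      int v = o->nref.load();
      while (v > 0) {
        if (o->nref.compare_exchange_weak(v, v + 1))
          return true;
      }
      return false;
    }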
Sage Weil [Tue, 22 May 2018 21:55:03 +0000 (16:55 -0500)]
mon/MgrMonitor: change 'unresponsive' message to info level
We generate a MGR_DOWN health warning at the appropriate points; having
this at WRN level just triggers failed teuthology runs but doesn't add
much value for the user.
Clear out teuthology whitelisting for this message.
Matt Benjamin [Thu, 24 May 2018 20:09:01 +0000 (16:09 -0400)]
rgw: add configurable AWS-compat invalid range get behavior
If rgw_ignore_get_invalid_range is set, treat invalid range
restrictions as a request for the full object. By default, retain
the RGW behavior to fail with ERANGE.
Fixes: http://tracker.ceph.com/issues/24317
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
(cherry picked from commit b8a3baffddb0f0082a9b250693d26d934eaf2650)
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
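For example, the AWS-compatible behavior described above could be enabled
with a ceph.conf entry like the following (section name illustrative; the
option and its default behavior come from this note):

    [client.rgw]
    # treat an invalid range restriction as a request for the full object;
    # leaving it unset keeps the existing behavior of failing with ERANGE
    rgw ignore get invalid range = true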
Zhi Zhang [Wed, 16 May 2018 03:21:48 +0000 (11:21 +0800)]
mds: broadcast quota to relevant clients when quota is explicitly set
Try to broadcast quota to the relevant clients proactively if the quota is
explicitly set by someone, in case a client would otherwise not get a quota
update for a long time.
Yan, Zheng [Fri, 18 May 2018 06:26:32 +0000 (14:26 +0800)]
client: fix issue of revoking non-auth caps
When a non-auth mds revokes caps, Fcb caps can still be issued by the
auth mds. It's wrong to flush buffers or invalidate the cache when a
non-auth mds revokes other caps. This bug can cause the client to not
respond to the revoke.
Yan, Zheng [Fri, 11 May 2018 06:55:12 +0000 (14:55 +0800)]
mds: reply session reject for open request from blacklisted client
The kernel client and old versions of libcephfs do not check whether they
themselves are blacklisted. They can get stuck opening a session after
being blacklisted. The session reject message avoids this.
Yan, Zheng [Tue, 8 May 2018 03:32:01 +0000 (11:32 +0800)]
mds: tighten conditions of calling rejoin_gather_finish()
Handle two cases:
1. the mds receives all cache rejoin messages, then receives an mdsmap that
says the mds cluster has entered the rejoining state.
2. while opening undef inodes/dirfrags, another mds restarts.
Yan, Zheng [Tue, 8 May 2018 02:42:05 +0000 (10:42 +0800)]
mds: avoid calling rejoin_gather_finish() two times successively
If MDCache::rejoin_gather is empty and MDCache::rejoins_pending is true
when MDCache::process_imported_caps() calls maybe_send_pending_rejoins(),
both MDCache::rejoin_send_rejoins() and MDCache::process_imported_caps()
may call rejoin_gather_finish().
Yan, Zheng [Wed, 2 May 2018 02:23:33 +0000 (10:23 +0800)]
mds: properly reconnect client caps after loading inodes
Commit e43c02d6 "mds: filter out blacklisted clients when importing
caps" makes MDCache::process_imported_caps() ignore clients that are
not in MDCache::rejoin_imported_session_map. The map does not contain
clients from which the mds has received reconnect messages. This causes
some client caps (whose corresponding inodes were not in cache while the
mds was in the reconnect state) to get dropped.
mds: filter out blacklisted clients when importing caps
The very first step of importing caps is calling
Server::prepare_force_open_sessions(). This patch makes the function
ignore blacklisted clients and return a session map for clients that
are not blacklisted. This patch also modifies the code that actually
does cap imports, making it skip caps for clients that are not in the
session map.
Conflicts:
src/mds/MDCache.h: Resolved in rejoin_open_sessions_finish
src/mds/Migrator.cc: Resolved in handle_export_dir and decode_import_inode_caps
src/mds/Server.cc: Resolved in _rename_prepare_import