git.apps.os.sepia.ceph.com Git - ceph.git/log
Shyamsundar Ranganathan [Wed, 22 Jul 2020 19:21:50 +0000 (15:21 -0400)]
client: expose ceph.quota.max_bytes xattr within snapshots
For directories within snapshots, expose the ceph.quota.max_bytes
extended attribute information. This enables fetching quota
information when the snapshot was taken and is particularly useful
when cloning subvolume snapshots, to enforce the quota on the
clone subvolume as well.
Fixes: https://tracker.ceph.com/issues/46278
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
Kefu Chai [Wed, 29 Jul 2020 09:05:36 +0000 (17:05 +0800)]
Merge pull request #36323 from tchaikov/wip-crimson-msgr-v1-v2
crimson: picking peer addr of the compatible type
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Jan Fajerski [Wed, 29 Jul 2020 08:56:57 +0000 (10:56 +0200)]
Merge pull request #35728 from jan--f/c-v-add-subcommand-parse-drive-groups
ceph-volume: add drive-group subcommand
Kefu Chai [Wed, 29 Jul 2020 08:48:23 +0000 (16:48 +0800)]
Merge pull request #36342 from tchaikov/wip-crimson-heartbeat-erase
crimson/osd: erase an element by iterator instead
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Sebastian Wagner [Wed, 29 Jul 2020 08:19:42 +0000 (10:19 +0200)]
Merge pull request #36217 from sebastian-philipp/cephadm-common-mypy-ini
cephadm: use src/mypy.ini instead
Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Michael Fritch <mfritch@suse.com>
Sebastian Wagner [Wed, 29 Jul 2020 08:15:07 +0000 (10:15 +0200)]
Merge pull request #36235 from matthewoliver/cephadm_iscsi_tcmu_runner
cephadm: Add tcmu-runner container when deploying ceph-iscsi
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Kefu Chai [Wed, 29 Jul 2020 07:33:59 +0000 (15:33 +0800)]
crimson/osd: implement cls_get_pool_stripe_width
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 29 Jul 2020 06:37:57 +0000 (14:37 +0800)]
crimson/osd: erase an element by iterator instead
we should not remove an element while iterating it in a map, as erasing
the element invalidates the iterator, which causes a segfault when we are
advancing it after erasing the dereferenced element.
in this change, an iterator is used for walking through the map. in
comparison with creating a to-be-removed list, this one is more
efficient and more idiomatic.
Signed-off-by: Kefu Chai <kchai@redhat.com>
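The erase-by-iterator pattern described in this commit message can be sketched as follows. This is a minimal illustration, not the crimson heartbeat code: `PeerMap` and `drop_empty_peers()` are hypothetical names, and the "empty value" condition is only a placeholder for whatever predicate selects the entries to remove.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a map of peers keyed by id; not the actual
// crimson data structure.
using PeerMap = std::map<int, std::string>;

// Remove every entry whose value is empty, erasing by iterator.
// std::map::erase(iterator) returns the iterator following the erased
// element, so the loop never advances an invalidated iterator.
inline std::size_t drop_empty_peers(PeerMap& peers) {
  std::size_t removed = 0;
  for (auto it = peers.begin(); it != peers.end();) {
    if (it->second.empty()) {
      it = peers.erase(it);  // continue from the returned iterator
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}
```

Compared with collecting keys into a to-be-removed list first, this needs no extra container and no second lookup per removed key.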
Kefu Chai [Wed, 29 Jul 2020 06:18:56 +0000 (14:18 +0800)]
Merge pull request #36341 from tchaikov/wip-crimson-cls
crimson/osd: correct the function name of cls_cxx_map_get_vals_by_keys()
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Wed, 29 Jul 2020 04:32:26 +0000 (12:32 +0800)]
crimson/osd: correct the function name of cls_cxx_map_get_vals_by_keys()
it was an oversight in
7a4c6359e483f8c71276ece5cde16eb0771ac5d2
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Wed, 29 Jul 2020 01:44:46 +0000 (09:44 +0800)]
Merge pull request #36079 from winndows/superfluous_break6
msg: Remove superfluous breaks
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Neha Ojha [Wed, 29 Jul 2020 01:11:12 +0000 (18:11 -0700)]
Merge pull request #36297 from dvanders/dvanders_46443
osd: fix crash in _committed_osd_maps if incremental osdmap crc fails
Reviewed-by: Xiaoxi Chen <xiaoxchen@ebay.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Neha Ojha [Tue, 28 Jul 2020 17:36:09 +0000 (10:36 -0700)]
qa/suites/rados/thrash/crc-failures: randomly inject bad incremental osdmap crc
Signed-off-by: Neha Ojha <nojha@redhat.com>
Dan van der Ster [Mon, 27 Jul 2020 15:40:27 +0000 (17:40 +0200)]
osd: don't write transaction when inc crc failed
80da5f9a987c6a48b93f25228fdac85890013520 exposed a flaw in how
handle_osd_map falls back to a full osdmap if the crc of an incremental
failed.
If the first message in a map message had a crc error, then the
loop would exit with last < start, which would then cause a null
dereference in _committed_osd_maps.
Fixes: https://tracker.ceph.com/issues/46443
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
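The failure mode described above (the loop exiting with last < start) can be modeled with a small sketch. It is an assumption-laden illustration only: `EpochRange` and `applied_epochs()` are hypothetical names, and the real handle_osd_map logic is considerably more involved.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified model of walking the incrementals carried by one
// map message; none of these names come from the OSD code.
struct EpochRange {
  unsigned first;
  unsigned last;
};

// crc_ok[i] says whether the incremental for epoch (start + i) verified.
// Returns the range of epochs that were applied, or std::nullopt when the
// very first incremental failed, so that a caller can skip the commit step
// instead of acting on an empty range (last < start).
inline std::optional<EpochRange> applied_epochs(unsigned start,
                                                const std::vector<bool>& crc_ok) {
  std::size_t applied = 0;
  for (bool ok : crc_ok) {
    if (!ok) {
      break;  // stop here; a full map would be requested instead
    }
    ++applied;
  }
  if (applied == 0) {
    return std::nullopt;  // nothing to write, nothing to commit
  }
  return EpochRange{start, start + static_cast<unsigned>(applied) - 1};
}
```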
Dan van der Ster [Mon, 27 Jul 2020 12:23:54 +0000 (14:23 +0200)]
qa/standalone/osd: add bad-inc-map.sh
Test that the osd doesn't crash when it gets a bad incremental osdmap.
Related-to: https://tracker.ceph.com/issues/46443
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
Mykola Golub [Tue, 28 Jul 2020 18:30:07 +0000 (21:30 +0300)]
Merge pull request #36287 from dillaman/wip-librbd-close
librbd: use task finisher thread for image open/close callbacks
Reviewed-by: Mykola Golub <mgolub@suse.com>
Jason Dillaman [Tue, 28 Jul 2020 17:33:17 +0000 (13:33 -0400)]
Merge pull request #36253 from changchengx/exclusive
doc: specify RBD_LOCK_MODE_EXCLUSIVE for exclusive-lock
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Kefu Chai [Tue, 28 Jul 2020 15:22:14 +0000 (23:22 +0800)]
Merge pull request #36328 from tchaikov/wip-crimson-cls_cxx_map_get_vals
crimson/osd: implement cls_cxx_map_get_vals()
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Tue, 28 Jul 2020 14:39:21 +0000 (22:39 +0800)]
crimson/osd: implement cls_cxx_map_get_vals()
Signed-off-by: Kefu Chai <kchai@redhat.com>
Changcheng Liu [Thu, 23 Jul 2020 03:09:46 +0000 (11:09 +0800)]
doc: specify RBD_LOCK_MODE_EXCLUSIVE for exclusive-lock
The exclusive-lock can be transitioned transparently between clients
after a write operation finishes. To disable this "transparent" transition,
the lock must be acquired with RBD_LOCK_MODE_EXCLUSIVE.
Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
Kefu Chai [Tue, 28 Jul 2020 12:49:58 +0000 (20:49 +0800)]
Merge pull request #34928 from p-se/wip-pse-revise-monitoring-doc
mgr/dashboard: revise monitoring documentation
Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
Reviewed-by: Stephan Müller <smueller@suse.com>
Kefu Chai [Tue, 28 Jul 2020 12:32:59 +0000 (20:32 +0800)]
crimson: use pick_addr() for picking peer addr
in teuthology tests, there is good chance that we have ceph.conf
containing:
mon host = 172.21.15.122
which is translated to two monitors
- a: 172.21.15.122:3300
- a-legacy: 172.21.15.122:6789
both have protocol type of "any". so, to enable crimson to use settings
like this, we should let crimson accept them, and drop the connection
if the peer claims to be using an incompatible protocol when they are
exchanging banners.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 28 Jul 2020 12:32:00 +0000 (20:32 +0800)]
crimson/mon: use mon with only v2 address
crimson msgr supports v2 protocol now, so we can connect to monitor
which only provides v2 addresses.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 28 Jul 2020 12:29:29 +0000 (20:29 +0800)]
msg/msg_types.h: add pick_addr()
for picking an addr from an entity_addrvec_t by given protocol type.
so:
- v2 => v2, any
- v1 => v1, any
- any => any, v1, v2
and add a helper of `addr_of_type()` to avoid repetition.
Signed-off-by: Kefu Chai <kchai@redhat.com>
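The preference order listed in this commit message (v2 => v2, any; v1 => v1, any; any => any, v1, v2) can be sketched as below. The types are simplified, hypothetical stand-ins for entity_addr_t and entity_addrvec_t, and the signatures of `pick_addr()` and `addr_of_type()` here are assumptions rather than the ones in msg/msg_types.h. The sketch assumes C++17.

```cpp
#include <optional>
#include <vector>

// Simplified, hypothetical stand-ins for the real address types.
enum class AddrType { any, v1, v2 };

struct Addr {
  AddrType type;
  int port;
};

// Return the first address of exactly the given type, if there is one.
inline std::optional<Addr> addr_of_type(const std::vector<Addr>& addrs,
                                        AddrType type) {
  for (const auto& a : addrs) {
    if (a.type == type) {
      return a;
    }
  }
  return std::nullopt;
}

// Try each acceptable type in preference order for the wanted protocol.
inline std::optional<Addr> pick_addr(const std::vector<Addr>& addrs,
                                     AddrType wanted) {
  std::vector<AddrType> order;
  switch (wanted) {
    case AddrType::v2:  order = {AddrType::v2, AddrType::any}; break;
    case AddrType::v1:  order = {AddrType::v1, AddrType::any}; break;
    case AddrType::any: order = {AddrType::any, AddrType::v1, AddrType::v2}; break;
  }
  for (auto t : order) {
    if (auto a = addr_of_type(addrs, t)) {
      return a;
    }
  }
  return std::nullopt;
}
```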
Sebastian Wagner [Tue, 28 Jul 2020 12:46:20 +0000 (14:46 +0200)]
Merge pull request #36285 from sebastian-philipp/orch-completion-generic
mgr/orch: Add some more type annotations
Reviewed-by: Michael Fritch <mfritch@suse.com>
Volker Theile [Tue, 28 Jul 2020 12:00:25 +0000 (14:00 +0200)]
Merge pull request #36258 from rhcs-dashboard/fix-cpu-stats
mgr/dashboard: cpu stats incorrectly displayed
Reviewed-by: Patrick Seidensal <pseidensal@suse.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Volker Theile <vtheile@suse.com>
Sebastian Wagner [Fri, 24 Jul 2020 15:29:28 +0000 (17:29 +0200)]
mgr/orch: Add some more type annotations
Made `orch.Completion` a generic type
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Sebastian Wagner [Tue, 28 Jul 2020 09:54:12 +0000 (11:54 +0200)]
Merge pull request #36012 from adk3798/cephadm_44886
mgr/cephadm: allow use of authenticated registry
Sebastian Wagner [Tue, 28 Jul 2020 09:52:53 +0000 (11:52 +0200)]
Merge pull request #36262 from sebastian-philipp/orch-readd-apply_dg
mgr/cephadm: re-add `apply_drivegroups()`
Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Nathan Cutler [Tue, 28 Jul 2020 09:33:30 +0000 (11:33 +0200)]
Merge pull request #36306 from smithfarm/wip-add-octopus-to-release-table
doc/releases: add "octopus" column to Release Timeline
Reviewed-by: Neha Ojha <nojha@redhat.com>
Sebastian Wagner [Tue, 28 Jul 2020 09:29:01 +0000 (11:29 +0200)]
Merge pull request #36301 from sebastian-philipp/doc-cephadm-status-no-progress
doc/cephadm: `status` doesn't show a progress
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Zac Dover <zac.dover@gmail.com>
Matthew Oliver [Wed, 22 Jul 2020 07:09:12 +0000 (17:09 +1000)]
cephadm: Add tcmu-runner container when deploying ceph-iscsi
Currently when we deploy ceph-iscsi via cephadm it doesn't include a
running tcmu-runner, which means initiators will be able to log in but
you won't see the LUNs on the initiator.
This patch deploys an additional tcmu-runner container alongside the
ceph-iscsi container that just runs the tcmu-runner service.
Fixes: https://tracker.ceph.com/issues/46540
Signed-off-by: Matthew Oliver <moliver@suse.com>
Kefu Chai [Tue, 28 Jul 2020 02:21:24 +0000 (10:21 +0800)]
Merge pull request #36090 from inspur-wyq/wip-37532
mon: fix the 'Error ERANGE' message when conf "osd_objectstore" is filestore
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Tue, 28 Jul 2020 02:20:00 +0000 (10:20 +0800)]
Merge pull request #36283 from rzarzynski/wip-bl-raw-privatization
common/bl: don't access raw::data nor raw::len directly. Use getters instead.
Reviewed-by: Neha Ojha <nojha@redhat.com>
Nathan Cutler [Mon, 27 Jul 2020 15:40:58 +0000 (17:40 +0200)]
doc/releases: add "octopus" column to Release Timeline
Octopus has been out for a while. I suppose this should have been done
earlier, but "better late than never".
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Nathan Cutler [Mon, 27 Jul 2020 15:39:22 +0000 (17:39 +0200)]
Merge pull request #36245 from smithfarm/wip-mimic-is-eol
doc/releases: Mimic is EOL
Reviewed-by: Abhishek Lekshmanan <abhishek@suse.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Mon, 27 Jul 2020 14:53:07 +0000 (22:53 +0800)]
Merge pull request #36279 from tchaikov/wip-crimson-msgr-v2.1
crimson/net: enable on-wire-encrypt and v2.1 support
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Sebastian Wagner [Mon, 27 Jul 2020 14:50:01 +0000 (16:50 +0200)]
doc/cephadm: `status` doesn't show a progress
Fixes: https://tracker.ceph.com/issues/45858
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Nathan Cutler [Mon, 27 Jul 2020 14:25:30 +0000 (16:25 +0200)]
Merge pull request #35852 from smithfarm/wip-opensuse-os-recommendations
doc/start/os-recommendations: current state of openSUSE
Reviewed-by: Tim Serong <tserong@suse.com>
Casey Bodley [Mon, 27 Jul 2020 13:54:46 +0000 (09:54 -0400)]
Merge pull request #36269 from dang/wip-dang-46692
RGW - fix bulkupload, broken by zipper
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Sebastian Wagner [Mon, 20 Jul 2020 11:55:09 +0000 (13:55 +0200)]
cephadm: use src/mypy.ini instead
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
Jason Dillaman [Fri, 24 Jul 2020 16:13:10 +0000 (12:13 -0400)]
librbd: use task finisher thread for image open/close callbacks
There was a potential race condition with utilizing the AsioEngine
to deliver asynchronous image open and close callbacks. This left
the potential for the io_context thread to attempt to destroy itself.
This commit changes the behavior of the image open and close callbacks
to always delete the ImageCtx (now matches the synchronous API behavior)
and it always invokes the callback in Finisher thread whose lifetime is
tied to the CephContext.
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Jan Fajerski [Mon, 27 Jul 2020 08:29:21 +0000 (10:29 +0200)]
Merge pull request #36219 from guits/guits-fix_zap_osdid_osdfsid
ceph-volume: filter by osd-id or osd-fsid when zapping
Guillaume Abrioux [Mon, 20 Jul 2020 13:43:38 +0000 (15:43 +0200)]
ceph-volume: filter by osd-id or osd-fsid when zapping
2f5c10c12c37e6865ce54bb4940d3779353cba4f introduced a bug:
`ceph-volume lvm zap` command fails under certain conditions.
When `--osd-id` or `--osd-fsid` is passed to the `ceph-volume lvm zap` command,
it tries to zap additional devices that have nothing to do with the osd
being zapped.
When calling `api.get_lvs()` in `ensure_associated_lvs()` we have to
pass the osd-id/osd-fsid information so only related devices are
returned by the `get_lvs()` method.
Closes: https://tracker.ceph.com/issues/46627
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 09:22:14 +0000 (17:22 +0800)]
messages/MOSDBoot: pass OSDSuperblock by const ref
MOSDBoot's ctor does not change the parameter, so let's pass by const
reference.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 09:13:41 +0000 (17:13 +0800)]
crimson/os/alienstore: always use fsid in bluestore
alienstore should not be stateful in this respect; it should proxy
all access to fsid to bluestore.
there are a couple of issues in the existing implementation:
* when mkfs, bluestore tries to generate a new osd_fsid if the specified
one is empty. but we explicitly pass the given uuid down to
AlienStore::mkfs() so the bluestore can use it. so we should pass it
down instead of storing it locally.
* when persisting superblock in OSD::mkfs(), superblock.osd_fsid() is
read from store->get_fsid(). if user specifies an empty uuid, we
should persist the generated uuid in the superblock.
in this change, all access to fsid is proxied to the underlying
bluestore.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 07:45:26 +0000 (15:45 +0800)]
stop.sh: stop osd before mon
osd sends a MOSDMarkMeDown message to monitor and waits for its ack
before timeout, so if we can stop osd before stopping mon, stop.sh can
return sooner without waiting until the timeout.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 05:04:58 +0000 (13:04 +0800)]
crimson/mgr: only pick the addr of the same type
to avoid attempts to connect an OSD which is bound to a v2
address to a v1 address of a mgr.
in general, osd is bound to both v1 and v2 addresses, but crimson
msgr does not support multiple bound addresses at the time of writing, so
to avoid the failures when trying to connect to incompatible addresses,
let's filter them out when connecting to monitor. this change
silences warnings like:
peer_addr_for_me v1:172.21.15.106:60008/0 type doesn't match myaddr
v2:0.0.0.0:6802/26710
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 05:02:49 +0000 (13:02 +0800)]
mon/MgrMap: let get_active_addrs() return a const ref
no need to create a temporary instance for referencing those addresses.
Signed-off-by: Kefu Chai <kchai@redhat.com>
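The idea of returning a const reference from a getter, so that callers do not force a temporary copy, can be sketched as follows. `MiniMgrMap` is a hypothetical, trimmed-down class and uses strings where the real MgrMap holds address vectors.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified map type; the real MgrMap differs.
class MiniMgrMap {
public:
  explicit MiniMgrMap(std::vector<std::string> addrs)
    : active_addrs(std::move(addrs)) {}

  // Returning a const reference lets callers read the addresses without
  // copying the vector; the reference is valid as long as the map lives.
  const std::vector<std::string>& get_active_addrs() const {
    return active_addrs;
  }

private:
  std::vector<std::string> active_addrs;
};
```

The trade-off is lifetime: a caller that needs the addresses after the map is gone has to make its own copy explicitly.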
Kefu Chai [Sat, 25 Jul 2020 04:38:04 +0000 (12:38 +0800)]
crimson/mon: only pick the addr of the same type
to avoid attempts to connect an OSD which is bound to a v2 address
to a v1 address of a monitor.
in general, osd is bound to both v1 and v2 addresses, but crimson msgr
does not support multiple bound addresses at the time of writing, so to
avoid the failures when trying to connect to incompatible addresses,
let's filter them out when connecting to monitor. this change silences
warnings like:
peer_addr_for_me v1:172.21.15.106:60008/0 type doesn't match myaddr
v2:0.0.0.0:6802/26710
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 15:13:37 +0000 (23:13 +0800)]
crimson/osd: print out client caps
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 15:10:51 +0000 (23:10 +0800)]
auth/cephx: implement random()->get_bytes() for crimson
instead of using CryptoRandom, use the C++ standard library for
generating the secret.
Signed-off-by: Kefu Chai <kchai@redhat.com>
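Generating random bytes with only the C++ standard library might look like the sketch below. The function name `get_random_bytes()` is hypothetical and the commit message does not say which standard facilities were used; `std::random_device` is an assumption here, and whether it is suitable for secrets depends on the platform's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Fill a buffer of the requested length with bytes drawn from
// std::random_device (assumed, not confirmed, to match the commit).
inline std::vector<std::uint8_t> get_random_bytes(std::size_t len) {
  std::random_device rd;
  // uniform_int_distribution is not specified for char-sized types, so draw
  // unsigned ints in [0, 255] and narrow them.
  std::uniform_int_distribution<unsigned int> dist(0, 255);
  std::vector<std::uint8_t> out(len);
  for (auto& b : out) {
    b = static_cast<std::uint8_t>(dist(rd));
  }
  return out;
}
```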
Kefu Chai [Fri, 24 Jul 2020 13:11:08 +0000 (21:11 +0800)]
crimson/admin: catch thrown exception
if the socket file exists, a std::system_error is thrown. and we should
catch it.
Signed-off-by: Kefu Chai <kchai@redhat.com>
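Catching the std::system_error mentioned above can be sketched like this. `fake_listen()` and `try_start_admin_socket()` are hypothetical helpers that only simulate a bind failure on an existing socket path; they are not the crimson admin socket API.

```cpp
#include <set>
#include <string>
#include <system_error>

// Hypothetical stand-in for binding a unix-domain socket: throws
// std::system_error when the path is already present.
inline void fake_listen(const std::set<std::string>& existing,
                        const std::string& path) {
  if (existing.count(path) != 0) {
    throw std::system_error(
      std::make_error_code(std::errc::address_in_use), "bind " + path);
  }
}

// Start the (simulated) admin socket, reporting failure instead of letting
// the exception escape. On failure, *err receives the error text if err is
// not null.
inline bool try_start_admin_socket(const std::set<std::string>& existing,
                                   const std::string& path,
                                   std::string* err) {
  try {
    fake_listen(existing, path);
    return true;
  } catch (const std::system_error& e) {
    if (err != nullptr) {
      *err = e.what();
    }
    return false;
  }
}
```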
Kefu Chai [Fri, 24 Jul 2020 09:00:06 +0000 (17:00 +0800)]
crimson/msgr: Revert "don't advertise the on-wire format v2.1."
This reverts commit
a74948bc5095b32212189352c163030bfe10db71 .
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 10:03:16 +0000 (18:03 +0800)]
crimson/net: enable msgr v2.1 support
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 10:01:12 +0000 (18:01 +0800)]
crimson/net: enable on_wire encryption support
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 08:59:58 +0000 (16:59 +0800)]
crimson/net: set is_rev1 for messenger v2.1 support
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 08:55:36 +0000 (16:55 +0800)]
crimson/net: keep rx_preamble for msgr v2.1 support
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 08:45:46 +0000 (16:45 +0800)]
crimson/net: drop stale TODO
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 08:33:22 +0000 (16:33 +0800)]
crimson/net: use rx_frame_asm for handling data read from wire
by leveraging FrameAssembler, it's much simpler. and it also paves the
way to better messenger v2.0 and v2.1 protocol support.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 24 Jul 2020 08:30:44 +0000 (16:30 +0800)]
crimson/net: mark abort_ functions [[noreturn]]
otherwise the compiler complains that control reaches end of non-void
function.
Signed-off-by: Kefu Chai <kchai@redhat.com>
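The effect of `[[noreturn]]` on such warnings can be shown with a small sketch. `abort_with()` and `parse_digit()` are hypothetical names, and throwing an exception stands in for whatever the real abort_ functions do.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper that never returns normally. The attribute tells the
// compiler so, which removes "control reaches end of non-void function"
// warnings in callers.
[[noreturn]] inline void abort_with(const std::string& why) {
  throw std::runtime_error(why);
}

// A non-void function whose last statement is a call to a [[noreturn]]
// function needs no trailing return.
inline int parse_digit(char c) {
  if (c >= '0' && c <= '9') {
    return c - '0';
  }
  abort_with("not a digit");
}
```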
Kefu Chai [Fri, 24 Jul 2020 03:51:57 +0000 (11:51 +0800)]
msg/async/crypto_onwire: drop unused member variable
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 18:03:02 +0000 (02:03 +0800)]
Merge pull request #36259 from majianpeng/bluefs-reduce-ceph_clock_now
os/bluestore/BlueFS: reduce unnecessary ceph_clock_now().
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Kefu Chai [Sat, 25 Jul 2020 18:00:15 +0000 (02:00 +0800)]
Merge pull request #33899 from rs-fabrica/rados_generic_options_usage_message
rados: include generic options in usage message
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 17:57:49 +0000 (01:57 +0800)]
Merge pull request #36115 from BenoitKnecht/diskprediction-local-array-shape
mgr/diskprediction_local: Fix array size error
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 17:57:11 +0000 (01:57 +0800)]
Merge pull request #35306 from changchengx/blk
blk: add option to set device type to select blk driver
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 17:54:29 +0000 (01:54 +0800)]
Merge pull request #36274 from xiexingguo/wip-peer-num-objects
osd/PeeringState: prevent peer's num_objects going negative
Reviewed-by: Neha Ojha <nojha@redhat.com>
Reviewed-by: yanjun <yan.jun8@zte.com.cn>
Kefu Chai [Sat, 25 Jul 2020 10:22:08 +0000 (18:22 +0800)]
Merge pull request #36236 from tchaikov/wip-std-bind
test/librados_test_stub: use std::bind
Reviewed-by: Adam C. Emerson <aemerson@redhat.com>
Kefu Chai [Sat, 25 Jul 2020 06:41:33 +0000 (14:41 +0800)]
Merge pull request #36071 from rzarzynski/wip-crimson-errorator-assert-cleanup
crimson: improve assertions in errorator
Reviewed-by: Samuel Just <sjust@redhat.com>
xie xingguo [Fri, 24 Jul 2020 01:57:40 +0000 (09:57 +0800)]
osd/PeeringState: prevent peer's num_objects going negative
Saw it in a teuthology run:
-5645> 2020-07-20 04:34:32.067
7f351e329700 5 osd.5 pg_epoch: 667 ... exit Started/Primary/Active/Backfilling
-5642> 2020-07-20 04:34:32.067
7f351e329700 5 osd.5 pg_epoch: 667 ... enter Started/Primary/Active/Recovered
-5633> 2020-07-20 04:34:32.067
7f351e329700 20 osd.5 pg_epoch: 667 ... _update_calc_stats shard 5 primary objects 0 missing 0
-5632> 2020-07-20 04:34:32.067
7f351e329700 20 osd.5 pg_epoch: 667 ... _update_calc_stats shard 3 objects -1 missing 1
-5631> 2020-07-20 04:34:32.067
7f351e329700 20 osd.5 pg_epoch: 667 ... _update_calc_stats shard 6 objects 0 missing 0
This will crash the choose_acting() procedure as it will mistakenly
think that peer 3 should continue to perform asynchronous recovery
(e.g., due to num_objects_missing = 1) in contrast to fully
backfill-recovered.
While I did not dig into the real cause, there are a couple of
possible explanations of how num_objects can be off. I think that
if a roll forward or log replay could delete something twice, maybe
there would be an undercount. Or maybe something as simple as a
corruption.
Since _update_calc_stats() is going to fix num_objects_missing
for that peer anyway, let's make sure it always starts with a
clean state.
Fixes: https://tracker.ceph.com/issues/46705
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
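As a rough illustration of keeping a peer's object count from going negative, a clamp could look like the sketch below. This is not the actual patch: `PeerStats` and `clamp_num_objects()` are hypothetical, and the real fix may reset state differently.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical, trimmed-down per-peer statistics.
struct PeerStats {
  std::int64_t num_objects = 0;
  std::int64_t num_objects_missing = 0;
};

// Clamp a (possibly undercounted) object count at zero so later arithmetic
// cannot produce a spurious "missing" object for that peer.
inline void clamp_num_objects(PeerStats& s) {
  s.num_objects = std::max<std::int64_t>(s.num_objects, 0);
}
```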
Neha Ojha [Fri, 24 Jul 2020 23:13:55 +0000 (16:13 -0700)]
Merge pull request #36121 from aclamk/wip-bluefs-log-replay-rescue
Rescue procedure for extremely large bluefs log
Reviewed-by: Neha Ojha <nojha@redhat.com>
Neha Ojha [Fri, 24 Jul 2020 21:47:44 +0000 (14:47 -0700)]
Merge pull request #35909 from dzafman/wip-46275
osd: Cancel in-progress scrubs (not user requested)
Reviewed-by: Neha Ojha <nojha@redhat.com>
Daniel Gryniewicz [Thu, 23 Jul 2020 16:38:11 +0000 (12:38 -0400)]
RGW - fix bulkupload, broken by zipper
Bulkupload depended on the existence of empty bucketinfo. Fix that, to
avoid a crash. In addition, the error handler for swift used buckets.
Fixes 46692
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
David Zafman [Tue, 7 Jul 2020 01:02:08 +0000 (18:02 -0700)]
test: Check for interruption of scrubs with noscrub/nodeep_scrub
Signed-off-by: David Zafman <dzafman@redhat.com>
David Zafman [Thu, 2 Jul 2020 17:05:57 +0000 (10:05 -0700)]
osd: Cancel in-progress scrubs (not user requested)
This change adds new scrubber.req_scrub to track user
requested scrubs, deep_scrub or repair.
Fixes: https://tracker.ceph.com/issues/46275
Signed-off-by: David Zafman <dzafman@redhat.com>
David Zafman [Thu, 23 Jul 2020 16:40:54 +0000 (09:40 -0700)]
osd: Arrange code so that it is clearer; should not cause any change
Signed-off-by: David Zafman <dzafman@redhat.com>
David Zafman [Tue, 21 Jul 2020 20:58:42 +0000 (13:58 -0700)]
test: mon-last-epoch-clean.sh fixed to avoid shell globbing
Signed-off-by: David Zafman <dzafman@redhat.com>
David Zafman [Thu, 9 Jul 2020 18:25:46 +0000 (11:25 -0700)]
osd: Fix dump_scrub_reservation usage
Signed-off-by: David Zafman <dzafman@redhat.com>
Laura Paduano [Fri, 24 Jul 2020 13:26:56 +0000 (15:26 +0200)]
Merge pull request #35376 from Devp00l/update-backport-doc
doc: Resolving conflicts with ceph-backport.sh
Reviewed-by: Lenz Grimmer <lgrimmer@suse.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Radoslaw Zarzynski [Fri, 24 Jul 2020 10:14:25 +0000 (10:14 +0000)]
common/bl: don't access raw::len directly. Use the getter instead.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski [Wed, 22 Jul 2020 18:57:51 +0000 (18:57 +0000)]
common/bl: don't access raw::data directly. Use the getter instead.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Yuval Lifshitz [Fri, 24 Jul 2020 11:46:55 +0000 (14:46 +0300)]
Merge pull request #36238 from yuvalif/fix_zippet_notif_merge
rgw/notification: fix merge issues from zipper6
Kefu Chai [Fri, 24 Jul 2020 10:17:14 +0000 (18:17 +0800)]
Merge pull request #36268 from tchaikov/wip-crimson-msgr
crimson/net: do not reset need_addr before learning it
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Kefu Chai [Thu, 23 Jul 2020 16:09:08 +0000 (00:09 +0800)]
crimson/net: do not reset need_addr before learning it
because we don't bind both v1 and v2 addresses, when monitor returns a
v1 peer address, as the client side, crimson-osd just drops the
connection. but this failed attempt to learn the myaddr resets
`need_addr`. and this prevents crimson-osd from learning the v2 address
returned by monitor.
in this change, we reset need_addr only after it is learned from the
peer.
Signed-off-by: Kefu Chai <kchai@redhat.com>
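The ordering this commit describes, clearing need_addr only once an address has actually been learned, can be sketched as a tiny state holder. `AddrLearner` and its string-prefix check are hypothetical simplifications and do not reflect the crimson messenger types.

```cpp
#include <optional>
#include <string>

// Hypothetical model of a client that still needs to learn its own address.
struct AddrLearner {
  bool need_addr = true;
  std::optional<std::string> my_addr;

  // Handle the address the peer reports for us. Returns false when the
  // address type is incompatible and the connection would be dropped.
  bool handle_peer_addr_for_me(const std::string& addr) {
    if (!need_addr) {
      return true;  // already learned earlier
    }
    if (addr.rfind("v2:", 0) != 0) {
      // Incompatible (e.g. v1) address: leave need_addr set so that a later
      // v2 reply can still be learned.
      return false;
    }
    my_addr = addr;
    need_addr = false;  // reset only after the address is learned
    return true;
  }
};
```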
Kefu Chai [Fri, 24 Jul 2020 02:11:02 +0000 (10:11 +0800)]
Merge pull request #36256 from tchaikov/wip-ceph-debug-docker-crimson
ceph-debug-docker: add --flavor option
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Kefu Chai [Thu, 23 Jul 2020 05:58:25 +0000 (13:58 +0800)]
ceph-debug-docker: add --flavor option
* add --flavor option, which is "default" by default, so one can, for
example, pass "--flavor crimson" to ceph-debug-docker
* extract $repo_url to avoid repeating the shared bits between centos
and debian derivatives envs.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Jianpeng Ma [Fri, 24 Jul 2020 00:52:14 +0000 (08:52 +0800)]
os/bluestore/BlueFS: avoid useless ceph_clock_now() call.
The overhead of utime_t constructor utime_t() is less than ceph_clock_now().
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Patrick Donnelly [Fri, 24 Jul 2020 00:12:43 +0000 (17:12 -0700)]
Merge PR #36136 into master
* refs/pull/36136/head:
qa/tasks/nfs:Add test for relative and just '/' pseudo path
mgr/nfs: Check if pseudo path is absolute path or just '/'
Reviewed-by: Michael Fritch <mfritch@suse.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
Adam King [Fri, 10 Jul 2020 12:09:39 +0000 (08:09 -0400)]
mgr/cephadm: allow use of authenticated registry
Add option to use custom authenticated registry during
bootstrap as well as a registry-login command in order
to let user change authenticated registry login info
Fixes: https://tracker.ceph.com/issues/44886
Signed-off-by: Adam King <adking@redhat.com>
Sebastian Wagner [Thu, 23 Jul 2020 15:53:31 +0000 (17:53 +0200)]
Merge pull request #36239 from sebastian-philipp/cephadm-parallel-hosts
mgr/cephadm: create OSDs in parallel
Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Kiefer Chang <kiefer.chang@suse.com>
Sebastian Wagner [Thu, 23 Jul 2020 15:52:49 +0000 (17:52 +0200)]
Merge pull request #35839 from mgfritch/cephadm-ignore-mon-mgr-svc-id
python-common: clean-up ServiceSpec.service_id handling
Reviewed-by: Joshua Schmid <jschmid@suse.de>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>
Neha Ojha [Thu, 23 Jul 2020 14:36:47 +0000 (07:36 -0700)]
Merge pull request #36251 from neha-ojha/wip-fix-43888
osd/OSD.cc: remove osd_lock for bench
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Adam Kupczyk <akucpzyk@redhat.com>
Kefu Chai [Thu, 23 Jul 2020 14:26:11 +0000 (22:26 +0800)]
Merge pull request #36261 from tchaikov/wip-unforty-seastar
ceph.spec.in: cull _FORTIFY_SOURCE macro from CXXFLAGS for seastar
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Thu, 23 Jul 2020 14:19:56 +0000 (22:19 +0800)]
Merge pull request #36252 from tchaikov/wip-doc-crimson
doc/dev/crimson: remove redundant options
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Thu, 23 Jul 2020 13:13:44 +0000 (21:13 +0800)]
Merge pull request #36263 from tchaikov/wip-more-log-for-dashboard-test
mgr/dashboard: print more osd log when backend-api-tests fails
Reviewed-by: Laura Paduano <lpaduano@suse.com>
Jan Fajerski [Tue, 21 Apr 2020 13:47:42 +0000 (15:47 +0200)]
ceph-volume: add drive-group subcommand
This new subcommand takes a drive group specification as json and deploys
the OSDs accordingly.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Fixes: https://tracker.ceph.com/issues/46689
Kefu Chai [Thu, 23 Jul 2020 09:47:14 +0000 (17:47 +0800)]
mgr/dashboard: print more log when backend-api-tests fails
Signed-off-by: Kefu Chai <kchai@redhat.com>
Mykola Golub [Thu, 23 Jul 2020 09:20:43 +0000 (12:20 +0300)]
Merge pull request #36242 from dillaman/wip-46668
librbd: flush all queued object IO from simple scheduler
Reviewed-by: Mykola Golub <mgolub@suse.com>
Kefu Chai [Thu, 23 Jul 2020 08:53:41 +0000 (16:53 +0800)]
doc/dev/crimson: add more examples for seastar-addr2line
Signed-off-by: Kefu Chai <kchai@redhat.com>
Sebastian Wagner [Thu, 23 Jul 2020 08:43:17 +0000 (10:43 +0200)]
mgr/cephadm: re-add `apply_drivegroups()`
Fixes: d348d7bf8d3663140c089937b62a0b316b69176b
Fixes: https://tracker.ceph.com/issues/46681
Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>