git.apps.os.sepia.ceph.com Git - ceph.git/log
Zac Dover [Tue, 29 Jun 2021 01:29:33 +0000 (11:29 +1000)]
doc/cephadm: improve "Upgrading Ceph" (main)
This PR makes a couple of minor improvements to the text under the
top-level section "Upgrading Ceph" in the "Upgrading Ceph" chapter of
the cephadm documentation.
This one, mercifully, contains only a couple of changes.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Patrick Donnelly [Mon, 28 Jun 2021 18:57:22 +0000 (11:57 -0700)]
Merge PR #42011 into master
* refs/pull/42011/head:
mds: just respawn mds daemon when osd op requests timeout
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Mon, 28 Jun 2021 18:55:23 +0000 (11:55 -0700)]
Merge PR #41988 into master
* refs/pull/41988/head:
logrotate: include cephfs-mirror daemon
cephfs-mirror: reopen logs on SIGHUP
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Mon, 28 Jun 2021 18:52:46 +0000 (11:52 -0700)]
Merge PR #41917 into master
* refs/pull/41917/head:
mgr/mgr_util: switch using unshared cephfs connections whenever possible
Reviewed-by: Ramana Raja <rraja@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
Patrick Donnelly [Mon, 28 Jun 2021 18:50:22 +0000 (11:50 -0700)]
Merge PR #41849 into master
* refs/pull/41849/head:
mds: try to flush the mdlog when requesting the rdlock
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Patrick Donnelly [Mon, 28 Jun 2021 16:48:59 +0000 (09:48 -0700)]
Merge PR #42038 into master
* refs/pull/42038/head:
mds: fix compile warning
Reviewed-by: Kefu Chai <kchai@redhat.com>
zdover23 [Mon, 28 Jun 2021 15:16:49 +0000 (01:16 +1000)]
Merge pull request #42049 from zdover23/wip-doc-cephadm-serve-man-disable-auto-deploy-of-daemons
doc/cephadm: enrich "Disabling Automatic Deploy..."
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Ali Maredia [Mon, 28 Jun 2021 14:15:12 +0000 (10:15 -0400)]
Merge pull request #41681 from TRYTOBE8TME/wip-rgw-dpp
src/rgw: DPP addition
Reviewed-by: Ali Maredia <amaredia@redhat.com>
Kefu Chai [Mon, 28 Jun 2021 12:33:38 +0000 (20:33 +0800)]
Merge pull request #42050 from rzarzynski/wip-crimson-alienstore-fix-attrs-conv
crimson/os: fix memory corruption in AlienStore::get_attrs().
Reviewed-by: Kefu Chai <kchai@redhat.com>
Radoslaw Zarzynski [Sun, 27 Jun 2021 21:50:37 +0000 (21:50 +0000)]
crimson/os: fix memory corruption in AlienStore::get_attrs().
`FuturizedStore` and `ObjectStore` use different memory layout for
conveying object attributes: map of `bufferlists` and map of `bptrs`
respectively. Unfortunately, `AlienStore` was trying to solve this
mismatch with just a `reinterpret_cast`.
Very likely this problem was the root cause behind the observed
crashes in `PGBackend::load_metadata` like the following one:
```
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: DEBUG 2021-06-15 09:24:19,199 [shard 0] osd - peering_event(id=412, detail=PeeringEvent(from=7 pgid=5.14 sent=49 requested=49 evt=epoch_sent: 49 epoch_requested: 49 MInfoRec from 7 info: 5.14( v 45'2 (0'0,45'2] local-lis/les=48/49 n=0 ec=44/44 lis/c=48/44 les/c/f=49/45/0 sis=48) pg_lease_ack(ruub 19.176788330s))): complete
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: Segmentation fault on shard 0.
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: Backtrace:
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 0# 0x000055C99757FFBF in /usr/bin/ceph-osd
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 1# FatalSignal::signaled(int, siginfo_t const*) in /usr/bin/ceph-osd
2021-06-15T09:25:07.511 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in /usr/bin/ceph-osd
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 3# 0x00007F34BB632B20 in /lib64/libpthread.so.0
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 4# 0x000055C99263D4D2 in /usr/bin/ceph-osd
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 5# 0x000055C992740E47 in /usr/bin/ceph-osd
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 6# seastar::continuation<seastar::internal::promise_base_with_type<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > >, seastar::noncopyable_function<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > > (seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)>, seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >::then_wrapped_nrvo<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > >, seastar::noncopyable_function<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > > (seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > > >&&)> >(seastar::noncopyable_function<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > > (seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)>&&)::{lambda(seastar::internal::promise_base_with_type<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > >&&, seastar::noncopyable_function<crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)84> > >::_future<crimson::errorated_future_marker<std::unique_ptr<PGBackend::loaded_object_md_t, std::default_delete<PGBackend::loaded_object_md_t> > > > (seastar::future<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)>&, seastar::future_state<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >&&)#1}, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >::run_and_dispose() in 
/usr/bin/ceph-osd
2021-06-15T09:25:07.512 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 7# 0x000055C99CFD195F in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 8# 0x000055C99CFD6EA0 in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 9# 0x000055C99D188F0B in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 10# 0x000055C99CCE698A in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 11# 0x000055C99CCF0AAE in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 12# main in /usr/bin/ceph-osd
2021-06-15T09:25:07.513 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 13# __libc_start_main in /lib64/libc.so.6
2021-06-15T09:25:07.514 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: 14# _start in /usr/bin/ceph-osd
2021-06-15T09:25:07.514 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:19 smithi100 conmon[54917]: Fault at location: 0x31dfff8000
2021-06-15T09:25:07.514 INFO:journalctl@ceph.osd.3.smithi100.stdout:Jun 15 09:24:20 smithi100 podman[55356]: 2021-06-15 09:24:20.230341885 +0000 UTC m=+0.072958807 container died a3ea2a1d0a176286b93b8f5b94458982b9038e70d09128fb55f53b92976f0c42 (image=quay.ceph.io/ceph-ci/ceph@sha256:13ae953e3f83ee011d784d6eb9126fdc692f5bb688fe7d918be61ca7a7282b3c, name=ceph-43579b90-cdba-11eb-8c13-001a4aab830c-osd.3)
```
The fix deals with the issue by wrapping the `bptrs` in `bufferlists`.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Sebastian Wagner [Mon, 28 Jun 2021 09:46:34 +0000 (11:46 +0200)]
Merge pull request #41989 from zdover23/wip-doc-cephadm-serve-man-deploy-of-daemons-2021-06-24
doc/cephadm: enrich "deployment of daemons"
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Zac Dover [Mon, 28 Jun 2021 09:17:43 +0000 (19:17 +1000)]
doc/cephadm: enrich "Disabling Automatic Deploy..."
This PR rewrites and reformats the section "Disabling Automatic
Deployment of Daemons" in the "Service Management" chapter of the
cephadm guide.
I've rewritten some sentences, removed some "please"s, and added
some section titles so that the content in this is better
signposted.
Signed-off-by: Zac Dover <zac.dover@gmail.com>
Kefu Chai [Sun, 27 Jun 2021 14:31:23 +0000 (22:31 +0800)]
Merge pull request #41998 from kevinzs2048/arm64-rwl-cache-optional
ceph.spec.in, debian/rules: enable rbd-rwl-cache by default only on x86_64
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sun, 27 Jun 2021 11:20:31 +0000 (19:20 +0800)]
Merge pull request #42021 from tchaikov/wip-rpm-memory-constraint
ceph.spec.in: increase memory per core to 3000MB on SUSE distros
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>
Sage Weil [Sat, 26 Jun 2021 14:41:27 +0000 (10:41 -0400)]
Merge PR #41574 into master
* refs/pull/41574/head:
qa/tasks/vstart_runner: add LocalCluster.run
qa/tasks/cephfs/test_nfs: fiddle with sudo
mgr/nfs/export: some cleanup, minor refactoring
mgr/nfs/cluster: remove unused @cluster_setter
nfs/mgr: fix help message case
doc/cephfs/fs-nfs-export: add note about export update behavior
mgr/nfs: move user create/delete into helper
mgr/nfs: refactor _delete_user helper
mgr/nfs: refactor create_export_from_dict() helper
mgr/nfs: keep 'nfs export get' around for backward-compat
mgr/nfs: rename method
qa/tasks/cephfs/test_nfs: test new export via apply
doc/cephfs/fs-nfs-export: be consistent with cluster_id and _ vs -
mgr/nfs: addr -> client_addr for 'nfs export create ...'
mgr/nfs: fix tests
mgr/nfs: 'nfs export get' -> 'nfs export info'
mgr/nfs: binding -> pseudo_path
mgr/nfs: more revisions based on review
mgr/nfs: adjust NFSException errno arg
doc/cephfs: update 'nfs export {get,apply}' docs
mgr/nfs: merge FSExport back into ExportMgr
doc/radosgw/nfs: document mgr/nfs way to add/remove rgw exports
mgr/nfs: merge 'nfs export {update,import}' -> 'nfs export apply'
mgr/nfs: test export creation and list
mgr/nfs: test export_update (+ fixes)
mgr/nfs: test Export.validate(); several fixes
mgr/nfs: test that export <-> block+dict conversions go both ways
mgr/nfs: clean up test a bit
mgr/nfs/export: fix export validation
mgr/nfs/export: fix tests
mgr/nfs: handle option addr/client block in create_export()
mgr/nfs: allow multiple addrs for new exports
mgr/nfs: fix/finish rgw export
mgr/nfs/module: clusterid -> cluster_id
mgr/nfs/export: fix export_update_1 to type check
mgr/nfs/cluster: fix type error
mgr/nfs/export: wrap long lines
mgr/nfs: ExportMgr._delete_export only works for cephfs for now
mgr/nfs: Remove pool_ns from NFSCluster
mgr/nfs: Remove ExportMgr.rados_namespace
mgr/nfs: flake8
mgr/nfs: Add type checking
mgr/nfs: Add __eq__ method to Export
mgr/nfs: Add some compatibility to mgr/dashboard
mgr/nfs: Fix whitespace handling
mgr/nfs: Copy unit tests from mgr/dashboard
mgr/nfs: partially implement rgw export support
mgr/nfs: abstract FSAL; add RGWFSAL
mgr/nfs: refactor to merge 'update' and 'import' code
mgr/nfs: add 'nfs export import' command
mgr/nfs: refactor 'nfs export update' and export validation
mgr/nfs: fix _fetch_export to distinguish between clusters
mgr/nfs: move export ganesha conf translation into caller
mgr/nfs: name nfs cephfs client key 'nfs.{cluster_id}.{export_id}'
mgr/nfs: add --addr to 'nfs export create'
mgr/nfs: add --squash to 'nfs export create'
mgr/nfs/export_utils: include false but non-None items in config
vstart.sh: enable nfs module
mgr/cephadm: nfs: drop attr_expiration_time from top-level config
mgr/cephadm: remove Dir_Chunk = 0
Reviewed-by: Michael Fritch <mfritch@suse.com>
Kefu Chai [Sat, 26 Jun 2021 14:18:14 +0000 (22:18 +0800)]
Merge pull request #41937 from liewegas/mgr-crash
mgr: generate crash dumps for Python exceptions in mgr modules
Reviewed-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Sat, 26 Jun 2021 14:17:30 +0000 (22:17 +0800)]
Merge pull request #41946 from liewegas/fix-51294
mgr/devicehealth: fix _get_device_metrics ValueError
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Fri, 25 Jun 2021 23:16:19 +0000 (19:16 -0400)]
qa/tasks/vstart_runner: add LocalCluster.run
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Fri, 25 Jun 2021 19:08:03 +0000 (15:08 -0400)]
qa/tasks/cephfs/test_nfs: fiddle with sudo
- no sudo for 'ceph' commands
- explicit sudo for _sys_cmd (things like 'rados' don't need sudo!)
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 23 Jun 2021 16:42:17 +0000 (12:42 -0400)]
mgr/nfs/export: some cleanup, minor refactoring
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Thu, 24 Jun 2021 20:05:14 +0000 (16:05 -0400)]
mgr/nfs/cluster: remove unused @cluster_setter
Signed-off-by: Sage Weil <sage@newdream.net>
Kefu Chai [Sat, 26 Jun 2021 01:08:34 +0000 (09:08 +0800)]
Merge pull request #41977 from rzarzynski/wip-crimson-common-print-more-on-crash
crimson/common: dump more on faults
Reviewed-by: Samuel Just <sjust@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Patrick Donnelly [Sat, 26 Jun 2021 00:54:18 +0000 (17:54 -0700)]
mds: fix compile warning
../src/mds/Server.cc: In member function ‘void Server::handle_set_vxattr(MDRequestRef&, CInode*)’:
../src/mds/Server.cc:5703:18: warning: unused variable ‘realm’ [-Wunused-variable]
SnapRealm *realm = cur->find_snaprealm();
^~~~~
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Sage Weil [Thu, 24 Jun 2021 16:41:18 +0000 (12:41 -0400)]
nfs/mgr: fix help message case
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 23 Jun 2021 16:46:07 +0000 (12:46 -0400)]
doc/cephfs/fs-nfs-export: add note about export update behavior
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Tue, 22 Jun 2021 16:25:44 +0000 (12:25 -0400)]
mgr/nfs: move user create/delete into helper
- Do user create or delete via a helper
- Defer until after we have validated the Export (on create or update)
- Support updates to user_id, which is needed to keep the naming consistent
and to also support changing the bucket, since the user_id is derived
from that.
Signed-off-by: Sage Weil <sage@newdream.net>
Ernesto Puerta [Fri, 25 Jun 2021 18:45:28 +0000 (20:45 +0200)]
Merge pull request #41838 from p-se/grafana-clean-up
monitoring: Clean up Grafana dashboards
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: jan--f <NOT@FOUND>
Reviewed-by: p-se <NOT@FOUND>
Reviewed-by: Paul Cuzner <pcuzner@redhat.com>
Sage Weil [Fri, 25 Jun 2021 17:48:45 +0000 (13:48 -0400)]
qa/suites/rados/mgr: whitelist module crash during selftest
One of the selftests triggers an exception from serve().
Signed-off-by: Sage Weil <sage@newdream.net>
Ernesto Puerta [Fri, 25 Jun 2021 16:48:34 +0000 (18:48 +0200)]
Merge pull request #41721 from aaryanporwal/telemetry-ident-fix
mgr/dashboard: telemetry activate: show ident fields when checked
Reviewed-by: aaryanporwal <NOT@FOUND>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Avan Thakkar <athakkar@redhat.com>
Reviewed-by: Nizamudeen A <nia@redhat.com>
Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com>
Daniel Gryniewicz [Fri, 25 Jun 2021 16:00:37 +0000 (12:00 -0400)]
Merge pull request #41991 from dang/wip-dang-bucket-delete
RGW - Bucket Remove Op: Pass in user
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Neha Ojha [Fri, 25 Jun 2021 15:48:45 +0000 (08:48 -0700)]
Merge pull request #41993 from ronen-fr/wip-ronenf-50346
osd/scrub: replace a ceph_assert() with a test
Reviewed-by: Neha Ojha <nojha@redhat.com>
Kefu Chai [Fri, 25 Jun 2021 13:02:47 +0000 (21:02 +0800)]
Merge pull request #42024 from rzarzynski/wip-crimson-load_obc_nocpy
crimson/osd: don't extra copy hobject in PG::load_head_obc().
Reviewed-by: Kefu Chai <kchai@redhat.com>
Radoslaw Zarzynski [Wed, 23 Jun 2021 09:25:41 +0000 (09:25 +0000)]
crimson/osd: don't extra copy hobject in PG::load_head_obc().
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Kefu Chai [Fri, 25 Jun 2021 05:29:23 +0000 (13:29 +0800)]
ceph.spec.in: increase memory per core to 3000MB on SUSE distros
in the KVM instance offered by OBS, we have
[ 346s] + cat /proc/meminfo
[ 347s] MemTotal: 10167736 kB
[ 347s] MemFree: 4983964 kB
[ 347s] MemAvailable: 9826800 kB
[ 347s] Buffers: 85856 kB
[ 347s] Cached: 4615192 kB
[ 347s] SwapCached: 0 kB
...
[ 347s] SwapTotal: 2097148 kB
and its number of hardware threads is
[ 346s] ++ /usr/bin/getconf _NPROCESSORS_ONLN
[ 346s] + _threads=8
so ($MemTotal+$SwapTotal)/1024/2600 = 4.6, which is less
than the # of threads, so "4" was used for the number of jobs.
but per our recent observation in
38be14bc0fa32be6877dea08ebd35495d39e464f, some compiling jobs could
take up to 3GB. in the OOM failure in OBS, we had
[24915s] [24848.843594] Out of memory: Killed process 16894 (cc1plus) total-vm:4293756kB, anon-rss:2970012kB, file-rss:0kB, shmem-rss:0kB, UID:399 pgtables:8324kB oom_score_adj:0
where 4GiB memory was allocated, in which 3GiB was mapped into
memory. this matches with our findings.
in this change, the memory per core is bumped up to 3000MB
in the hope of addressing the OOM. the downside of this change is
that it would take even longer to finish the build if the
building host is limited in memory.
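The arithmetic above can be checked with a small sketch. This is not the spec file's actual logic: the function name `jobs_for_build`, its parameters, and the floor of one job are assumptions made for illustration. Only the formula and the figures come from this commit message.

```python
def jobs_for_build(mem_total_kb, swap_total_kb, threads, mb_per_job):
    """Hypothetical helper: cap parallel build jobs by available memory."""
    # (MemTotal + SwapTotal) is reported in kB; convert to MB first.
    total_mb = (mem_total_kb + swap_total_kb) / 1024
    # How many jobs fit if each one may need mb_per_job of memory.
    by_memory = int(total_mb // mb_per_job)
    # Never exceed the hardware threads; the minimum of 1 is an assumption.
    return max(1, min(threads, by_memory))


# Figures taken from the OBS KVM instance quoted above.
print(jobs_for_build(10167736, 2097148, 8, 2600))  # 4
print(jobs_for_build(10167736, 2097148, 8, 3000))  # 3
```

With the OBS numbers, raising the per-job budget from 2600MB to 3000MB drops the job count from 4 to 3, which is the longer build time mentioned as the downside.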
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Fri, 25 Jun 2021 09:01:11 +0000 (17:01 +0800)]
Merge pull request #41615 from tchaikov/wip-avl-alloc-ff
os/bluestore/AvlAllocator: introduce bluestore_avl_alloc_ff_max_* options
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Kefu Chai [Fri, 25 Jun 2021 06:57:31 +0000 (14:57 +0800)]
Merge pull request #38939 from ronen-fr/wip-ronenf-scrub-blocked
osd: issue a warning if the scrubber blocks for too long on an object
Reviewed-by: David Zafman <dzafman@redhat.com>
Kefu Chai [Fri, 25 Jun 2021 06:51:25 +0000 (14:51 +0800)]
Merge pull request #40850 from varshar16/wip-vstart-support-cephadm-rgw
src/vstart: deploy rgw service with cephadm and create rgw user with system flag
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
Samuel Just [Fri, 25 Jun 2021 06:08:02 +0000 (23:08 -0700)]
Merge pull request #42020 from athanatos/sjust/wip-cache-assert
crimson/os/seastore: transaction conflict handling improvements
Reviewed-by: Yingxin Cheng <yingxin.cheng@intel.com>
Kefu Chai [Fri, 25 Jun 2021 04:52:23 +0000 (12:52 +0800)]
Merge pull request #42003 from cyx1231st/wip-seastore-fix-onode-tree
crimson/onode-staged-tree: fix ref-counter assert failures
Reviewed-by: Kefu Chai <kchai@redhat.com>
Radoslaw Zarzynski [Tue, 22 Jun 2021 14:24:22 +0000 (14:24 +0000)]
crimson/common: dump entire siginfo on segmentation fault.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski [Tue, 22 Jun 2021 14:23:02 +0000 (14:23 +0000)]
crimson/common: FatalSignal::signaled() takes siginfo by a reference.
There is no point in having a distinct `nullptr` value.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Radoslaw Zarzynski [Tue, 22 Jun 2021 14:15:40 +0000 (14:15 +0000)]
crimson/common: dump /proc/self/maps on crash.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Xiubo Li [Thu, 24 Jun 2021 06:41:10 +0000 (14:41 +0800)]
mds: just respawn mds daemon when osd op requests timeout
Fixes: https://tracker.ceph.com/issues/51280
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Kevin Zhao [Thu, 24 Jun 2021 00:00:03 +0000 (08:00 +0800)]
ceph.spec.in, debian/rules: Set rbd-rwl-cache optional on arm64 and ppc64le
Make the rwl cache optional on arm64 and ppc64le, as PMDK is not well supported there.
Currently, PMDK supports only 64-bit Linux* and Windows* on x86.
Reference:
1. Experimental support on Arm64, but lacking librpmem:
See: https://github.com/pmem/pmdk#experimental-support-for-64-bit-arm
2. No RPM for PMDK on Arm64:
See: https://bugzilla.redhat.com/show_bug.cgi?id=1340635
3. > Does PMDK support ARM64*?
> Currently only 64-bit Linux* and Windows* on x86 are supported.
See: https://software.intel.com/content/www/us/en/develop/articles/persistent-memory-faq.html
4. `make check` fails on Arm64
See: https://github.com/pmem/pmdk/issues/5255
Fixes: https://tracker.ceph.com/issues/51339
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
Kefu Chai [Fri, 25 Jun 2021 02:59:55 +0000 (10:59 +0800)]
Merge pull request #41889 from ChenFanTony/mkfs_wait_complete
osd/OSD: mkfs needs to wait for the transaction to finish completely
Reviewed-by: Kefu Chai <kchai@redhat.com>
Yingxin Cheng [Thu, 24 Jun 2021 07:50:18 +0000 (15:50 +0800)]
crimson/onode-staged-tree: reset root node after lookup
Otherwise there could be unexpected references that will break the
asserts when removing nodes during insert/delete.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Yingxin Cheng [Thu, 24 Jun 2021 07:49:23 +0000 (15:49 +0800)]
crimson/onode-staged-tree: add missing mutable keyword
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Kefu Chai [Fri, 25 Jun 2021 00:27:41 +0000 (08:27 +0800)]
Merge pull request #42004 from tchaikov/wip-crimson-osd-fsm
crimson/osd: shutdown if osdmap forces us to do so
Reviewed-by: Chunmei Liu <chunmei.liu@intel.com>
Reviewed-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
Samuel Just [Thu, 24 Jun 2021 23:25:54 +0000 (16:25 -0700)]
seastore/.../staged_fltree/node: check for conflict in Node::load
This will be unnecessary once converted to interruptible_future.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 24 Jun 2021 23:22:43 +0000 (16:22 -0700)]
crimson/os/seastore/lba_manager/btree/lba_btree_node_impl: add debugging
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 24 Jun 2021 23:28:10 +0000 (16:28 -0700)]
seastore/.../node_extent_manager/seastore: detect transaction conflicts in read_extent
This won't be necessary once converted to interruptible_future.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 24 Jun 2021 22:24:09 +0000 (15:24 -0700)]
crimson/os/seastore/cache: mark conflict in get_extent
After wait_io, the extent may have been mutated again, so it may be
invalid. Check in the caller and mark the transaction conflicted as
needed.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 24 Jun 2021 23:27:34 +0000 (16:27 -0700)]
crimson/os/seastore/transaction: expose is_conflicted
Useful for components not yet converted to use interruptible_future.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 24 Jun 2021 20:19:47 +0000 (13:19 -0700)]
Merge pull request #41963 from athanatos/sjust/wip-interruptible-tm
crimson/os/seastore: refactor transaction_manager and below to use interruptible_future
Reviewed-by: Kefu Chai <kchai@redhat.com>
Sage Weil [Sun, 20 Jun 2021 22:49:27 +0000 (17:49 -0500)]
mgr/devicehealth: fix _get_device_metrics ValueError
This appears to have broken with
abd35d47696c208990355395d48c1c1e261de95c
The SQL OR doesn't work because in the case that sample is passed,
_t2epoch(min_sample) is 0 and the 0 <= time portion of the expression
is always true.
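A minimal sketch of this failure mode, using `sqlite3` from the Python standard library. The table `metrics`, its `time` column, and both function names are hypothetical and are not taken from the devicehealth module; the sketch only illustrates why an OR with a lower bound of 0 matches every row.

```python
import sqlite3


def fetch_with_or(conn, sample_epoch, min_epoch):
    # Combined predicate: when only a sample is requested, min_epoch is 0,
    # so "? <= time" is true for every row and the OR selects everything.
    return conn.execute(
        "SELECT time FROM metrics WHERE time = ? OR ? <= time ORDER BY time",
        (sample_epoch, min_epoch),
    ).fetchall()


def fetch_split(conn, sample_epoch=None, min_epoch=0):
    # One possible remedy (an assumption, not necessarily the actual fix):
    # choose exactly one predicate depending on what was passed.
    if sample_epoch is not None:
        query, args = "SELECT time FROM metrics WHERE time = ? ORDER BY time", (sample_epoch,)
    else:
        query, args = "SELECT time FROM metrics WHERE ? <= time ORDER BY time", (min_epoch,)
    return conn.execute(query, args).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (time INTEGER)")
conn.executemany("INSERT INTO metrics VALUES (?)", [(100,), (200,), (300,)])

print(fetch_with_or(conn, 200, 0))           # all three rows
print(fetch_split(conn, sample_epoch=200))   # only the requested sample
```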
Fixes: https://tracker.ceph.com/issues/51294
Signed-off-by: Sage Weil <sage@newdream.net>
Samuel Just [Thu, 24 Jun 2021 17:08:34 +0000 (17:08 +0000)]
test/crimson/test_interruptible_future: disable handle_error
Seems to cause a linker hang with gcc-9 in bionic.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Sat, 19 Jun 2021 07:43:27 +0000 (00:43 -0700)]
crimson/os/seastore/transaction_manager: pass t by ref to submit_transaction
Signed-off-by: Samuel Just <sjust@redhat.com>
Casey Bodley [Thu, 24 Jun 2021 16:17:53 +0000 (12:17 -0400)]
Merge pull request #39934 from Jeegn-Chen/wip-tracker-49128
rgw: write meta of a MP part to a correct pool
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Casey Bodley [Thu, 24 Jun 2021 16:16:19 +0000 (12:16 -0400)]
Merge pull request #41739 from liewegas/rgw-realm-metadata
radosgw: include realm_{id,name} in service map
Reviewed-by: Ernesto Puerta <epuertat@redhat.com>
Reviewed-by: Alfonso Martínez <almartin@redhat.com>
Reviewed-by: Casey Bodley <cbodley@redhat.com>
Daniel Gryniewicz [Wed, 23 Jun 2021 15:31:22 +0000 (11:31 -0400)]
RGW - Bucket Remove Op: Pass in user
When a bucket remove op is called on the non-master zone, the op is
forwarded to the master zone, but this needs a user, so pass the user
in.
Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
zdover23 [Thu, 24 Jun 2021 13:51:30 +0000 (23:51 +1000)]
Merge pull request #41994 from anthonyeleven/anthonyeleven/adjust-rados-operations-pools
doc/rados/operations: Update pools.rst
Reviewed-by: Zac Dover <zac.dover@gmail.com>
Ilya Dryomov [Thu, 24 Jun 2021 12:48:13 +0000 (14:48 +0200)]
Merge pull request #42005 from trociny/wip-51342
test/librbd: use really invalid domain
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Kefu Chai [Thu, 24 Jun 2021 11:52:51 +0000 (19:52 +0800)]
ceph.spec.in: enable --with-rbd_ssd_cache by default
unlike rbd_rwl_cache, rbd_ssd_cache does not depend on pmdk (libpmem),
so let's enable it on all supported architectures and rpm-based distros.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Mykola Golub [Thu, 24 Jun 2021 10:23:21 +0000 (11:23 +0100)]
test/librbd: use really invalid domain
in TestMockMigrationHttpClient.OpenResolveFail
Fixes: https://tracker.ceph.com/issues/51342
Signed-off-by: Mykola Golub <mgolub@suse.com>
Kefu Chai [Thu, 24 Jun 2021 11:10:22 +0000 (19:10 +0800)]
Merge pull request #41828 from tchaikov/wip-btree-alloc
os/bluestore: add BtreeAllocator
Reviewed-by: Adam Kupczyk <akupczyk@redhat.com>
Kefu Chai [Thu, 24 Jun 2021 07:57:53 +0000 (15:57 +0800)]
crimson/osd: document fsm of crimson osd
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 24 Jun 2021 06:34:41 +0000 (14:34 +0800)]
crimson/osd: mark more OSD methods private
they are internal helpers, not part of the public interface of the OSD
class.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 24 Jun 2021 06:26:07 +0000 (14:26 +0800)]
crimson/osd: shutdown if osdmap forces us to do so
mirror the change introduced by
5dbae13ce0f5b0104ab43e0ccfe94f832d0e1268
in classic osd.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kefu Chai [Thu, 24 Jun 2021 06:16:07 +0000 (14:16 +0800)]
crimson/osd: use discard_result() in stop()
we don't actually care about the result of the messengers' shutdown() when
shutting down the daemon, and we don't handle the failures.
Signed-off-by: Kefu Chai <kchai@redhat.com>
Kalpesh Pandya [Tue, 1 Jun 2021 09:15:14 +0000 (14:45 +0530)]
rgw: DPP addition to log messages
The following files are the focus of this PR:
1. rgw_lc.cc
2. rgw_rados.cc
3. rgw_op.cc
4. services/svc_rados.cc
5. rgw_object_expirer_core.cc
6. rgw_quota.cc
Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
Venky Shankar [Thu, 24 Jun 2021 04:57:29 +0000 (00:57 -0400)]
logrotate: include cephfs-mirror daemon
Fixes: http://tracker.ceph.com/issues/51318
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Venky Shankar [Wed, 23 Jun 2021 06:59:13 +0000 (02:59 -0400)]
cephfs-mirror: reopen logs on SIGHUP
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Patrick Donnelly [Wed, 23 Jun 2021 20:24:58 +0000 (13:24 -0700)]
Merge PR #41935 into master
* refs/pull/41935/head:
mds: avoid journaling overhead for ceph.dir.subvolume for no-op case
Reviewed-by: Venky Shankar <vshankar@redhat.com>
Samuel Just [Sat, 19 Jun 2021 07:40:34 +0000 (00:40 -0700)]
crimson/os/seastore: convert transaction_manager internally to use interruptible_future
Consumers of TransactionManager use wrapper classes InterruptedTransactionManager
and InterruptedTMRef for now until we convert them.
Also converts users of InterruptedCache etc. and removes them.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Mon, 14 Jun 2021 23:25:14 +0000 (23:25 +0000)]
test/crimson/seastore/test_seastore_cache: use cache directly
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Fri, 11 Jun 2021 00:21:26 +0000 (17:21 -0700)]
crimson/os/seastore/lba_manager/btree: convert to use interruptible_future
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Tue, 22 Jun 2021 00:10:29 +0000 (17:10 -0700)]
crimson/os/seastore/cache: convert to use interruptible future
Introduces InterruptedCache wrapper for now for components not yet
converted.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 3 Jun 2021 21:51:03 +0000 (14:51 -0700)]
crimson/os/seastore/transaction: introduce TransactionConflictCondition interruptor
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 3 Jun 2021 21:43:37 +0000 (14:43 -0700)]
crimson/os/seastore/cache.h: remove unused get_extents
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 12 May 2021 09:04:16 +0000 (09:04 +0000)]
crimson/os/seastore: invalidate transaction referencing invalid extents
Modify read_set to retain a reverse mapping from extents back to
transactions and use it to update Transaction::conflicted upon
invalidation.
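The reverse-mapping idea can be sketched in a few lines of Python. This is only an illustration of the technique: the class names, the `readers` set, and the method names are hypothetical, and the real seastore code is C++ with a different structure.

```python
class Extent:
    def __init__(self, addr):
        self.addr = addr
        self.valid = True
        # Reverse mapping: every transaction that has this extent in its read set.
        self.readers = set()

    def invalidate(self):
        # Mark the extent invalid and flag every transaction that read it.
        self.valid = False
        for txn in self.readers:
            txn.conflicted = True
        self.readers.clear()


class Transaction:
    def __init__(self, name):
        self.name = name
        self.conflicted = False
        self.read_set = set()

    def add_to_read_set(self, extent):
        # Record the forward mapping and the reverse mapping together.
        self.read_set.add(extent)
        extent.readers.add(self)


e1, e2 = Extent(0x1000), Extent(0x2000)
t1, t2 = Transaction("t1"), Transaction("t2")
t1.add_to_read_set(e1)
t2.add_to_read_set(e2)

e1.invalidate()
print(t1.conflicted, t2.conflicted)  # True False
```

Keeping the reverse mapping means invalidation can flag the affected transactions directly instead of scanning every open transaction's read set.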
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Mon, 21 Jun 2021 23:57:48 +0000 (16:57 -0700)]
test/crimson/test_interruptible_future: add tests for errorated behavior
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Tue, 15 Jun 2021 00:24:41 +0000 (17:24 -0700)]
crimson/common/interruptible_future: add interruptor::base_ertr
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Fri, 11 Jun 2021 00:03:37 +0000 (17:03 -0700)]
crimson/common/interruptible_future: add safe_then_interruptible for multiple error handlers
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Fri, 18 Jun 2021 06:19:16 +0000 (23:19 -0700)]
crimson/common/interruptible_future: refactor handle_interruption
handle_interruption can't really be validly used outside of
with_interruption_cond. Make it private, and adjust is_interruption
to not require an instance.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 17 Jun 2021 21:06:41 +0000 (14:06 -0700)]
crimson/common/interruptible_future: remove enable/disable_interruption
with_interruption_cond needs to check the condition on the way in.
call_with_interruption_impl already has the required machinery, so
let's just use it and dispense with the other helpers.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 9 Jun 2021 23:09:09 +0000 (16:09 -0700)]
crimson/common/interruptible_future: remove unnecessary make_ready_future template
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Mon, 7 Jun 2021 20:20:46 +0000 (13:20 -0700)]
crimson/common/interruptible_future: add ready|exception_future_marker constructors
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 3 Jun 2021 21:50:22 +0000 (14:50 -0700)]
crimson/common/interruptible_future: introduce future<> helper to interruptor
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 3 Jun 2021 21:49:56 +0000 (14:49 -0700)]
crimson/common/interruptible_future: introduce si_then as shorthand for safe_then_interruptible
safe_then_interruptible is too long for common use within seastore.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 2 Jun 2021 03:06:44 +0000 (20:06 -0700)]
crimson/common/interruptible_future: introduce with_interruption_to_error
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 10 Jun 2021 00:34:21 +0000 (17:34 -0700)]
crimson/common/interruptible_future: add handle_interruption
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 10 Jun 2021 00:32:44 +0000 (17:32 -0700)]
crimson/common/interruptible_future: add futurize::invoke
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 2 Jun 2021 03:05:12 +0000 (20:05 -0700)]
crimson/common/interruptible_future: add common errorator forwards
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 2 Jun 2021 02:52:27 +0000 (19:52 -0700)]
crimson/common/errorator: add futurize::apply
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 10 Jun 2021 00:07:33 +0000 (17:07 -0700)]
common/interruptible_future: use errorated future as core_type, fix constructor
There is no real reason to remember the underlying seastar::future type; we
should only be interacting with the wrapped errorated future type.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Wed, 2 Jun 2021 03:03:35 +0000 (20:03 -0700)]
test/crimson/test_interruptible_future: using namespace crimson
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Mon, 7 Jun 2021 20:14:45 +0000 (13:14 -0700)]
crimson/common/fixed_kv_node_layout: add reference type for do_for_each implementations
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Tue, 22 Jun 2021 00:10:23 +0000 (17:10 -0700)]
crimson/os/seastore/cache: fix typo in comment
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Thu, 3 Jun 2021 21:51:43 +0000 (14:51 -0700)]
crimson/os/seastore/cache: rename retire_extent_addr for addr overload
Makes the InterruptibleCache bit in a later patch simpler, and is somewhat
clearer.
Signed-off-by: Samuel Just <sjust@redhat.com>
Samuel Just [Sat, 19 Jun 2021 09:05:44 +0000 (02:05 -0700)]
crimson/os/seastore/lba_manager: make complete_transaction void
This really can't result in mutations (the transaction already committed!)
and presently doesn't require any IO at all. Just make it void.
Signed-off-by: Samuel Just <sjust@redhat.com>