osd: remove invalid put on message
message converted to use smart pointer, the put is extra step that
causing the double put and valgrind error
the regression was introduced by d4a71c3
Kefu Chai [Sat, 6 Aug 2022 00:24:12 +0000 (08:24 +0800)]
mgr/dashboard: bump up teuthology
to include the fix of e7c5d67e10fe29da22180f9e09b8973ae166c8fc,
see https://github.com/ceph/teuthology/pull/1746.
to address the test failure on ubuntu jammy. where we have python3.10
Kefu Chai [Fri, 5 Aug 2022 08:40:41 +0000 (16:40 +0800)]
cmake: remove spaces in macro used for compiling cython code
we are facing following FTBFS on jammy + GCC-11.2 + Cython 0.29 +
CMake 3.22:
creating /home/jenkins-build/build/workspace/ceph-api/build/lib/cython_modules/temp.linux-x86_64-3.10/home/jenkins-build/build/workspace/ceph-api/build/src/pybind/cephfs
compile options: '-I/usr/include/python3.10 -I/usr/include/python3.10 -c'
extra options: '-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -iquote/home/jenkins-build/build/workspace/ceph-api/src/include -w -Dvoid0=dead_function(void) -D__Pyx_check_single_interpreter(ARG)=ARG ## 0 -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2'
cc: /home/jenkins-build/build/workspace/ceph-api/build/src/pybind/cephfs/cephfs.c
cc: warning: ##: linker input file unused because linking not done
cc: error: ##: linker input file not found: No such file or directory
cc: warning: 0: linker input file unused because linking not done
cc: error: 0: linker input file not found: No such file or directory
it seems cython is not able to escape the space in the "extra options"
anymore, so the "##" and "0" are considered as object files passed to
compiler in addition to cephfs.c.
in this change the spaces are removed to help cython to make the right
decision.
ceph-volume unit tests shouldn't actually create contents on the
filesystem from where it runs (even though they are written in a tmp
dir), let's use pyfakefs.
ceph-volume shouldn't report devices `/dev/sdy` and `/dev/sdz`, they will never be
available in such a scenario. Considering this, in a host with a bunch of mpath devices
it will pollute the inventory output.
When the class `Device` is instantiated with a path instead of a
block device, it fails like following.
```
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 130, in __init__
self._parse()
File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 233, in _parse
self.ceph_device = disk.has_bluestore_label(self.path)
File "/usr/lib/python3.6/site-packages/ceph_volume/util/disk.py", line 906, in has_bluestore_label
with open(device_path, "rb") as fd:
IsADirectoryError: [Errno 21] Is a directory: '/var/lib/ceph/osd/ceph-0/'
```
passing a path instead of a block device is valid, `simple scan` needs it.
Zack Cerza [Tue, 21 Jun 2022 17:28:30 +0000 (11:28 -0600)]
ceph-volume: Rename env var; add warning
So that we can have a nice big warning that fires once per invocation, I
think using a callable class with a class attribute seems like a decent
approach. A closure could work too.
Zack Cerza [Tue, 17 May 2022 17:29:02 +0000 (11:29 -0600)]
ceph-volume: Optionally consume loop devices
A similar proposal was rejected in #24765; I understand the logic
behind the rejection, but this will allow us to run Ceph clusters on
machines that lack disk resources for testing purposes. We just need to
make it impossible to accidentally enable, and make it clear it is
unsupported.
ceph-volume: do not call get_device_vgs() per devices
let's call `ceph_volume.api.lvm.get_all_devices_vgs` only one time instead
so we avoid a bunch of subprocess calls that slow down the process when running
`ceph-volume inventory` command.
ceph-volume: fix fast device alloc size on mulitple device
The size computed by get_physical_fast_allocs() was wrong when the
function had multiple devices to treat.
For instance if there is 4 OSDs and 2 fast devices of each 10G while
allocating 2 slots per fast devvices. The behavior before was that each
slot would be 2.5G meaning that both fast devices would half full. The
behavior now is that each slot will take 5G so that the fast devices
would be full.
Fixes: https://tracker.ceph.com/issues/56031 Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch>
(cherry picked from commit d0f9e93914e2b7feac41a634311d74c146c8868b)
`lsblk_all()` should return an empty dict `{}` if nothing was found.
If we raise `RuntimeError()` then the loop in `scan.Scan.main` will stop
and make ceph-volume fails because we don't try to catch this exception.
`scan.Scan.main()` has its own logic in order to detect the given path
is a ceph-disk created OSD anyway.
rgw: maintain object instance within RGWRadosObject::get_obj_state method
It seems that the object instance was being stripped from the rgw_obj
inadvertently wihtin the RGWRadosObject::get_obj_state method. This
change fixes some S3 behavior related to concurrency and ECANCELED errors.
We don't backport PRs merged into doc/releases. Therefore, when one browses to an older Ceph release version on docs.ceph.com (e.g., https://docs.ceph.com/en/pacific/), the information is out of date at best.
The doc/releases page is only accurate if browsing https://docs.ceph.com/en/latest/, for example.
So this post_checkout command will make sure we've checked out doc/releases from main before building and publishing.
when option --gpg-url is specified, the name used for the gpg filename is missing and throws an exception
this adds the string "manual" to the gpg key : /etc/apt/trusted.gpg.d/ceph.manual.gpg
Adam King [Tue, 26 Jul 2022 13:55:05 +0000 (09:55 -0400)]
mgr/cephadm: clear error message when resuming upgrade
the message field in the output of "ceph orch upgrade status"
will first take the value of the error field of the UpgradeState,
and if only if it blank/None, display an info string we periodically
update throughout the upgrade with useful info such as that
we're upgrading a daemon of a particular type or pulling an image
on a certain host. When an upgrade fails, we set the error field
of the UpgradeState, pause the upgrade and raise a health warning.
Sometimes, the user is able to resolve the issue and simply resume
the upgrade. The issue here is, in that case, the error field of
the UpgradeState is still set, so instead of seeing the useful info
messages, it will continue to display an error message that may
no longer be relevant. By emptying the error field of the UpgradeState
when upgrades are resumed, we return to normal behavior of
displaying the info string, and will only show another error message
if another error actually occurs.
Marcus Watts [Thu, 7 Jul 2022 07:33:31 +0000 (03:33 -0400)]
rgw: better tenant id from the uri on anonymous access
When anonymous tries access public bucket, it gets 404,
because rgw doesn't check tenant correctly.
A previous fix for this broke legacy implicit tenants,
because it didn't check for anonymous access. This version
restricts its behavior to the anonymous user.
Fixes: https://tracker.ceph.com/issues/48001 https://tracker.ceph.com/issues/48382
Original fix by
Author: Rafał Wądołowski <rafal@rafal.net.pl> Signed-off-by: Rafał Wądołowski <rwadolowski@cloudferro.com>
This fix Signed-off-by: Marcus Watts <mwatts@redhat.com>
(cherry picked from commit b5880caa505e2df7ff035537c19a8be3a3a8bb8f)
test/encoding: verify that e.what() starts with expected str
boost changes the way how it prints boost::system::system_error in
boost 1.79 -- it appends the stringified error_category at end of
exception::what(), and our buffer::malformed_input is a subclass
of boost::system::system_error.
so we cannot just compare the return value of what() with the
expected string, to be more future proof, let's check if i
starts with the expected string instead.
Deferred writes can sometimes update regions that are no longer mapped to any object.
This cannot happen when BlueStore is running, as blobs are being held,
and allocations are not released until deferred op is executed.
However in case of restart allocations that deferred is targetting are already freed.
Deferred replay is done on BlueStore bootup, before any new object can be allocated,
so no collision with object is possible.
But BlueFS can allocate space from block with deferred ops still pending.
Fixes: https://tracker.ceph.com/issues/54547 Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 2afb951a490aac9ca4d01614867396cbd06b793d)
Conflicts: src/os/bluestore/BlueStore.cc
Modified non-critical BlueFS call foreach_block_extents to get_block_extents.
Kefu Chai [Mon, 28 Feb 2022 13:46:39 +0000 (21:46 +0800)]
include/buffer: include <memory>
to address following FTBFS:
/usr/bin/ccache /usr/bin/clang++-13 -DBOOST_ALL_NO_LIB -DBOOST_ASIO_DISABLE_CONCEPTS -DBOOST_ASIO_DISABLE_THREAD_KEYWORD_EXTENSION -DBOOST_ASIO_USE_TS_EXECUTOR_AS_DEFAULT -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_T$
In file included from /var/ssd/ceph/src/crimson/os/seastore/seastore_types.cc:4:
In file included from /var/ssd/ceph/src/crimson/os/seastore/seastore_types.h:14:
In file included from /var/ssd/ceph/src/include/denc.h:47:
/var/ssd/ceph/src/include/buffer.h:98:37: error: no template named 'unique_ptr' in namespace 'std'; did you mean 'boost::movelib::unique_ptr'?
struct unique_leakable_ptr : public std::unique_ptr<T, ceph::nop_delete<T>> {
^~~~~~~~~~~~~~~
boost::movelib::unique_ptr
/opt/ceph/include/boost/move/unique_ptr.hpp:354:7: note: 'boost::movelib::unique_ptr' declared here
class unique_ptr
^
libcephsqlite: ceph-mgr crashes when compiled with gcc12
regex in libcephsqlite, when compiled with GCC12 treats '-' as a range
operator resulting in the following error.
"Invalid start of '[x-x]' range in regular expression"
Kotresh HR [Fri, 4 Feb 2022 09:58:39 +0000 (15:28 +0530)]
qa: validate subvolume discover on upgrade
Validate subvolume discover on upgrade from
legacy subvolume to v1. The handcrafted
`.meta' file on legacy subvolume root should
not be used for any subvolume apis like getpath,
authorize.
Kotresh HR [Fri, 4 Feb 2022 09:25:03 +0000 (14:55 +0530)]
mgr/volumes: Fix subvolume discover during upgrade
Fixes the subvolume discover to use the correct
metadata file after an upgrade from legacy subvolume
to v1. The fix makes sure, it doesn't use the
handcrafted metadata file placed in the subvolume
root of legacy subvolume.
Co-authored-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@cern.ch> Co-authored-by: Dan van der Ster <daniel.vanderster@cern.ch> Co-authored-by: Ramana Raja <rraja@redhat.com> Signed-off-by: Kotresh HR <khiremat@redhat.com>
(cherry picked from commit 7eba9cab6cfb9a13a84062177d7a0fa228311e13)
(cherry picked from commit f8c04135150a7fb3c43607b43a8214e0d57547bc)
myoungwon oh [Tue, 28 Jun 2022 04:42:21 +0000 (13:42 +0900)]
osd: return ENOENT if pool information is invalid during tier-flush
During tier-flush, OSD sends reference increase message to target OSD.
At this point, sending message with invalid pool information (e.g., deleted pool)
causes unexpected behavior.
Therefore, this commit return ENOENT early before sending the message
Adam C. Emerson [Tue, 8 Feb 2022 18:47:49 +0000 (13:47 -0500)]
rgw: Fix data race in ChangeStatus
Fixes: https://tracker.ceph.com/issues/54208 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 27f5ba9e5f649d8767c8ab44d56404e0186f6fc1) Fixes: https://tracker.ceph.com/issues/54491 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Fri, 8 Jul 2022 18:58:16 +0000 (14:58 -0400)]
rgw: Guard against malformed bucket URLs
Misplaced colons can result in radosgw thinking is has a bucket URL
but with no bucket name, leading to a crash later on.
Fixes: https://tracker.ceph.com/issues/55765 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
(cherry picked from commit 3ee9a3b41a289a926fed8b8927ca2a93b4f120a6) Fixes: https://tracker.ceph.com/issues/56586 Signed-off-by: Adam C. Emerson <aemerson@redhat.com>