Lucian Petrut [Fri, 13 Nov 2020 12:18:32 +0000 (12:18 +0000)]
compat,msg: improve Windows socket checks
win_socketpair can fail with EADDRINUSE under load, which will
lead to an unhandled exception/crash as per this commit [1].
This change adds a retry, also ensuring that the right error code
gets propagated (the one returned by WSAGetLastError() instead of
the generic SOCKET_ERROR).
While at it, we're fixing the "win_socketpair" indentation and
addressing the SOCKET to int casts.
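The retry described above could be sketched roughly as follows. This is a minimal illustration in Python rather than the actual C++ change; `create_with_retry`, its `create` callback, and the default retry count are hypothetical names chosen for the example.

```python
import errno


def create_with_retry(create, retries=3):
    """Call create() again while it fails with EADDRINUSE.

    `create` is a hypothetical callable standing in for the socket pair
    setup. Any other error, or the last EADDRINUSE, is re-raised unchanged
    so the caller sees the real error code.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return create()
        except OSError as e:
            # Only "address in use" is treated as transient.
            if e.errno != errno.EADDRINUSE or attempt == retries - 1:
                raise
```

The point of re-raising the original exception is the same as in the commit: the caller gets the specific error rather than a generic failure value.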
This change will allow mapping rbd images on Windows, leveraging the
WNBD[1] Virtual Storport Miniport driver [2].
The behavior and CLI are similar to the Linux rbd-nbd, with a few
notable differences:
* device paths cannot be requested. The disk number and path will
be picked by Windows. If a device path is provided by the user
when mapping an image, it will be used as an identifier, which
can also be used when unmapping the image.
* the "show" command was added, which describes a specific mapping.
This can be used for retrieving the disk path.
* the "service" command was added, allowing rbd-wnbd to run as a
Windows service. All mappings are currently persistent, being
recreated when the service starts, unless explicitly unmapped.
The service disconnects the mappings when being stopped.
* the "list" command also includes a "status" column.
The purpose of the "service" mode is to ensure that mappings survive
reboots and that the Windows service start order can be adjusted so
that rbd images can be mapped before starting services that may depend
on them, such as VMMS.
The mapped images can either be consumed by the host directly or exposed
to Hyper-V VMs.
While at it, we'll skip building rbd-mirror as it's quite unlikely that
this daemon is going to be used on Windows for now.
Lucian Petrut [Tue, 23 Jun 2020 09:00:54 +0000 (09:00 +0000)]
build: disable stack protection on Windows
Passing "-fstack-protector-strong" doesn't seem to work with Mingw,
complaining about undefined "__stack_chk_fail". For this reason,
we'll disable it for now.
Lucian Petrut [Wed, 6 May 2020 10:59:39 +0000 (10:59 +0000)]
rbd: fix import image path parsing on Windows
When importing an image, the rbd command uses only the file name
and expects "/" to be used as a separator. On Windows, it will
use the entire path as image name since the path separator is not
the same.
This change updates it so that the "\\" path separator can be
properly handled as well.
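A minimal sketch of the idea, in Python rather than the rbd C++ code: take everything after the last separator, whichever of the two it is. The function name `image_name_from_path` is hypothetical.

```python
def image_name_from_path(path):
    """Return the last path component.

    Both the forward slash and the backslash are treated as separators,
    so Windows-style and POSIX-style paths yield the same image name.
    """
    # rfind returns -1 when a separator is absent, so cut + 1 is 0 and
    # the whole string is returned for a bare file name.
    cut = max(path.rfind("/"), path.rfind("\\"))
    return path[cut + 1:]
```

For instance, `image_name_from_path("C:\\images\\disk.img")` and `image_name_from_path("/tmp/disk.img")` both give `disk.img`.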
dlfcn_win32.cc provides a function converting Windows error codes
to string error messages. We'll move it to the common errno modules
so that it can easily be reused.
At the same time, we're adding a function that converts
errno values to NTSTATUS codes.
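As an illustration only, such a conversion can be thought of as a lookup table with a generic fallback. The sketch below is in Python, and the table is a hypothetical subset, not the mapping the commit adds; the NTSTATUS constants are the standard Windows values.

```python
import errno

# Standard NTSTATUS values (from the Windows headers).
STATUS_SUCCESS = 0x00000000
STATUS_UNSUCCESSFUL = 0xC0000001
STATUS_INVALID_PARAMETER = 0xC000000D
STATUS_NO_MEMORY = 0xC0000017
STATUS_ACCESS_DENIED = 0xC0000022

# Hypothetical, deliberately small mapping used for illustration.
_ERRNO_TO_NTSTATUS = {
    0: STATUS_SUCCESS,
    errno.EINVAL: STATUS_INVALID_PARAMETER,
    errno.ENOMEM: STATUS_NO_MEMORY,
    errno.EACCES: STATUS_ACCESS_DENIED,
}


def errno_to_ntstatus(err):
    """Map an errno value (positive or negated) to an NTSTATUS code."""
    # Unknown values fall back to the generic failure status.
    return _ERRNO_TO_NTSTATUS.get(abs(err), STATUS_UNSUCCESSFUL)
```

Accepting negated values is an assumption made here because return codes are often passed around as `-errno`.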
Xuehan Xu [Sat, 31 Oct 2020 11:53:12 +0000 (19:53 +0800)]
crimson/osd: make PglogBasedRecovery op take recovering objs triggered elsewhere into account
PGRecovery::start_recovery_ops() should wait for all inflight recovery ops, whether they are
started by BackgroundRecovery or not, otherwise there may be circumstances in which BackgroundRecovery
keeps recursively invoking its do_recovery when start_recovery_ops returns recovery done while there are
still missing objects.
Zac Dover [Sun, 4 Oct 2020 20:28:51 +0000 (06:28 +1000)]
doc/rados: ceph df output update
This commit updates the "ceph df" output
so that it is current as of October 2020.
-Add correctly formatted `ceph df` output.
-Add explanation of "DIRTY" column.
-(DATA) remains to be defined (1 instance)
-(OMAP) remains to be defined (1 instance)
-USED remains to be defined (1 instance)
-Update prompts in "Checking OSD Status"
The ceph-volume lvm batch --auto introduced by [1] breaks the backward
compatibility when using non rotational devices only (SSD and/or NVMe).
Those devices are reassigned as bluestore db or filestore journal
devices while we want them as data devices.
so we need to pass -fPIC by ourselves. otherwise we'd have
/usr/bin/ld: ../../liburing/src/liburing.a(setup.ol): relocation R_X86_64_PC32 against symbol `io_uring_queue_mmap' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
src/test/fio/CMakeFiles/fio_ceph_objectstore.dir/build.make:154: recipe for target 'lib/libfio_ceph_objectstore.so' failed
Kefu Chai [Mon, 9 Nov 2020 07:34:55 +0000 (15:34 +0800)]
cmake: use make explicitly to build fio
we cannot assume that user uses "make" as the generator of cmake, if,
for instance, ninja is used, `$(MAKE)` is not a valid variable in the
generated `build.ninja`. so we should use "make" explicitly.
Kefu Chai [Mon, 9 Nov 2020 06:34:00 +0000 (14:34 +0800)]
crimson/os: do not configure seastar allocator for alien threads
4cd2b00d2a703510777bd761609be221859bd790 allows us to colocate seastar
allocator used by seastar reactors and libc allocator used by alien threads,
so there is no need to configure seastar allocator for alien threads
anymore.
Greg Farnum [Mon, 2 Nov 2020 08:14:48 +0000 (08:14 +0000)]
mon: retain disallowed leader list on restart
We were only setting this when new monmaps were read from paxos -- whoops!
Pull apart that mechanism a little bit and make sure to set them before
doing elections, as part of bootstrap.
Greg Farnum [Thu, 29 Oct 2020 06:10:23 +0000 (06:10 +0000)]
mon: Output the real leader in ::_quorum_status() and get_leader_name()
These functions previously assumed the first mon in the quorum
was the leader. That isn't accurate if the first monitor is
disallowed or it lost a connectivity-mode election, though.
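One way the first quorum member and the leader can differ is sketched below. This assumes a much simplified lowest-rank election with a disallowed list and ignores connectivity mode entirely; it is not the monitor's election code, and both function names are hypothetical.

```python
def first_in_quorum(quorum):
    """The old assumption: the lowest-ranked quorum member leads."""
    return min(quorum)


def elected_leader(quorum, disallowed):
    """Simplified election: lowest-ranked member that may lead."""
    candidates = [rank for rank in sorted(quorum) if rank not in disallowed]
    if not candidates:
        raise ValueError("no monitor in the quorum is allowed to lead")
    return candidates[0]
```

With a quorum of ranks 0, 1 and 2 and rank 0 disallowed, `first_in_quorum` still reports 0 while `elected_leader` reports 1, which is the kind of mismatch the commit addresses.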
Greg Farnum [Thu, 29 Oct 2020 05:58:56 +0000 (05:58 +0000)]
mon: Do not increase compatv when using monitor location or stretch mode
For mon_info_t, I first wrote things so that when monitors get a location
added in MonMap::mon_info_t, I bumped the struct_v to 5 and also bumped
the min_compat to 5. This made sure that nobody could decode the
struct and lose the location info, which if it were a monitor
would be very bad.
And for the MonMap, when stretch mode is enabled I bumped up the
compatv (in addition to the always-increased struct_v), for the same reason.
But clients also have to decode these structures, and we can't
disallow older clients from connecting to a stretched cluster.
Happily, usage of any stretch modes already requires a feature
bit and sets it as required in the monmap, so these are already
gated. Therefore, just don't set new compat values in these cases.
While at it, also gate setting the location on the monmap indicating
all monitors are updated.
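The effect of bumping the compat version can be shown with a toy model. The sketch below is Python and does not reflect the real binary encoding; the dictionary layout, `decode`, and `IncompatibleEncoding` are all hypothetical. Each field records the struct version that introduced it, and a decoder refuses any payload whose compat value is newer than the version it understands.

```python
class IncompatibleEncoding(Exception):
    """Raised when the decoder is too old for the payload's compat value."""


def decode(payload, understood_v):
    """Decode a toy versioned payload with a decoder at `understood_v`."""
    if payload["compat"] > understood_v:
        raise IncompatibleEncoding(
            "need v%d, decoder understands v%d"
            % (payload["compat"], understood_v)
        )
    # Fields newer than the decoder are silently skipped.
    return {
        name: value
        for name, (since, value) in payload["fields"].items()
        if since <= understood_v
    }


def make_mon_info(compat):
    """Build a toy struct at version 5 with the given compat value."""
    return {
        "struct_v": 5,
        "compat": compat,
        "fields": {"name": (1, "a"), "location": (5, "dc1")},
    }
```

A decoder at version 4 reads `make_mon_info(compat=1)` and simply drops `location`, but fails outright on `make_mon_info(compat=5)`. That failure is the protection the bumped value originally provided for monitors, and also what would lock out older clients.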
Kefu Chai [Thu, 5 Nov 2020 16:16:45 +0000 (00:16 +0800)]
cmake: set GIT_SHALLOW and UPDATE_DISCONNECTED
* GIT_SHALLOW=TRUE, so we don't pull the full git history,
as we don't care about it.
* UPDATE_DISCONNECTED=TRUE, to skip the UPDATE step, this change
somehow works around
https://gitlab.kitware.com/cmake/cmake/-/issues/19703. otherwise
cmake keeps building liburing.
Kefu Chai [Thu, 5 Nov 2020 16:37:32 +0000 (00:37 +0800)]
blk/kernel/io_uring: bump liburing to v0.7
* use functions exposed by liburing instead of using syscalls
* v0.7 is the latest release at the time of writing, as liburing is under
active development. it'd be better to use a newer release.
* also use https://git.kernel.dk/liburing instead of
http://git.kernel.dk/liburing.
Matt Benjamin [Wed, 4 Nov 2020 17:02:27 +0000 (12:02 -0500)]
rgw_file: fix some zipper flow for RGWLibContinuedReq
Some bits of the standard Zipper conversions were missed for
the RGWLibContinuedReq case, where the setup is encapsulated in
the request, but execution is broken up in to steps. This
currently affects only RGWWriteRequest.
Fixes: https://tracker.ceph.com/issues/48136
Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
Kefu Chai [Thu, 5 Nov 2020 06:27:35 +0000 (14:27 +0800)]
cmake: rename crimson tests named like foo_bar to foo-bar
for three reasons:
* less typing: no need to press "shift" for inputting "_"
* more consistent with executable names like "ceph-conf"
* simpler to grep when compiling the tests. there is a chance
we need to kill the dead jobs on a jenkins worker node
where it happens to be compiling the tests.