Douglas Fuller [Thu, 7 Mar 2019 15:15:33 +0000 (10:15 -0500)]
cephfs: store fs names associated with local MDSMap in MDSRank
Modify MMDSMap to include the name of the file system from the FSMap
when bringing up a new MDS. Store the name in the MDSRank structure for
use in security validation.
Fixes: http://tracker.ceph.com/issues/15070 Signed-off-by: Douglas Fuller <dfuller@redhat.com> Signed-off-by: Rishabh Dave <ridave@redhat.com>
Douglas Fuller [Fri, 1 Mar 2019 16:41:45 +0000 (11:41 -0500)]
mon/MDSMonitor: add mon auth caps for CephFS names
Add a 'fsname' clause to mon auth caps to restrict a client's view
of the FSMap. Example:
mon 'allow rw fsname=cephfs2'
This would restrict the client's view of the FSMap to the MDSMap for
cephfs2. Any MDS allocated to a different filesystem will be invisible.
Global standby daemons are always visible. To allow multiple CephFSs,
add multiple caps:
before this change, unittest_seastar_config set the specified option to
the shard id in parallel and expects that the values of the option are
consistent across all shards after setting the value on all shards. but
sharded::invoke_on_all() is executed in parallel, and so does
ConfigProxy::do_change(), so there are actually "n x n" continuations
racing each other. the order of the continuations storing the setting
cannot be determined. so we cannot expected that the last batch of
continuations hitting shareds store the same value.
in this change, only a single `conf.set_val()` call is performed with
a known value. and this value is checked with the values stored on all
shards.
crimson/osd: print notify_id instead of its address
this addresses the FTBFS like
/usr/include/fmt/core.h:902:19: error: static assertion failed: formatting of non-void pointers is disallowed
902 | static_assert(!sizeof(T), "formatting of non-void pointers is disallowed");
| ^~~~~~~~~~
also, i think it is more helpful in this context to have the
notify id instead of the address, as we can use the id to reference
the notification printed by other logging messages.
crimson/os: cast pointers to void* before printing them
to address the FTBFS like:
../src/crimson/os/seastore/lba_manager/btree/btree_range_pin.cc:110:14: required from here
/usr/include/fmt/core.h:902:19: error: static assertion failed: formatting of non-void pointers is disallowed
902 | static_assert(!sizeof(T), "formatting of non-void pointers is disallowed");
| ^~~~~~~~~~
crimson/os: instantiate future_type<> with future_stored_type
to accommodate the change on seastar's side, which started to use
monostate for representing `void` return type, and hide this convention
using future_stored_type_t.
crimson/net: print entity_addr_t not the in4_addr in it
for two reasons:
* we don't have operator<<(ostream&, in4_addr&) defined.
* we don't have any fmt::format() facility specialized for in4_addr
* to print entity_addr_t instead of in4_addr is more correct in
the sense that we should support IPv6
crimson/osd: use temporary RecoveryInfo for recovering object delete ops
Right now, recover_delete use WaitForObjectRecovery::pi, which may not be available
if there were no pulling, to pass recovery_info, so use temporary recovery_info instead
fmt/core.h:902:19: error: static assertion failed: formatting of non-void pointers is disallowed
902 | static_assert(!sizeof(T), "formatting of non-void pointers is disallowed");
| ^~~~~~~~~~
and i searched around, couldn't find any other
statement printing the address of message, so
i think it's not of much use to print their
addresses in a single place, unless we can cross
check them in different places.
This is an implementation of fifo queue over rados. Data is appended
to rados object until it's full. At that point data will be written to
a new object. Data is read from the tail sequentially (can be iterated
using marker). Data can be trimmed (up to and including marker).
Queue has a header object (meta), and zero or more data objects (parts).
The software has two layers: the higher level client operations side
that deals with the application apis, and manages the meta and parts,
and there’s the objclass level that deals with the rados layout of
meta and part objects. There are different objclass methods that deal
with reading and modifying each of these entities.
A single part has max possible size, however, it may become full once
a certain smaller size is reached (full_size_threshold). It is
imperative that once a part has reached its capacity, it will not
allow any more writes into it. For this reason, it is important that
data being written to the queue does not exceed max_entry_size . This
is enforced, by the higher level client api.
Written entries go to the current head object, and when it’s full, a
new head part is created. When listing entries, data is iterated from
tail to the current head. Trim can either change the pointer within
the current tail object, and if needed it removes tail objects.
A unitest has been created to test functionality.
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com> Squashed-by: Adam C. Emerson <aemerson@redhat.com>
Adam C. Emerson [Tue, 7 Jul 2020 20:39:57 +0000 (16:39 -0400)]
common/Thread: Don't store pointer to thread_name
Having Thread::create store a pointer to a string that is passed to
ceph_pthread_setname in Thread::entry_wrapper can lead to using a
pointer in the calling thread's stack that gets freed before use.
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
adds an allocator-aware version of std::make_unique(). this is similar to
std::allocate_shared(), though it's slightly less convenient because,
unlike std::shared_ptr<T>, the Deleter has to be specified as a template
parameter in std::unique_ptr<T, Deleter>
if an existing object is cached with an object version, but it's
mutated without updating that version number, clear the OBJV flag so
that later cache reads asking for an object version result in a miss and
re-read the version from the osd
common/buffer: do not deref _raw ptr when checking for hypercombind buf
_raw could be null when the buffer::ptr is std::moved away. and ASan
complains when we try to dereference a nullptr. so use a little awkward
way to calculate the address instead
crimson/osd: add "ceph tell <pgid> <command>" support
* add an abstract class of `PGCommand` for `ceph tell <pgid> <command>`
* add two sample implementations for the pg tell commands.
- "query"
- "mark_unfound_lost"
include/encoding: Fix encode/decode of float types on big-endian systems
Currently, floating-point types use "raw" encoding, which means they're
simply copied as byte stream.
This means that if the decoding happens on a machine that differs in
byte order from the source machine, the returned value will be
incorrect. As one effect of this problem, a big-endian OSD node cannot
join a cluster where the MON node is little-endian (or vice versa),
because the OSDMap (incremental) structure contains floating-point
values, and as a result of this conversion problem, the OSD node will
crash with an assertion failure as soon as it receives any OSDMap update
from the MON.
This should be fixed by always encoding floating-point values in
little-endian byte order just as is done for integers. (Note that this
still assumes source and target machines used the same floating-point
format except for byte order. But given that nearly all platforms these
days use IEEE binary32/binary64 for float/double, that seems a
reasonable assumption.)