The spec.virtual_ip parameter should be used when it's defined for the ingress daemon
When the ingress spec is built, a virtual_ip parameter is provided in the spec, and the
haproxy instance is expected to use the defined value.
The defined VIP is already configured correctly by the keepalived instance and, as per doc [1],
the ingress should be able to use this value as the entrypoint for the haproxy frontend.
This patch also introduces a basic unit test for the IngressService with the purpose of
validating the config files generated for both haproxy and keepalived.
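The expected behaviour can be sketched as follows. This is a minimal illustration, not the cephadm code: the helper name `frontend_bind_line` and the dict-based spec are hypothetical, and both the CIDR form of the VIP and the fallback to `*` when no VIP is set are assumptions.

```python
def frontend_bind_line(spec: dict) -> str:
    """Return a haproxy 'bind' line for a hypothetical ingress spec dict."""
    port = spec["frontend_port"]
    vip = spec.get("virtual_ip")
    if vip:
        # Assumption: the VIP is given in CIDR form, e.g. "192.168.20.1/24";
        # haproxy only needs the address part.
        address = vip.split("/")[0]
    else:
        # Assumption: without a VIP, listen on all addresses.
        address = "*"
    return f"bind {address}:{port}"
```

A unit test of the kind the patch describes would then compare the generated config text against the value defined in the spec.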
Adam Kupczyk [Wed, 15 Dec 2021 09:59:55 +0000 (09:59 +0000)]
os/bluestore/bluefs: Add tracking of bluefs log in noop replay mode
Keep updating the bluefs log while printing the content of the bluefs replay log.
Without this modification, only the initial content of the log is available.
Log can be printed by 'ceph-bluestore-tool bluefs-log-dump'.
Adam Kupczyk [Wed, 24 Nov 2021 17:55:05 +0000 (18:55 +0100)]
os/bluestore/bluefs: Sync BlueFS log with its allocation delta
The BlueFS log is the only file that we can append to.
When we append to a file, we must take previously committed allocations into account;
otherwise the update will be miscalculated.
Adam Kupczyk [Wed, 24 Nov 2021 17:52:35 +0000 (18:52 +0100)]
test/objectstore/bluefs_test: Add test for continuation of previous BlueFS log
Added a test that verifies that in update mode we properly pick up the delta.
The BlueFS log is the only file that can be appended to, but this is done in a very indirect way.
gal salomon [Fri, 7 May 2021 21:29:13 +0000 (00:29 +0300)]
RGW: Implement continuation, progress, stats, end s3select response
RGW/S3select: Implement output-serialization. The user may request different CSV definitions
for output (field delimiter, row delimiter, quote handling).
RGW/S3select: Implement presto-alignments. The presto application sends
queries with table aliases, case-insensitive keywords, and no semicolon at the
end of the statement.
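To illustrate what the three output-serialization settings control, here is a small sketch built on Python's standard csv module. It is not RGW's implementation; the function name `serialize_rows` and its parameters are hypothetical and only mirror the settings named above.

```python
import csv
import io


def serialize_rows(rows, field_delimiter=",", row_delimiter="\n", quote_always=False):
    """Serialize rows as CSV text with caller-chosen delimiters and quoting."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=field_delimiter,      # separates fields within a row
        lineterminator=row_delimiter,   # terminates each row
        # Quote every field, or only the fields that need it.
        quoting=csv.QUOTE_ALL if quote_always else csv.QUOTE_MINIMAL,
    )
    writer.writerows(rows)
    return buf.getvalue()
```

For example, `serialize_rows([["a", "b c"], ["1", "2"]], field_delimiter="|", row_delimiter=";")` yields `a|b c;1|2;`.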
Xuehan Xu [Fri, 17 Dec 2021 05:20:35 +0000 (13:20 +0800)]
crimson/os/seastore: reset onode in 'SeaStore::repeat_with_onode' before the transaction gets destroyed
Onodes hold references to the onode tree extents. If an onode references the root extent, that root
extent is cached in the onode tree's root_tracker, which caches onode tree roots by transaction address.
That root_tracker entry is only removed when the onode (or the corresponding "super") is destroyed.
On the other hand, two non-concurrent transactions can occupy the same address. So if an onode is destroyed
after its transaction is destroyed, there is a chance that another transaction occupying the same
address gets that not-yet-destroyed and possibly outdated onode.
BTW, since we already cache extents in transactions, we might want to drop the onode tree root_tracker later?
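The address-reuse hazard can be modelled with a toy cache. This is a sketch under stated assumptions, not SeaStore code: `RootTracker`, `run`, and the fixed address are hypothetical stand-ins for a cache keyed by transaction address.

```python
class RootTracker:
    """Toy cache keyed by a transaction's address (hypothetical model)."""

    def __init__(self):
        self._roots = {}

    def track(self, txn_addr, root):
        self._roots[txn_addr] = root

    def untrack(self, txn_addr):
        self._roots.pop(txn_addr, None)

    def lookup(self, txn_addr):
        return self._roots.get(txn_addr)


def run(reset_onode_before_txn_ends):
    tracker = RootTracker()
    addr = 0x1000  # assumption: both transactions land on this address

    # Transaction 1: an onode references the root, so the root is tracked.
    tracker.track(addr, "root-of-txn-1")
    if reset_onode_before_txn_ends:
        # The onode is reset while transaction 1 is still alive.
        tracker.untrack(addr)
    # Transaction 1 is destroyed here; its address becomes reusable.

    # Transaction 2 reuses the address and looks up its root.
    return tracker.lookup(addr)
```

With `run(False)` the second transaction sees the stale entry left by the first; with `run(True)` the lookup comes back empty, which is the effect of resetting the onode before the transaction is destroyed.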
Sage Weil [Fri, 17 Dec 2021 04:54:25 +0000 (23:54 -0500)]
Merge PR #44228 into master
* refs/pull/44228/head:
qa/suites/orch/cephadm/osds: test 'ceph cephadm osd activate'
mgr/cephadm/services/osd: skip found osds that already have daemons
mgr/cephadm: allow activation of OSDs that have previously started
Reviewed-by: Sebastian Wagner <sewagner@redhat.com>
Sage Weil [Mon, 6 Dec 2021 15:19:16 +0000 (10:19 -0500)]
mgr/cephadm: allow activation of OSDs that have previously started
When this code was introduced way back in ea987a0e56db106f7c76d11f86b3e602257f365e,
for some reason I was focused only on freshly created OSDs. The
get_osd_uuid_map() helper is used by deploy_osd_daemons_for_existing_osds()
which is called not only by OSD creation but also by 'ceph cephadm
osd activate', which is meant to instantiate daemons for existing OSD
devices (e.g., devices that were reattached to a new server, or whose
/var/lib/ceph/$fsid/osd.$id directory was lost for some other reason).
However, if we ignore OSDs with up_from > 0, then we can't recreate a
daemon instance for such existing OSDs--arguably the most important ones,
since they may hold real data.
Fixes: https://tracker.ceph.com/issues/53491
Signed-off-by: Sage Weil <sage@newdream.net>
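The effect of the up_from filter can be shown with a small sketch. It is not the actual get_osd_uuid_map() code: the function name `osd_uuid_map`, the `skip_previously_started` flag, and the entry layout (`osd`, `uuid`, `up_from` keys) are assumptions made for illustration.

```python
def osd_uuid_map(osds, skip_previously_started):
    """Map OSD id -> uuid, optionally ignoring OSDs that have started before."""
    result = {}
    for o in osds:
        # up_from > 0 is taken to mean the OSD has been up at some point.
        if skip_previously_started and o["up_from"] > 0:
            continue
        result[str(o["osd"])] = o["uuid"]
    return result
```

With the filter on, an OSD that has previously started never appears in the map, so no daemon can be recreated for it; with the filter off, both fresh and previously started OSDs are candidates for activation.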
Sage Weil [Thu, 16 Dec 2021 15:24:46 +0000 (10:24 -0500)]
mon: prevent new sessions during shutdown
From shutdown() we set STATE_SHUTDOWN and then call remove_all_sessions().
ms_handle_accept() is the only caller of add_session, so verifying that
we aren't shutting down (while under the session_map_lock) is sufficient
to prevent any new sessions from being added.
Fixes: https://tracker.ceph.com/issues/39150
Signed-off-by: Sage Weil <sage@newdream.net>
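The pattern described in this commit, checking the shutdown state under the same lock that guards the session map, might look roughly like this. The class and method names are hypothetical, and setting the shutdown flag under the session lock is an assumption of the sketch rather than something the commit message states.

```python
import threading


class SessionMap:
    """Toy session registry that refuses new sessions once shut down."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sessions = set()
        self._shutting_down = False

    def handle_accept(self, conn):
        # The only place sessions are added, so one check here is enough.
        with self._lock:
            if self._shutting_down:
                return False
            self._sessions.add(conn)
            return True

    def shutdown(self):
        with self._lock:
            self._shutting_down = True
            self._sessions.clear()  # stands in for remove_all_sessions()

    def count(self):
        with self._lock:
            return len(self._sessions)
```

Because the check and the insertion happen under one lock, a session cannot slip in between the state change and the removal of all sessions.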
Ronen Friedman [Thu, 16 Dec 2021 10:49:57 +0000 (10:49 +0000)]
crimson/osd: removing an unneeded make_unique()
The desired lifetime of the object matches the lifetime it would have
if allocated on the stack, and no ownership is transferred, so
there is no point in using a unique_ptr here.
See also Google's guidance (https://abseil.io/tips/187),
under "Common Anti-Pattern: Avoiding &".
Paul Cuzner [Fri, 12 Nov 2021 03:16:59 +0000 (16:16 +1300)]
mgr/cephadm: Add snmp-gateway service support
Add a new snmp-gateway service to provide a bridge between
Prometheus and an SNMP management platform. The gateway
service uses https://github.com/maxwo/snmp_notifier to provide
SNMP v2c and SNMP v3 support.
The SNMP v3 support mandates at least authentication, and also
offers authentication and privacy (encryption).
Fixes: https://tracker.ceph.com/issues/52920
Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
Igor Fedotov [Tue, 2 Nov 2021 12:03:39 +0000 (15:03 +0300)]
os/bluestore: avoid premature onode release.
This was observed when an onode's removal is followed by a read
and the latter causes object release before the removal is finalized.
The root cause is an improper 'pinned' state assessment in Onode::get().
In more detail:
At some point Onode::get() might face the case where nref == 2 and pinned == true,
which means a parallel, incomplete put() is running on the onode: the ref count has
been decremented but the pinned state is still unmodified (and the lock hasn't even
been acquired yet).
This can end with two puts racing over the same onode with nref == 2,
which results in a premature onode release:
// nref =3, pinned = 1
// Thread 1                    Thread 2
// o->put()                    o->get()
// --nref(n = 2, pinned=1)
//                             nref++ (n=3, pinned = 1)
//                             return
//                             ...
//                             o->put()
//                             --nref(n = 2)
//                             pinned = 0,
//                             --nref(n = 1)
//                             ocs->_unpin_and_rm(o) -> o->put()
//                             ...
//                             --nref(n = 0)
//                             release o
// o->c->get_onode_cache()
// FAULT!
//
The suggested fix is to introduce an additional atomic counter that tracks
running put() calls, and to permit onode release only when the regular
nref and put_nref are both equal to zero.
Fixes: https://tracker.ceph.com/issues/53002
Signed-off-by: Igor Fedotov <igor.fedotov@croit.io>
This should prevent omap and xattr extent allocations from clumping near
the onode's hint. Additionally, only generate them past the default
16MB object_data_handler reservation.
Neha Ojha [Tue, 7 Dec 2021 17:47:22 +0000 (17:47 +0000)]
doc/releases/pacific.rst: add core updates for 16.2.7
16.2.7 fixes https://tracker.ceph.com/issues/53062, so remove the
"big scary warning" from the top of the pacific release page. We continue
to warn about this bug under the 16.2.6 section and in
https://docs.ceph.com/en/latest/releases/pacific/#upgrading-from-octopus-or-nautilus.