librbd/cache/pwl: fix deadlock when pwl initialization fails
When pwl initialization fails, AbstractWriteLog releases itself
from inside the callback. The callback holds the guard lock and then
tries to take the same lock to delete data, which causes a deadlock.
This PR fixes it by releasing image_cache outside the callback
function.
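A minimal sketch of the pattern with hypothetical names (the real
AbstractWriteLog teardown is more involved): a non-recursive mutex is
taken once around the callback and again by the destructor.

  #include <memory>
  #include <mutex>

  struct CacheSketch {
    std::mutex m_lock;
    ~CacheSketch() {
      std::lock_guard<std::mutex> locker(m_lock);  // frees cached data under the lock
    }
  };

  // Init-failure callback, invoked with the guard lock held.
  void handle_init(int r, std::unique_ptr<CacheSketch>& image_cache) {
    bool failed;
    {
      std::lock_guard<std::mutex> locker(image_cache->m_lock);
      failed = (r < 0);
      // BUG: image_cache.reset() here would run ~CacheSketch(), which
      // blocks taking m_lock a second time -> deadlock.
    }
    if (failed) {
      image_cache.reset();  // FIX: release outside the guarded section
    }
  }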
Jianpeng Ma [Tue, 31 Aug 2021 01:02:56 +0000 (09:02 +0800)]
librbd/cache/pwl: fix the calculation of m_bytes_allocated when reloading entries
Currently, existing entries are loaded after a restart and
m_bytes_allocated is calculated from those entries. There are
the following problems:
1: The allocation for write-same is not counted, for either the rwl
or the ssd cache.
2: For the ssd cache, the size of the log entry itself is not
included and data alignment is not considered. This undercounts the
allocation, so more is allocated later, which can overwrite data that
has not yet been flushed to the osd and lose it.
The calculation methods for ssd and rwl differ, so add a new API
allocated_and_cached_data() so each mode can implement its own method.
For the SSD cache, we directly use m_first_valid_entry and
m_first_free_entry to calculate m_bytes_allocated.
trivial fix: new code from PR, nothing unrelated: https://www.diffchecker.com/S1eXatpM
Fixes: https://tracker.ceph.com/issues/52341 Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
(cherry picked from commit a96ca93d69d5c1f302f3141082302d4699915397)
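A minimal sketch of the SSD-side idea, assuming a byte-addressed ring
buffer whose data region starts at a fixed offset (names and layout
are illustrative, not the actual WriteLog code):

  #include <cstdint>

  // Bytes allocated between the valid and free pointers, handling
  // wrap-around of the ring.
  uint64_t bytes_allocated(uint64_t first_valid_entry,  // oldest live data
                           uint64_t first_free_entry,   // next write position
                           uint64_t pool_size,          // ring size in bytes
                           uint64_t data_offset) {      // start of data region
    if (first_free_entry >= first_valid_entry) {
      return first_free_entry - first_valid_entry;
    }
    // the free pointer wrapped past the end of the ring
    return (pool_size - first_valid_entry) + (first_free_entry - data_offset);
  }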
Jianpeng Ma [Fri, 20 Aug 2021 06:29:37 +0000 (14:29 +0800)]
librbd/cache/pwl/ssd: fix first_valid_entry calculation in retire_entries()
Consider one control_block that contains multiple encoded WriteLogCacheEntry records:
Log1: WriteLogEntry
Log2: WriteLogEntry
Log3: Non-WriteLogEntry
For this case, the current calculation is control_block_pos +
sizeof(control_block). In fact it should be control_block_pos +
sizeof(control_block) + data_length(Log1 + Log2).
The wrong first_valid_entry is persisted to the superblock and read
back on restart. Reads then start from the wrong position, and
decode(WriteLogCacheEntry) reports a bug.
Fixes: https://tracker.ceph.com/issues/52323 Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
(cherry picked from commit 2d337fb122d147e32d027d1e7211cd4156a5b72b)
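A minimal sketch of the corrected advance (illustrative names): the
next first_valid_entry must skip both the control block and the data
payloads of the write entries it covers.

  #include <cstdint>

  uint64_t next_first_valid_entry(uint64_t control_block_pos,
                                  uint64_t control_block_size,
                                  uint64_t write_data_bytes) {  // Log1 + Log2 data
    // old (buggy) result: control_block_pos + control_block_size
    return control_block_pos + control_block_size + write_data_bytes;
  }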
librbd/cache/pwl/ssd: solve competition between read and retire
SSD reads are not like rwl's: SSD needs aio reads. Therefore,
we cannot guarantee that the data will not be retired in the
window between sending the read request to the SSD and receiving
the data into memory, which may cause the corresponding data on
the SSD to be overwritten.
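One way to close that window, sketched with assumed names (the actual
patch may differ): pin a log entry while its aio read is in flight so
retire skips it.

  #include <atomic>
  #include <memory>

  struct LogEntrySketch {
    std::atomic<int> reads_in_flight{0};
  };

  void start_aio_read(const std::shared_ptr<LogEntrySketch>& entry) {
    entry->reads_in_flight++;  // pin before the aio is submitted
    // ... submit the bdev aio read; its completion, possibly much
    // later, does: entry->reads_in_flight--;
  }

  bool can_retire(const LogEntrySketch& entry) {
    return entry.reads_in_flight.load() == 0;  // retire skips pinned entries
  }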
librbd/cache/pwl/rwl: fix buf_persist and add writeback_lat perf counters
Initialize buf_persist_time, then rename buf_persist_time to
buf_persist_start_time and rename flush to internal_flush. Add
writeback_lat perf counters. Update some print formats for perf.
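A minimal sketch of what the renamed field conveys, using std::chrono
in place of Ceph's internal clocks (illustrative only): the
_start_time suffix signals that the latency is taken at completion.

  #include <chrono>

  struct RequestSketch {
    std::chrono::steady_clock::time_point buf_persist_start_time;
  };

  void on_persist_start(RequestSketch& req) {
    req.buf_persist_start_time = std::chrono::steady_clock::now();
  }

  std::chrono::nanoseconds on_persist_complete(const RequestSketch& req) {
    // elapsed time, fed to a counter such as buf_persist or writeback_lat
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - req.buf_persist_start_time);
  }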
librbd/cache/pwl: avoid stack overflow caused by nested shared_ptr destruction
Destruction of nested shared_ptrs can overflow the stack, because
each node's destructor recursively destroys the next node. With the
explicit assignment of nullptr, the deleted node is completely
disconnected from the current linked list, so the chain is torn
down one node at a time.
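A standalone illustration with a hypothetical Node type: each
shared_ptr destructor recursively destroys the next node, one stack
frame per node, unless the links are broken explicitly.

  #include <memory>
  #include <utility>

  struct Node {
    std::shared_ptr<Node> next;
  };

  int main() {
    auto head = std::make_shared<Node>();
    auto tail = head;
    for (int i = 0; i < 1000000; ++i) {
      tail->next = std::make_shared<Node>();
      tail = tail->next;
    }
    tail.reset();
    // head.reset();  // recursive ~Node chain -> stack overflow

    // Fix mirrors the commit: disconnect each node with an explicit
    // nullptr assignment so destruction proceeds one node at a time.
    while (head) {
      head = std::exchange(head->next, nullptr);
    }
    return 0;
  }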
librbd/cache/pwl/ssd: fix use-after-free on C_BlockIORequest
In the setup_schedule_append() function, the first expression can
cause req to be deleted, making any subsequent use of the variable
req an illegal operation. And because of the delete, req->m_image_ctx
is gone, which leads to a segfault in AbstractWriteLog::get_context().
So pass `req` into the `schedule_append()` function.
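A minimal sketch of the hazard with illustrative types (not the real
C_BlockIORequest): once the first call can free req, the later
dereference is a use-after-free; passing req through keeps its last
use before the handoff.

  #include <vector>

  struct RequestSketch {
    std::vector<int> log_entries;
  };

  void hand_off(RequestSketch* req) { delete req; }   // may free req
  void schedule_append(std::vector<int>& entries) {}  // old shape
  void schedule_append(RequestSketch* req) {}         // fixed shape

  void setup_buggy(RequestSketch* req) {
    hand_off(req);
    schedule_append(req->log_entries);  // use-after-free: req is gone
  }

  void setup_fixed(RequestSketch* req) {
    schedule_append(req);  // req travels with the call; no later deref
  }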
In append_op_log_entries(), new_first_free_entry is read after
append_ops() returns. This can result in accessing freed memory
because all I/Os may complete and append_ctx callback may run
by the time new_first_free_entry is read. Garbage value gets
written to m_first_free_entry and depending on the circumstances
it may allow AbstractWriteLog code to accept more dirty user data
than we have space for. Luckily we usually crash before then.
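A minimal sketch of the safe ordering (assumed shapes, not the
literal patch): capture the value before the I/O is issued, since the
completion may free the state the old code read afterwards.

  #include <cstdint>

  struct AppendStateSketch {
    uint64_t new_first_free_entry;
  };

  // Stand-in for issuing the appends; the completion may free `state`
  // before the caller's next statement runs.
  void issue_async_appends(AppendStateSketch* state) { delete state; }
  void publish_first_free(uint64_t first_free) { (void)first_free; }

  void append_op_log_entries_sketch(AppendStateSketch* state) {
    uint64_t first_free = state->new_first_free_entry;  // read before I/O
    issue_async_appends(state);      // all I/Os may complete right here
    publish_first_free(first_free);  // safe: uses the captured copy
  }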
Ilya Dryomov [Sat, 14 Aug 2021 17:06:28 +0000 (19:06 +0200)]
librbd/cache/pwl: make pool size a multiple of 1M
In ssd mode, we need it to be a multiple of bdev block size.
Instead of munging it after opening the bdev in ssd/WriteLog.cc, let's
impose a common restriction and round rbd_persistent_cache_size down to
a 1M boundary.
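A minimal sketch of the rounding in plain bit math (Ceph has its own
alignment helpers):

  #include <cstdint>

  constexpr uint64_t MiB = 1ULL << 20;

  uint64_t effective_pool_size(uint64_t rbd_persistent_cache_size) {
    return rbd_persistent_cache_size & ~(MiB - 1);  // round down to 1M
  }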
WriteLogCacheEntry gets appended to persist_log_entries before
write_data_pos is updated with the actual media offset. Because
push_back() makes a copy, the updated write_data_pos value never
makes it to media, making recovery impossible.
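A minimal illustration of the copy semantics with a simplified entry
type: the field must be set before the entry is appended, because
push_back() stores a copy.

  #include <cstdint>
  #include <vector>

  struct EntrySketch {
    uint64_t write_data_pos = 0;
  };

  void persist_buggy(std::vector<EntrySketch>& persist_log_entries,
                     EntrySketch& entry, uint64_t media_offset) {
    persist_log_entries.push_back(entry);  // copy taken here...
    entry.write_data_pos = media_offset;   // ...so the persisted copy keeps 0
  }

  void persist_fixed(std::vector<EntrySketch>& persist_log_entries,
                     EntrySketch& entry, uint64_t media_offset) {
    entry.write_data_pos = media_offset;   // update first
    persist_log_entries.push_back(entry);  // the copy carries the real offset
  }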
Ilya Dryomov [Thu, 13 May 2021 11:11:57 +0000 (13:11 +0200)]
librbd/cache/pwl/ssd: actually use first_{valid,free}_entry on recovery
first_valid_entry and first_free_entry pointers are read from media
but not actually used: both m_first_valid_entry and m_first_free_entry
get assigned 0 (or garbage). next_log_pos gets the same value as well
meaning that not only no recovery is attempted but the cache also gets
corrupted because DATA_RING_BUFFER_OFFSET is not applied.
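A minimal sketch of the intended recovery path (field names follow
the commit; the offset value is an assumption):

  #include <cstdint>

  constexpr uint64_t DATA_RING_BUFFER_OFFSET = 8192;  // assumed value

  struct SuperblockSketch {
    uint64_t first_valid_entry;
    uint64_t first_free_entry;
  };

  void load_existing(const SuperblockSketch& sb,
                     uint64_t& m_first_valid_entry,
                     uint64_t& m_first_free_entry,
                     uint64_t& next_log_pos) {
    m_first_valid_entry = sb.first_valid_entry;  // no longer left 0/garbage
    m_first_free_entry = sb.first_free_entry;
    // resume at the valid data, which already sits past the ring's
    // DATA_RING_BUFFER_OFFSET
    next_log_pos = m_first_valid_entry;
  }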
Ilya Dryomov [Sat, 8 May 2021 08:24:37 +0000 (10:24 +0200)]
librbd/cache/pwl/ssd: don't count log entries
In ssd mode log entries are variable size. Attempting to count and
impose watermarks on the number of log entries is bogus because the
total number of entries it would take to fill the cache to capacity
is also variable and can't be precisely estimated.
had conflicts, but no new changes
Fixes: https://tracker.ceph.com/issues/50669 Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit ea65553b4a9ee1349c6da8452d861afe579e99e9)
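A minimal sketch of the byte-based alternative (the watermark
fraction is assumed, for illustration): compare bytes against
capacity instead of counting variable-size entries.

  #include <cstdint>

  bool over_retire_high_water(uint64_t bytes_allocated,
                              uint64_t bytes_allocated_cap) {
    return bytes_allocated > bytes_allocated_cap / 2;  // assumed 50% mark
  }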
All parameters are integers and none of them are (in-)out, so don't
take them by reference. Additionally num_lanes, num_log_entries and
num_unpublished_reserves don't need to be 64-bit as their respective
fields in AbstractWriteLog are 32-bit.
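A sketch of the signature cleanup this describes (the function name
and exact parameter list are assumptions):

  #include <cstdint>

  // before: void reserve_resources(uint64_t& num_lanes,
  //                                uint64_t& num_log_entries,
  //                                uint64_t& num_unpublished_reserves);
  void reserve_resources(uint32_t num_lanes,            // by value,
                         uint32_t num_log_entries,      // 32-bit to match
                         uint32_t num_unpublished_reserves) {  // the fields
    // ... reserve lanes, log entries and unpublished reserves ...
  }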
Ilya Dryomov [Wed, 12 May 2021 10:19:07 +0000 (12:19 +0200)]
librbd/cache/pwl: rename m_log_pool_config_size to m_log_pool_size
trivial fix: no new changes: https://www.diffchecker.com/9btXJhCC Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
(cherry picked from commit 829ef952d2e408fe3676b38e7ecd26cbb04571a5)
librbd/cache/pwl/ssd/WriteLog: don't crash on split log entries
write_log_entries() will split a log entry at the end of the log, the
remainder is written to the beginning at DATA_RING_BUFFER_OFFSET. On
the read side aio_read_data_block() doesn't handle this case and just
crashes. Unless the workload in use is <= 4K, the image is rendered
unusable sooner or later.
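A minimal sketch of the read-side handling (ring layout and constant
are assumptions): if the block was split at the end of the ring, the
remainder lives at DATA_RING_BUFFER_OFFSET.

  #include <algorithm>
  #include <cstdint>
  #include <functional>

  constexpr uint64_t DATA_RING_BUFFER_OFFSET = 8192;  // assumed value

  void read_data_block(uint64_t pool_size, uint64_t pos, uint64_t len,
                       const std::function<void(uint64_t, uint64_t)>& bdev_read) {
    uint64_t first_chunk = std::min(len, pool_size - pos);
    bdev_read(pos, first_chunk);
    if (first_chunk < len) {
      // the writer split this entry: wrap to the start of the data ring
      bdev_read(DATA_RING_BUFFER_OFFSET, len - first_chunk);
    }
  }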
librbd/cache/pwl: use m_bytes_allocated_cap for both rwl and ssd
Follow rwl mode and use AbstractWriteLog::m_bytes_allocated_cap
instead of m_log_pool_ring_buffer_size specific to ssd. This fixes
"bytes available" calculation in STATS output.
librbd/cache/pwl/ssd/WriteLog: decrement m_bytes_allocated when retiring
Currently if ssd cache is filled to capacity, all future I/O hangs
indefinitely because even though the cache eventually becomes clean
and retires enough entries to get back under RETIRE_HIGH_WATER, this
isn't communicated to AbstractWriteLog::check_allocation().
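A minimal sketch of the accounting this fixes (illustrative
structure): retiring must give the bytes back so check_allocation()
can admit new writes again.

  #include <cstdint>

  struct WriteLogSketch {
    uint64_t m_bytes_allocated = 0;
    uint64_t m_bytes_allocated_cap = 0;

    void retire_entries(uint64_t freed_bytes) {
      m_bytes_allocated -= freed_bytes;  // the decrement ssd mode lacked
      // ...then wake up writers blocked in check_allocation()
    }

    bool check_allocation(uint64_t bytes) const {
      return m_bytes_allocated + bytes <= m_bytes_allocated_cap;
    }
  };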
Kefu Chai [Sat, 30 Oct 2021 03:18:17 +0000 (11:18 +0800)]
admin/doc-requirements.txt: pin Sphinx at 3.5.4
* pin Sphinx at 3.5.4
* pin docutils at 0.18
At least the combination of these two versions is known to build
the docs.
This addresses the bug reported at
https://sourceforge.net/p/docutils/bugs/431/
The backtrace looks like:
/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/sphinx/util/docutils.py:285:
RemovedInSphinx30Warning: function based directive support is now
deprecated. Use class based directive instead.
warnings.warn('function based directive support is now deprecated. '
Exception occurred:
File
"/home/jenkins-build/build/workspace/ceph-pr-docs/build-doc/virtualenv/lib/python3.8/site-packages/docutils/writers/html5_polyglot/__init__.py",
line 445, in section_title_tags
if (ids and self.settings.section_self_link
AttributeError: 'Values' object has no attribute 'section_self_link'
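For reference, the pins described above would look like this in
admin/doc-requirements.txt (a sketch assuming plain pip requirement
syntax; the real file pins other packages too):

  Sphinx==3.5.4
  docutils==0.18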
Nathan Cutler [Wed, 20 Oct 2021 10:51:02 +0000 (12:51 +0200)]
rgw/tracing: unify SO version numbers within librgw2 package
The librgw2 package contains several SO files. Two of those - librgw_op_tp.so
and librgw_rados_tp.so - had a different version number than the main librgw.
This was a violation of the openSUSE Shared Library Packaging Policy [1] but it
also seems like a "violation" of common sense.
* APIVersion:
* Moved to a separate file
* Added doctests
* Added sentinel values:
* DEFAULT = 1.0
* EXPERIMENTAL = 0.1
* NONE = 0.0
* Added to_mime_type() helper method
* Controllers.__init__:
* Added type hints
* Replaced string versions with APIVersions
* Feedback controller:
* Replaced with EXPERIMENTAL (probably it should be NONE)
Fixes: https://tracker.ceph.com/issues/52480 Signed-off-by: Ernesto Puerta <epuertat@redhat.com>
Conflicts:
src/pybind/mgr/dashboard/controllers/__init__.py
- Remove the current changes and keep the incoming new changes
src/pybind/mgr/dashboard/controllers/crush_rule.py
- Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/controllers/docs.py
- Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/controllers/feedback.py
- Deleted the file since feedback module isn't backported to pacific
src/pybind/mgr/dashboard/controllers/host.py
- Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/openapi.yaml
- Generated a new openapi yaml file
src/pybind/mgr/dashboard/tests/__init__.py
- Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/tests/test_docs.py
- Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/tests/test_host.py
- Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/tests/test_tools.py
- Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/tests/test_versioning.py
- Changes related to the versioning like importing the APIVersion
src/pybind/mgr/dashboard/controllers/crush_rule.py
- Removed the MethodMap decorator, which updates the version of the
endpoint to 2.0, because the changes that caused that version bump
were not backported to pacific
Patrick Donnelly [Tue, 14 Sep 2021 17:02:12 +0000 (13:02 -0400)]
test/libcephfs: put inodes after lookup
Otherwise, the client umount will hang due to inability to trim the
inodes looked up using the low-level interface. This results in slow-op
warnings and an eviction:
2021-09-11T17:23:31.097+0000 7f99c3522700 0 log_channel(cluster) log [WRN] : evicting unresponsive client smithi176 (9756), after 303.924 seconds
2021-09-11T17:23:31.097+0000 7f99c3522700 10 mds.0.server autoclosing stale session client.9756 172.21.15.176:0/3891214934 last renewed caps 303.924s ago
mgr/dashboard: make modified API endpoints backward compatible
Fixes: https://tracker.ceph.com/issues/52480 Signed-off-by: Avan Thakkar <athakkar@redhat.com>
Introducing the APIVersion class to handle versioning for API
endpoints and making them backward compatible.
The test is failing on deleting a host because the agent daemon is
present on that host. It's not possible to simply delete a host; we
need to drain it first and then delete it.
where the numbers of scrubbed objects, clones, dirty and omap
objects are always less than the corresponding totals, if the PG
contains object(s) whose hash happens to be 0xffffffff.
In this change, if the calculated hash of the upper bound is greater
than the maximum number representable by uint32_t, then in addition
to setting the hash of the upper-bound hobj to 0xffffffff, we also
set the nspace of the upper-bound hobj to "\xff", so that the upper
bound is greater than an hobj whose hash happens to be 0xffffffff.
Please note, the nspace "\xff" is not an ASCII string, so it is not
likely to compare less than a real-world nspace of an hobj.
With this new *greater* upper bound, we are able to include the
previously missing hobj when listing the objects in a PG, so scrub
is no longer annoyed when the number of objects does not match.
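A minimal sketch of the upper-bound adjustment with a simplified
hobject type: clamp the hash and push the namespace past any real one.

  #include <cstdint>
  #include <string>

  struct HObjectSketch {
    uint32_t hash = 0;
    std::string nspace;
  };

  void set_scrub_upper_bound(uint64_t calculated_hash, HObjectSketch& upper) {
    if (calculated_hash > UINT32_MAX) {
      upper.hash = 0xffffffff;
      upper.nspace = "\xff";  // non-ASCII, sorts after real namespaces
    } else {
      upper.hash = static_cast<uint32_t>(calculated_hash);
    }
  }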
Mykola Golub [Mon, 30 Aug 2021 06:58:04 +0000 (07:58 +0100)]
osd: re-cache peer_bytes on every peering state activate
peer_bytes is used for the backfill reservation request and may be
reset if backfill is interrupted; we want it set back before
continuing backfill and re-sending the reservation request.