git.apps.os.sepia.ceph.com Git - ceph.git/log

Yan, Zheng [Mon, 21 Jan 2019 02:08:51 +0000 (10:08 +0800)]

mds: fix infinite loop in OpTracker::check_ops_in_flight

introduced by backport commit 02faf3dc321
"mds: don't report slow request for blocked filelock request"

Fixes: http://tracker.ceph.com/issues/37977
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>

Yuri Weinstein [Fri, 18 Jan 2019 21:56:15 +0000 (13:56 -0800)]

Merge pull request #25307 from trociny/wip-37438-luminous

luminous: crushtool: add --reclassify operation to convert legacy crush maps to use device classes

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Yuri Weinstein [Fri, 18 Jan 2019 20:38:00 +0000 (12:38 -0800)]

Merge pull request #25949 from neha-ojha/wip-36686-luminous

luminous: osd/mon: pg log hard limit with upgrades fixed

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Yuri Weinstein [Fri, 18 Jan 2019 20:33:59 +0000 (12:33 -0800)]

Merge pull request #25804 from ashishkumsingh/wip-37758-luminous

luminous: mds: clean up log messages for standby-replay

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>

Yuri Weinstein [Fri, 18 Jan 2019 20:33:13 +0000 (12:33 -0800)]

Merge pull request #25904 from pdvian/wip-37829-luminous

luminous: client: fix fuse client hang because its pipe to mds is not ok

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Fri, 18 Jan 2019 20:30:04 +0000 (12:30 -0800)]

Merge pull request #26011 from batrick/i37953

luminous: qa: test_damage needs to silence MDS_READ_ONLY

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Neha Ojha [Fri, 18 Jan 2019 19:01:18 +0000 (14:01 -0500)]

doc: pglog_hardlimit flag recommendations

Signed-off-by: Neha Ojha <nojha@redhat.com>

Neha Ojha [Fri, 11 Jan 2019 23:35:50 +0000 (18:35 -0500)]

qa/suites/upgrade/jewel-x/stress-split*: require-osd-release luminous after upgrade

Signed-off-by: Neha Ojha <nojha@redhat.com>

Neha Ojha [Thu, 10 Jan 2019 02:12:17 +0000 (18:12 -0800)]

include/rados.h: hide CEPH_OSDMAP_PGLOG_HARDLIMIT from ceph -s

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit dfc292923548fe1e1b5329555dd46949feb96b99)

Neha Ojha [Fri, 11 Jan 2019 02:40:30 +0000 (21:40 -0500)]

qa/suites/upgrade/luminous-p2p-stress-split: add split scenario

This commit adds stress-split test cases to test luminous against a point
release of luminous.

Signed-off-by: Neha Ojha <nojha@redhat.com>

Neha Ojha [Fri, 11 Jan 2019 00:25:32 +0000 (19:25 -0500)]

qa/suites/upgrade/jewel-x: add pg log settings

- vary pg log lengths
- test pglog_hardlimit flag

These are qa suites changes specific to luminous.

Signed-off-by: Neha Ojha <nojha@redhat.com>

Neha Ojha [Mon, 7 Jan 2019 23:26:27 +0000 (15:26 -0800)]

mon/OSDMonitor.cc: make a note about reusing jewel feature bit

For OSD_PGLOG_HARDLIMIT, we have reused a jewel feature bit that was retired
in luminous. Therefore, we need to check the release version for
>= CEPH_RELEASE_LUMINOUS, before using it.

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 6abcc20dca0ee5a08a3fe7c560750f904fe3fa65)

Conflicts:
src/mon/OSDMonitor.cc: trivial resolution
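
The release check described in this commit message can be sketched roughly as follows. This is only an illustration in Python, not the monitor's C++ code: the bit position `REUSED_FEATURE_BIT` and the function name are hypothetical, and the release numbers simply follow Ceph's major-version numbering.

```python
# Illustrative constants: the bit position is hypothetical; the release
# numbers merely follow Ceph's major-version numbering.
RELEASE_JEWEL = 10
RELEASE_LUMINOUS = 12
REUSED_FEATURE_BIT = 1 << 5


def peer_supports_pglog_hardlimit(features: int, release: int) -> bool:
    """Honor the reused bit only for peers at luminous or newer.

    A jewel peer may set the same bit for its old, retired meaning, so the
    bit alone is not enough to conclude that the hard limit is supported.
    """
    if release < RELEASE_LUMINOUS:
        return False
    return bool(features & REUSED_FEATURE_BIT)
```

Checking the release first means a jewel peer that happens to set the bit is never mistaken for one that understands the hard limit.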

Neha Ojha [Thu, 20 Dec 2018 17:27:34 +0000 (09:27 -0800)]

mon: add and use OSD_PGLOG_HARDLIMIT feature bit

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 6b0a3ded5baabb19809618de16cdf67c925a8e5a)

Conflicts:
src/mon/OSDMonitor.cc: trivial resolution

Neha Ojha [Tue, 18 Dec 2018 00:20:10 +0000 (16:20 -0800)]

osd/mon: fix upgrades for pg log hard limit

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 6ceeab6e204286148a69eb147fbc4045beddef49)

Conflicts:
src/include/rados.h
src/mon/MonCommands.h
src/mon/OSDMonitor.cc
src/osd/OSDMap.cc
          Luminous does not have CEPH_OSDMAP_NOSNAPTRIM flag.
          In nautilus, CEPH_OSDMAP_PGLOG_HARDLIMIT is set by default,
          which is not the case in luminous.

Neha Ojha [Fri, 14 Dec 2018 23:59:24 +0000 (15:59 -0800)]

osd: bring back old calc_trim_to and rename new method

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 0536aef761fefda9a86d5bc72a12ca8cab931ccb)

xie xingguo [Mon, 30 Jul 2018 10:56:56 +0000 (18:56 +0800)]

osd/PrimaryLogPG: fix potential pg-log overtrimming

In https://github.com/ceph/ceph/pull/21580 I set a trap to catch some weird
and random segmentation faults, and in a recent QA run I observed that it was
successfully triggered by one of the test cases, see:

```
http://qa-proxy.ceph.com/teuthology/xxg-2018-07-30_05:25:06-rados-wip-hb-peers-distro-basic-smithi/2837916/teuthology.log
```

The root cause is that there might be holes in the log versions, so the
approx_size() method will (almost) always overestimate the actual number of log entries.
As a result, we are at risk of overtrimming log entries.

https://github.com/ceph/ceph/pull/18338 reveals a probably easier way
to fix the above problem, but unfortunately it can also cause a big performance regression,
hence this PR.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 3654d56985c67d15506fa37b56ef5b0c04e01a65)

Conflicts:
src/osd/PrimaryLogPG.cc: trivial resolution
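
The overestimation described in this commit message can be illustrated with a small Python model. It is a sketch under assumptions, not the OSD code: `approx_size` here is a simplified stand-in that derives a length from the version range alone, and the version list and limit are made up.

```python
def approx_size(versions):
    """Estimate the entry count from the version range alone."""
    if not versions:
        return 0
    return versions[-1] - versions[0] + 1


def excess_entries(length, max_entries):
    """Number of entries to trim so that at most max_entries remain."""
    return max(0, length - max_entries)


log_versions = [1, 2, 5, 6, 9, 10]  # versions 3-4 and 7-8 are holes
max_entries = 5

# Trimming decided from the range-based estimate versus the real count.
trim_by_estimate = excess_entries(approx_size(log_versions), max_entries)
trim_by_count = excess_entries(len(log_versions), max_entries)
left_after_estimate = len(log_versions) - trim_by_estimate
```

With holes in the version sequence the range-based estimate is larger than the real entry count, so trimming by the estimate removes more entries than the limit requires.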

xie xingguo [Mon, 3 Sep 2018 07:37:36 +0000 (15:37 +0800)]

osd/PrimaryLogPG: avoid dereferencing invalid complete_to

For the auto-repair (EIO caused) case, we will not reinitialize
**complete_to** (because last_complete is equal to last_update!)
and hence there is a chance that **complete_to** already
points to **log.end()** before we call recover_got.

We could simply drop it here as we (already) logged the **complete_to**
iterator change in a more compatible way a few lines below.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 69a2cc35840939436da09691ca62476d7f599de4)

Neha Ojha [Thu, 16 Aug 2018 18:48:19 +0000 (11:48 -0700)]

osd/PrimaryLogPG.cc: limit trimming at can_rollback_to

This change is motivated by the failures seen in the multimds suite,
where we hit assert(s <= can_rollback_to), while trimming the log in ec
pools.

This is due to the fact that we had removed limits on the trim_to value to
address https://tracker.ceph.com/issues/23979.

But it seems that this could be dangerous for ec pools. So keep the
can_rollback_to limit while calculating the trim_to value.

Fixes: http://tracker.ceph.com/issues/21416
Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 4b5c6b88d444e2173e716fe4890717873c8dc8e5)
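
The limit described in this commit message can be sketched as a simple clamp. This is a hedged Python illustration rather than the PrimaryLogPG code: versions are modeled as plain integers, and `calc_trim_to` with its parameters `head` and `target_len` is a hypothetical simplification.

```python
def calc_trim_to(head: int, target_len: int, can_rollback_to: int) -> int:
    """Pick a trim point that never goes past can_rollback_to.

    The candidate keeps roughly target_len entries below head; the min()
    preserves the invariant trim_to <= can_rollback_to.
    """
    candidate = max(0, head - target_len)
    return min(candidate, can_rollback_to)
```

For example, with `head=100` and `target_len=30` the candidate is 70, but if `can_rollback_to` is 50 the function returns 50, so the rollback-dependent entries are kept.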

Neha Ojha [Sat, 4 Aug 2018 00:38:22 +0000 (17:38 -0700)]

osd/PGLog.cc: check if complete_to points to log.end()

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 630daa1d7d233aa01a2d49bf1bc7b858bf73348a)

Neha Ojha [Tue, 31 Jul 2018 00:09:51 +0000 (17:09 -0700)]

src/osd/PG.cc: remove redundant call to trim_log()

This change is motivated by the failure tracked in
https://tracker.ceph.com/issues/25198. The failure highlights a case, when a
call to trim_log() after the PG has recovered, races with the previous op,
on a replica OSD. Since the previous operation has not completed, the
last_complete value for that OSD is not valid, when we try to trim the
log. It is also worth noting that the race is due to MOSDPGTrim going through
the strict queue as a peering message vs regular ops going through the
non-strict queue.

During the investigation of this bug, we noticed that, with
https://tracker.ceph.com/issues/23979, we allow pg log trimming to
happen on the primary and replicas, whenever we cross the upper bound of
the pg log. This also ensures that pg log trimming happens while processing
any new op.

Therefore, the function trim_log(), which earlier served the purpose of
trimming logs on the primary and replicas just before the PG went into
the Recovered state, is no longer required. It acted as a last line of
defense to trim logs when we did not need them any more. But this call
seems redundant now, because we are limiting the pg log length at all times.

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 283b0bde4a52128c1590afe8e5011b266a2e334b)

Conflicts:
src/osd/PG.cc: We do not need the trim_log() call any longer,
due to the explanation provided in the commit message.

Neha Ojha [Mon, 30 Jul 2018 23:42:55 +0000 (16:42 -0700)]

osd/PGLog.cc: use lgeneric_subdout instead of generic_dout

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 1b6dafb351b4f80e1e2c9a304f9e1d508ae8bf72)

Neha Ojha [Tue, 17 Jul 2018 01:11:27 +0000 (18:11 -0700)]

osd/PGLog: allow pg log trim when complete_to is less than trim_to

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit a5329ba8dd169e55deaff47d042354e53d8e722d)

Conflicts:
src/osd/PGLog.cc: Now it is possible to have complete_to version
        less than or equal to trim version, because the pg log length upper
        limit is a hard limit, and trim can proceed even when there is
        pending recovery/backfill. So do not complain when this happens.

Neha Ojha [Tue, 17 Jul 2018 01:01:26 +0000 (18:01 -0700)]

osd: reset complete_to when trimming the log past it

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 38170cdb1b8c3ea7e8b411fabe1fe99abd06cf52)

Neha Ojha [Mon, 16 Jul 2018 23:48:58 +0000 (16:48 -0700)]

osd: allow trim() to proceed when there are missing items

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit de42fee0dff299f4d0377961d05e02fd8f49f21b)

Conflicts:
src/osd/PGLog.cc: The async recovery feature is not present
in luminous. Remove async recovery requirements from this commit.

Neha Ojha [Mon, 16 Jul 2018 23:31:22 +0000 (16:31 -0700)]

osd: handle trim() during backfill

Remove async recovery components: The async recovery feature is not present
in luminous. We do not need commit 22d17fb5aad6ab9d7525d9492c0e96a36d02879e,
which adds a flag to remember async recovery. We have also removed async
recovery requirements from this commit and modified the commit message to
only reflect backfill.

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit e538c31f0f3133f811a8e478fcb25b575cad66bf)

Neha Ojha [Mon, 16 Jul 2018 22:06:12 +0000 (15:06 -0700)]

osd: print pg log length and trim_to

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit f48584a5b468949c31bffba1b507fb13a8756284)

Neha Ojha [Mon, 16 Jul 2018 21:46:21 +0000 (14:46 -0700)]

osd: make calc_trim_to() independent of min_last_complete_ondisk

Signed-off-by: Neha Ojha <nojha@redhat.com>
(cherry picked from commit 1ae5fd32c92ea2f025344c663535d00f71f2cdda)

Conflicts:
src/osd/PrimaryLogPG.cc: min_last_complete_ondisk and
pg_log.get_can_rollback_to() are no longer the limit of the pg log.
Make the head of the pg log the new limit for pg log trimming.

Yuri Weinstein [Fri, 18 Jan 2019 15:47:26 +0000 (07:47 -0800)]

Merge pull request #25833 from vshankar/wip-37762

luminous: config: drop config::lock when invoking config observer

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Patrick Donnelly [Wed, 16 Jan 2019 18:52:09 +0000 (10:52 -0800)]

qa: silence read-only WRN for damage testing

Fixes: http://tracker.ceph.com/issues/37944
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ac302de7b7725c259fee1852e3f402533d69f8cf)

Yuri Weinstein [Thu, 17 Jan 2019 18:06:56 +0000 (10:06 -0800)]

Merge pull request #25719 from ashishkumsingh/wip-37553-luminous

luminous: osdc/Objecter: update op_target_t::paused in _calc_target

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>

Yuri Weinstein [Wed, 16 Jan 2019 20:31:23 +0000 (12:31 -0800)]

Merge pull request #25967 from batrick/i37922

luminous: qa: test_damage fixes

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Yuri Weinstein [Wed, 16 Jan 2019 20:30:52 +0000 (12:30 -0800)]

Merge pull request #25968 from batrick/i37899

luminous: mds: purge queue recovery hangs during boot if PQ journal is damaged

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Yuri Weinstein [Wed, 16 Jan 2019 13:03:19 +0000 (05:03 -0800)]

Merge pull request #25762 from pdvian/wip-37635-luminous

luminous: cephfs: race of updating wanted caps

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>

Yuri Weinstein [Wed, 16 Jan 2019 13:02:35 +0000 (05:02 -0800)]

Merge pull request #25826 from Vicente-Cheng/wip-37092-luminous

luminous: mds: "src/mds/MDLog.cc: 281: FAILED ceph_assert(!capped)" during max_mds thrashing

Reviewed-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Wed, 16 Jan 2019 00:02:57 +0000 (16:02 -0800)]

Merge pull request #25384 from ifed01/wip-ifed-fix2-expand-luminous

luminous: core: os/bluestore_tool: fix bluefs expand

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Yuri Weinstein [Tue, 15 Jan 2019 23:29:22 +0000 (15:29 -0800)]

Merge pull request #25965 from yuriw/wip-yuriw-p2p-luminous

luminous: qa/tests: added v12.2.9 and v12.2.10 to the mix

Reviewed-by: Neha Ojha <nojha@redhat.com>

Patrick Donnelly [Sun, 23 Dec 2018 22:22:49 +0000 (14:22 -0800)]

mds: allow boot on read-only

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit c7ce967b778a0b86b335f6801301e484aaf6ebc3)

Conflicts:
src/mds/MDSRank.cc

Patrick Donnelly [Tue, 18 Dec 2018 23:11:02 +0000 (15:11 -0800)]

mds: setup readonly mode for PurgeQueue

If the PQ faces an error, it should go read-only along with the MDS rank.

Fixes: http://tracker.ceph.com/issues/37543
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 4cccc4dffb0915ef9e7d3b446e9a32f277646562)

Conflicts:
src/mds/PurgeQueue.cc

Patrick Donnelly [Tue, 18 Dec 2018 23:08:11 +0000 (15:08 -0800)]

mds: add missing locks for PurgeQueue methods

These could race with the asynchronous workings of the PQ.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit c7350ac23c73867b52cd9b7bb23b6c618eebe44d)

Conflicts:
src/mds/PurgeQueue.cc

Patrick Donnelly [Tue, 18 Dec 2018 22:00:29 +0000 (14:00 -0800)]

mds: delete on_error context on des

Otherwise it leaks.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 33279822eabb64380f5968cc645735a8f99a3ac1)

Patrick Donnelly [Wed, 9 Jan 2019 00:26:14 +0000 (16:26 -0800)]

qa: fix damage expectation setting

The purge queue expectation was being ignored.

Fixes: http://tracker.ceph.com/issues/37837
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 360550b1ab143bb59fd38c77569f7f5ef762801a)

Patrick Donnelly [Tue, 8 Jan 2019 23:51:53 +0000 (15:51 -0800)]

qa: fix loop variable reference

Otherwise the Mutation for Truncate is done on obj_id of the last iteration of the previous loop.

Fixes: http://tracker.ceph.com/issues/37836
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 909d0f1333a730c648c08cda047bb5f356a61986)
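
The kind of bug this commit message describes, a second loop reusing the variable left over from an earlier loop, can be reproduced in a few lines of Python. This is a generic sketch, not the qa test itself: the object names and the `(kind, target)` tuples are hypothetical.

```python
obj_ids = ["obj.0", "obj.1", "obj.2"]  # hypothetical object names

# Buggy pattern: the second loop iterates with `o` but still refers to
# `obj_id`, which holds the value from the last iteration of the first loop.
buggy = []
for obj_id in obj_ids:
    buggy.append(("corrupt", obj_id))
for o in obj_ids:
    buggy.append(("truncate", obj_id))  # stale reference, always "obj.2"

# Fixed pattern: the second loop uses its own loop variable.
fixed = []
for obj_id in obj_ids:
    fixed.append(("corrupt", obj_id))
for o in obj_ids:
    fixed.append(("truncate", o))

buggy_truncates = [target for kind, target in buggy if kind == "truncate"]
fixed_truncates = [target for kind, target in fixed if kind == "truncate"]
```

Python keeps a loop variable bound after the loop ends, so the stale name raises no error and the mistake only shows up as every truncate mutation targeting the same object.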

Yuri Weinstein [Tue, 15 Jan 2019 17:37:17 +0000 (09:37 -0800)]

qa/tests: added v12.2.9 and v12.2.10 to the mix

Signed-off-by: Yuri Weinstein <yweinste@redhat.com>

Josh Durgin [Mon, 14 Jan 2019 21:12:58 +0000 (13:12 -0800)]

Merge branch 'luminous' into wip-ifed-fix2-expand-luminous

Yuri Weinstein [Sat, 12 Jan 2019 22:05:44 +0000 (14:05 -0800)]

Merge pull request #25829 from badone/wip-examples-link-order-fix

luminous: examples: fix link order in librados example Makefile

Reviewed-by: Neha Ojha <nojha@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 22:05:03 +0000 (14:05 -0800)]

Merge pull request #25845 from pdvian/wip-37811-luminous

luminous: mon/OSDMonitor: do not populate void pg_temp into nextmap

Reviewed-by: xie xingguo <xie.xingguo@zte.com.cn>

Yuri Weinstein [Sat, 12 Jan 2019 22:04:30 +0000 (14:04 -0800)]

Merge pull request #25847 from pdvian/wip-37813-luminous

luminous: mon: shutdown messenger early to avoid accessing deleted logger

Alfredo Deza [Sat, 12 Jan 2019 20:19:31 +0000 (15:19 -0500)]

Merge pull request #25922 from alfredodeza/luminous-ceph-volume-fix-json

luminous ceph-volume: fix JSON output in `inventory`

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:09:05 +0000 (16:09 -0800)]

Merge pull request #25677 from joscollin/wip-37700-luminous

luminous: mds: fix bug filelock stuck at LOCK_XSYN leading client can't read data

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:08:24 +0000 (16:08 -0800)]

Merge pull request #25682 from joscollin/wip-37633-luminous

luminous: mds: remove duplicated l_mdc_num_strays perfcounter set

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:08:07 +0000 (16:08 -0800)]

Merge pull request #25684 from joscollin/wip-37631-luminous

luminous: client: do not move f->pos until successful write

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:07:44 +0000 (16:07 -0800)]

Merge pull request #25686 from joscollin/wip-37737-luminous

luminous: MDSMonitor: allow beacons from stopping MDS that was laggy

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:07:22 +0000 (16:07 -0800)]

Merge pull request #25695 from joscollin/wip-37625-luminous

luminous: pybind/mgr/status: fix ceph fs status in py3 environments.

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:06:57 +0000 (16:06 -0800)]

Merge pull request #25696 from joscollin/wip-36502-luminous

luminous: qa: increase timeout for cleanup

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:06:30 +0000 (16:06 -0800)]

Merge pull request #25779 from pdvian/wip-37694-luminous

luminous: mon: mark REMOVE_SNAPS messages as no_reply

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:05:56 +0000 (16:05 -0800)]

Merge pull request #25784 from ukernel/luminous-37739

luminous: extend reconnect period when mds is busy

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:05:28 +0000 (16:05 -0800)]

Merge pull request #25805 from ashishkumsingh/wip-36504-luminous

luminous: qa: use timeout for fs asok operations

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Sat, 12 Jan 2019 00:00:21 +0000 (16:00 -0800)]

Merge pull request #25928 from neha-ojha/wip-whitelist-slow-request

luminous: qa/tasks/thrashosds-health.yaml: whitelist 'slow request'

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Neha Ojha [Fri, 11 Jan 2019 23:19:27 +0000 (18:19 -0500)]

qa/tasks/thrashosds-health.yaml: whitelist 'slow request'

https://github.com/ceph/ceph/pull/25824 adds slow request to OSD logs.
To deal with it, whitelist 'slow request' instead of 'slow requests'.
This PR is specific to luminous because later versions whitelist it correctly.

Signed-off-by: Neha Ojha <nojha@redhat.com>
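
Why the singular form is the safer whitelist entry can be seen with a short Python sketch. It assumes that whitelist entries are applied as regular-expression searches against log lines; the two sample log lines and the helper `matches_whitelist` are made up for illustration.

```python
import re


def matches_whitelist(line, patterns):
    """Return True if any whitelist pattern is found anywhere in the line."""
    return any(re.search(p, line) for p in patterns)


# Hypothetical log lines, one with the singular and one with the plural form.
singular = "osd.3 reported 1 slow request, blocked for 31 sec"
plural = "4 slow requests are blocked > 32 sec"

old_whitelist = ["slow requests"]
new_whitelist = ["slow request"]
```

Under that assumption the pattern `slow request` matches both lines, while `slow requests` misses the singular one.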

Sebastian Wagner [Thu, 22 Nov 2018 17:01:50 +0000 (18:01 +0100)]

ceph-volume: fix JSON output in `inventory`

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
(cherry picked from commit a3e6f569b4fa0419dff4690a72e9be6fe0a255c1)

Yuri Weinstein [Fri, 11 Jan 2019 13:44:35 +0000 (05:44 -0800)]

Merge pull request #25889 from pdvian/wip-37820-luminous

luminous: mds: create heartbeat grace config option

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Fri, 11 Jan 2019 13:43:42 +0000 (05:43 -0800)]

Merge pull request #25890 from vshankar/wip-purge-single-mds-multifs-test

luminous: qa: remove single mds yaml for cephfs multifs test

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

Yuri Weinstein [Fri, 11 Jan 2019 13:42:43 +0000 (05:42 -0800)]

Merge pull request #25431 from batrick/i37540

luminous: mds: obsolete MDSMap option configs

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Fri, 11 Jan 2019 13:39:55 +0000 (05:39 -0800)]

Merge pull request #25824 from neha-ojha/wip-1659156-luminous

luminous: osd/OSD.cc: log slow requests in OSD logs

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

IvanGuan [Fri, 4 Jan 2019 04:22:27 +0000 (12:22 +0800)]

client: fix fuse client hang because its pipe to mds is not ok

If a fuse client session has been killed by the mds, and an mds daemon restart
or hot-standby switch happens right away, the client may not receive any
message from the monitor (due to network or other problems) until the mds
becomes active again. The client then never calls closed_mds_session, so the
session stays in STATE_OPEN even though the client can't send any message to
the mds because its pipe is not ok. So we should close the stale session so
that it can be reopened.

Fixes: http://tracker.ceph.com/issues/36079
Signed-off-by: Guan yunfei <yunfei.guan@xtaotech.com>
(cherry picked from commit 0e137de26e85942f8b40f7b13e564bd4c31b37f9)

Conflicts:
src/client/Client.cc : Resolved in handle_mds_map

Casey Bodley [Mon, 10 Dec 2018 17:38:01 +0000 (12:38 -0500)]

rgw: sanitize customer encryption keys from log output in v4 auth

Fixes: http://tracker.ceph.com/issues/37847
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit ba55e2a96c9dfcc7aa2311431beaaa23cb05c30d)

Abhishek Lekshmanan [Mon, 10 Dec 2018 23:30:46 +0000 (00:30 +0100)]

rgw: mimic gconf changes

As the largeish change from master g_conf() isn't in mimic yet, use the g_conf
global structure, also make rgw_op use the value from req_info ceph context as
we do for all the requests

Signed-off-by: Abhishek Lekshmanan <abhishek@suse.com>
(cherry picked from commit 01d647310ec2d7d423df1196eb2a7aef685d832e)

Joao Eduardo Luis [Thu, 29 Nov 2018 01:05:31 +0000 (01:05 +0000)]

rgw: fix issues with 'enforce bounds' patch

The patch to enforce bounds on max-keys/max-uploads/max-parts had a few
issues that would prevent us from compiling it. Instead of changing the
code provided by the submitter, we're addressing them in a separate
commit to maintain the DCO.

Signed-off-by: Joao Eduardo Luis <joao@suse.de>
(cherry picked from commit 29bc434a6a81a2e5c5b8cfc4c8d5c82ca5bf538a)

Robin H. Johnson [Fri, 21 Sep 2018 21:49:34 +0000 (14:49 -0700)]

rgw: enforce bounds on max-keys/max-uploads/max-parts

RGW S3 listing operations provided a way for authenticated users to
cause a denial of service against OMAPs holding bucket indices.

Bound the min & max values that a user could pass into the max-X
parameters, to keep the system safe. The default of 1000 is chosen to
match AWS S3 behavior.

Affected operations:
- ListBucket, via max-keys
- ListBucketVersions, via max-keys
- ListBucketMultiPartUploads, via max-uploads
- ListMultipartUploadParts, via max-parts

The Swift bucket listing codepath already enforced a limit, so is
unaffected by this issue.

Prior to this commit, the effective limit is the lower of
osd_max_omap_entries_per_request or osd_max_omap_bytes_per_request.

Backport: luminous, mimic
Fixes: http://tracker.ceph.com/issues/35994
Signed-off-by: Robin H. Johnson <rjohnson@digitalocean.com>
(cherry picked from commit d79f68a1e31f4bc917eec1b6bbc8e8446377dc6b)

Conflicts:
src/common/options.cc:
Conflicts due to options from master
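
The bounding described in this commit message amounts to clamping a user-supplied count. The following Python sketch is an assumption-laden illustration, not the RGW code: the helper name, the `upper` ceiling parameter and the choice to clamp negative values to zero are hypothetical; only the default of 1000 comes from the message.

```python
def bounded_max_entries(requested, default=1000, upper=1000):
    """Clamp a max-keys / max-uploads / max-parts style parameter.

    A missing value falls back to the default; anything else is forced
    into the range [0, upper] so one request cannot ask for an unbounded
    listing.
    """
    if requested is None:
        return default
    return min(max(requested, 0), upper)
```

A request asking for five million keys is therefore served as if it had asked for the ceiling, and the client has to paginate for the rest.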

Joao Eduardo Luis [Wed, 17 Oct 2018 13:42:15 +0000 (14:42 +0100)]

mon/config-key: limit caps allowed to access the store

Henceforth, we'll require explicit `allow` caps for commands, or for the
config-key service. Blanket caps are no longer allowed for the
config-key service, except for 'allow *'.

(for luminous and mimic, we're also ensuring MonCap's parser is able to
understand forward slashes '/' when parsing prefixes)

Signed-off-by: Joao Eduardo Luis <joao@suse.de>
(cherry picked from commit 5fff611041c5afeaf3c8eb09e4de0cc919d69237)

Yuri Weinstein [Thu, 10 Jan 2019 18:04:36 +0000 (10:04 -0800)]

Merge pull request #23902 from VictorDenisov/backport_24826

luminous: run-make-check.sh ccache tweaks

Reviewed-by: Nathan Cutler <ncutler@suse.com>

Yuri Weinstein [Thu, 10 Jan 2019 18:04:12 +0000 (10:04 -0800)]

Merge pull request #24543 from wido/luminous-20465

luminous: os/bluestore: avoid frequent allocator dump on bluefs rebalance failure

Reviewed-by: Igor Fedotov <ifedotov@suse.com>

Yuri Weinstein [Thu, 10 Jan 2019 18:03:40 +0000 (10:03 -0800)]

Merge pull request #24921 from wjwithagen/luminous

luminous: cmake: link unittest_compression against gtest

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: Nathan Cutler <ncutler@suse.com>

Yuri Weinstein [Thu, 10 Jan 2019 18:03:00 +0000 (10:03 -0800)]

Merge pull request #24997 from mcv21/luminous-24996

luminous: debian: correct ceph-common relationship with older radosgw package

Reviewed-by: Kefu Chai <kchai@redhat.com>

Yuri Weinstein [Thu, 10 Jan 2019 18:01:57 +0000 (10:01 -0800)]

Merge pull request #25086 from smithfarm/wip-37154-luminous

luminous: tests: ceph-admin-commands.sh workunit does not log what it's doing

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Yuri Weinstein [Thu, 10 Jan 2019 18:01:05 +0000 (10:01 -0800)]

Merge pull request #25173 from pdvian/wip-36391-luminous

luminous: rpm: Use hardened LDFLAGS

Reviewed-by: Nathan Cutler <ncutler@suse.com>

Yuri Weinstein [Thu, 10 Jan 2019 18:00:43 +0000 (10:00 -0800)]

Merge pull request #25187 from ifed01/wip-ifed-fix-set-label-luminous

luminous: ceph-bluestore-tool: fix set label functionality for specific keys

Reviewed-by: Sage Weil <sage@redhat.com>

Yuri Weinstein [Thu, 10 Jan 2019 18:00:18 +0000 (10:00 -0800)]

Merge pull request #25212 from badone/wip-luminous-fix-branch-3.2-placement

luminous: tests: ceph-ansible: Move "branch" out of "vars" section

Reviewed-by: Yuri Weinstein <yweinste@redhat.com>

Yuri Weinstein [Thu, 10 Jan 2019 17:58:50 +0000 (09:58 -0800)]

Merge pull request #25241 from smithfarm/wip-37383-luminous

luminous: test: Start using GNU awk and fix archiving directory

Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>

Yuri Weinstein [Thu, 10 Jan 2019 16:33:55 +0000 (08:33 -0800)]

Merge pull request #25516 from smithfarm/wip-36577-luminous

luminous: qa: teuthology may hang on diagnostic commands for fuse mount

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Thu, 10 Jan 2019 16:33:30 +0000 (08:33 -0800)]

Merge pull request #25558 from smithfarm/wip-37610-luminous

luminous: qa: pjd test appears to require more than 3h timeout for some configurations

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Thu, 10 Jan 2019 16:33:04 +0000 (08:33 -0800)]

Merge pull request #25560 from smithfarm/wip-37627-luminous

luminous: mds: fix incorrect l_pq_executing_ops statistics when meet an invalid item in purge queue

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Thu, 10 Jan 2019 16:32:36 +0000 (08:32 -0800)]

Merge pull request #25562 from smithfarm/wip-37629-luminous

luminous: mds: do not call Journaler::_trim twice

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Yuri Weinstein [Thu, 10 Jan 2019 16:32:08 +0000 (08:32 -0800)]

Merge pull request #25567 from vshankar/wip-37608

luminous: mds: disallow dumping huge caches to formatter

Reviewed-by: Venky Shankar <vshankar@redhat.com>

Venky Shankar [Thu, 10 Jan 2019 04:44:23 +0000 (23:44 -0500)]

qa: remove single mds yaml for cephfs multifs test

commit b98c982 (which backports 3b7233a) does not remove
this yaml file. This is causing failures such as:

Only have 2 MDSs, require 4

Signed-off-by: Venky Shankar <vshankar@redhat.com>

Patrick Donnelly [Mon, 17 Dec 2018 16:34:00 +0000 (08:34 -0800)]

mds: create heartbeat grace config option

Currently the MDS uses the mds_beacon_grace for the heartbeat timeout. If we
need to increase the beacon grace because the MDS is missing beacon replies for
some reason, we still want to see the warnings when the MDS is missing
heartbeats.

Fixes: http://tracker.ceph.com/issues/37674
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 5c143f3039c1967ca83d8a0cce35bf2a12509aef)

Conflicts:
src/mds/MDSRank.cc : Resolved in heartbeat_reset
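
The motivation in this commit message, two timeouts that should be tunable independently, can be sketched in Python. This is only a model: the option name `mds_heartbeat_grace` for the new setting and the numeric values are assumptions, and `is_overdue` is a hypothetical helper.

```python
def is_overdue(last_seen: float, now: float, grace: float) -> bool:
    """True when more than `grace` seconds have passed since last_seen."""
    return now - last_seen > grace


# Hypothetical settings: beacon grace raised by the operator, heartbeat
# grace left at a smaller value so heartbeat warnings still appear.
config = {"mds_beacon_grace": 60.0, "mds_heartbeat_grace": 15.0}

now = 100.0
last_heartbeat = 80.0  # 20 seconds of silence

heartbeat_warning = is_overdue(last_heartbeat, now, config["mds_heartbeat_grace"])
would_be_hidden = not is_overdue(last_heartbeat, now, config["mds_beacon_grace"])
```

With a single shared grace value, raising it to tolerate missing beacon replies would also hide the 20-second heartbeat gap; with separate values the warning is still raised.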

Yuri Weinstein [Wed, 9 Jan 2019 22:16:51 +0000 (14:16 -0800)]

Merge pull request #25257 from k0ste/luminous_backports2

luminous: mgr/balancer: add crush_compat_metrics param

Reviewed-by: Sage Weil <sage@redhat.com>

Yuri Weinstein [Wed, 9 Jan 2019 21:54:47 +0000 (13:54 -0800)]

Merge pull request #25258 from k0ste/luminous_backports3

luminous: mgr: balancer: python 3 compat fixes

Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Sebastian Wagner <sebastian.wagner@suse.com>

Yuri Weinstein [Wed, 9 Jan 2019 21:53:14 +0000 (13:53 -0800)]

Merge pull request #25296 from rzarzynski/wip-bug-36248-luminous

luminous: common: fix memory leaks in WeightedPriorityQueue.

Reviewed-by: Josh Durgin <jdurgin@redhat.com>

Yuri Weinstein [Wed, 9 Jan 2019 21:52:43 +0000 (13:52 -0800)]

Merge pull request #25297 from pdvian/wip-37427-luminous

luminous: auth/AuthSessionHandler: no handler if no session key

Reviewed-by: Sage Weil <sage@redhat.com>

Yuri Weinstein [Wed, 9 Jan 2019 21:52:12 +0000 (13:52 -0800)]

Merge pull request #25369 from pdvian/wip-37478-luminous

luminous: mgr: race between daemon state and service map in 'service status'

Reviewed-by: David Galloway <dgallowa@redhat.com>

ningtao [Thu, 3 Jan 2019 15:20:12 +0000 (23:20 +0800)]

mon: shutdown messenger early to avoid accessing deleted logger

During monitor shutdown, the MSG thread exits after the logger has been released,
so the thread can dereference a null pointer. Release the logger only after the MSG thread has exited.

Fixes: http://tracker.ceph.com/issues/37780
Signed-off-by: ningtao <ningtao@sangfor.com.cn>
(cherry picked from commit 47da5a0caa7edec17ff4253e363571b78372506a)

commit | commitdiff | tree

xie xingguo [Fri, 4 Jan 2019 00:39:01 +0000 (08:39 +0800)]

mon/OSDMonitor: do not populate void pg_temp into nextmap

Due to commit ea723fb, pg_temp entries with a clean acting set are added to the inc map.
The original intent was to clear out pg_temps during priming, but as
written it would set a new_pg_temp item clearing the pg_temp even if one
didn't already exist. Adding the up != acting condition in there makes us
only take that path if there is an existing pg_temp entry to remove.

Fixes: https://tracker.ceph.com/issues/37784
Signed-off-by: Aleksei Zakharov <zakharov.a.g@yandex.ru>
(cherry picked from commit b1d3ca5e78eaee509c923f06e9024c23cc6ce31a)
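
The condition described in this commit message can be sketched as a small decision function. This is a simplified model for illustration, not the OSDMonitor code: the helper name next_pg_temp_entry, its parameters, and the osd_set alias are all hypothetical, and it assumes, as the message suggests, that up != acting in the current map stands in for "a pg_temp entry exists".

```cpp
#include <optional>
#include <vector>

using osd_set = std::vector<int>;  // hypothetical alias: a list of OSD ids

// Hypothetical helper (not a Ceph API). Decides what to record in the
// incremental map for one PG while priming:
//   - a non-empty vector: set a pg_temp mapping,
//   - an empty vector:    remove the existing pg_temp mapping,
//   - std::nullopt:       record nothing.
std::optional<osd_set> next_pg_temp_entry(const osd_set& cur_up,
                                          const osd_set& cur_acting,
                                          const osd_set& next_up,
                                          const osd_set& wanted_acting) {
  if (wanted_acting != next_up) {
    // The next map would not give us the acting set we want: prime a pg_temp.
    return wanted_acting;
  }
  if (cur_up != cur_acting) {
    // Assumption: up != acting means a pg_temp currently exists, so clear it.
    return osd_set{};
  }
  // No pg_temp exists and none is needed: do not add a void entry.
  return std::nullopt;
}
```

Without the cur_up != cur_acting check, the second branch would run for every clean PG and fill the incremental map with empty entries that remove nothing.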

commit | commitdiff | tree

Andrew Schoen [Tue, 8 Jan 2019 21:21:43 +0000 (15:21 -0600)]

Merge pull request #25838 from alfredodeza/luminous-rm37805

luminous ceph-volume tests/functional declare ceph-ansible roles instead of importing them

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Alfredo Deza [Mon, 7 Jan 2019 20:15:21 +0000 (15:15 -0500)]

ceph-volume tests/functional declare ceph-ansible roles instead of importing them

Signed-off-by: Alfredo Deza <adeza@redhat.com>
(cherry picked from commit ad6b924e2bf3b06ec66eb5bc4fea065f5babc512)

commit | commitdiff | tree

Andrew Schoen [Tue, 8 Jan 2019 15:41:33 +0000 (09:41 -0600)]

Merge pull request #25776 from alfredodeza/luminous-rm37442

luminous ceph-volume normalize comma to dot for string to int conversions

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Andrew Schoen [Tue, 8 Jan 2019 15:38:11 +0000 (09:38 -0600)]

Merge pull request #25778 from alfredodeza/luminous-rm37486

luminous ceph-volume: set permissions right before prime-osd-dir

Reviewed-by: Andrew Schoen <aschoen@redhat.com>

commit | commitdiff | tree

Kefu Chai [Sun, 30 Dec 2018 13:57:04 +0000 (21:57 +0800)]

osd: unlock osd_lock when tweaking osd settings

unlock osd_lock when serving "debug kick_recovery_wq" command

we need to unlock osd_lock temporarily when updating the osd settings,
otherwise we will run into an assert failure, because
OSD::handle_conf_change() acquires osd_lock, which is not a recursive
lock.

Fixes: http://tracker.ceph.com/issues/37762
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 8e94b081506fa9fbdbea09113f1549772bb6ec04)

Conflicts:
src/osd/OSD.cc

commit | commitdiff | tree

Kefu Chai [Sun, 30 Dec 2018 13:46:55 +0000 (21:46 +0800)]

osd: use unlock_guard for unlock osd temporarily

when OSD::do_command() gets called, osd_lock is acquired. but when
serving some of these commands, we need to call methods which also
acquire the osd_lock by themselves. for instance,
OSD::handle_conf_change() gets called by cct->_conf.apply_changes().
to allow them to do so, we unlock osd_lock before calling those methods,
and re-lock it once they return.

unlock_guard is introduced to unlock and re-lock the lock in a RAII style.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 5c628a1cc9f703351ad3bd708e908df7c9a411bb)

Conflicts:
src/osd/OSD.cc
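
The RAII unlock/re-lock pattern described above can be illustrated with a minimal sketch. It is written against std::mutex, not Ceph's own lock types, and the class body, the global osd_lock and setting variables, and the simplified do_command() and handle_conf_change() signatures are assumptions made for the example. The actual unlock_guard in the Ceph tree may differ.

```cpp
#include <mutex>

// Illustrative guard: unlocks on construction, re-locks on destruction.
// The caller must already hold the mutex.
template <typename Mutex>
class unlock_guard {
public:
  explicit unlock_guard(Mutex& m) : m_(m) { m_.unlock(); }
  ~unlock_guard() { m_.lock(); }
  unlock_guard(const unlock_guard&) = delete;
  unlock_guard& operator=(const unlock_guard&) = delete;

private:
  Mutex& m_;
};

std::mutex osd_lock;  // non-recursive, like the lock in the commit message
int setting = 0;      // hypothetical stand-in for an OSD setting

// Stand-in for a method that takes osd_lock by itself.
void handle_conf_change(int value) {
  std::lock_guard<std::mutex> l(osd_lock);
  setting = value;
}

// Stand-in for a command handler that runs with osd_lock held.
void do_command(int value) {
  std::lock_guard<std::mutex> l(osd_lock);
  {
    unlock_guard<std::mutex> u(osd_lock);  // drop the lock for the callee
    handle_conf_change(value);             // would self-deadlock otherwise
  }                                        // lock is re-acquired here
  // ... continue with osd_lock held ...
}
```

Calling do_command(7) returns with setting == 7. Because the guard re-locks in its destructor, the lock is also restored if the callee throws.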

commit | commitdiff | tree

Venky Shankar [Thu, 26 Jul 2018 03:17:03 +0000 (23:17 -0400)]

wherever: guard handle_conf_change() from concurrent execution

Signed-off-by: Venky Shankar <vshankar@redhat.com>
(cherry picked from commit aad318abc9a680d68aab96b051fb7457c8f7feac)

Conflicts:
src/common/ceph_context.cc
src/mds/MDSDaemon.cc
src/mon/Monitor.cc
src/osd/OSD.cc
src/osdc/Objecter.cc

Resolved conflicts by using `Mutex` in place of `ceph::mutex`
(with the appropriate locking/waiting/signalling semantics).
